[PATCH V4] Rework 128-bit complex multiply and divide.

public inbox for gcc-patches@gcc.gnu.org

* [PATCH V4] Rework 128-bit complex multiply and divide.
@ 2023-03-10  1:40 Michael Meissner
  2023-03-17 19:35 ` Segher Boessenkool
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Meissner @ 2023-03-10  1:40 UTC (permalink / raw)
  To: gcc-patches, Michael Meissner, Segher Boessenkool, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

This patch reworks how the complex multiply and divide built-in functions are
done.  Previously GCC created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit.  However, it did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.

This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file, as it already does for all of
the other long double built-in functions.
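
As a rough illustration of the name selection described above, here is a
minimal standalone sketch.  It is not the rs6000.cc code: the enums and the
function complex_muldiv_name are hypothetical stand-ins for GCC's machine
modes and for the logic inside the hook, and the ieeequad_p argument is
assumed to play the role of TARGET_IEEEQUAD.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three 128-bit complex modes named in the
   patch (KCmode, ICmode, TCmode).  These are not GCC's real enumerators.  */
enum complex_mode { MODE_KC, MODE_IC, MODE_TC };

/* Hypothetical tag for the operation being lowered to a library call.  */
enum complex_op { OP_MUL, OP_DIV };

/* Return the library function name for OP on MODE.  IEEEQUAD_P is assumed
   to mirror TARGET_IEEEQUAD: true when long double is IEEE 128-bit.  */
const char *
complex_muldiv_name (enum complex_op op, enum complex_mode mode,
                     bool ieeequad_p)
{
  /* KC is always IEEE 128-bit, IC is always IBM 128-bit, and TC follows
     whatever format long double currently has.  */
  bool ieee_p = (mode == MODE_KC) || (mode == MODE_TC && ieeequad_p);

  if (op == OP_MUL)
    return ieee_p ? "__mulkc3" : "__multc3";

  return ieee_p ? "__divkc3" : "__divtc3";
}
```

With this sketch, an IC-mode divide maps to __divtc3 whatever the long
double format is, which is the case the new divic3-1.c test checks.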

Originally, this patch was part of a larger patch set and the comments reflected
that.  I have removed the comments referring to the other patches.  Although this
patch was developed as part of that set, it is a standalone patch.

I have tried to address the comments from the last patch review in this patch.
Note, I will be away from the computer from March 10 through the 13th, so I
will not be checking in the patch until I get back.  But I thought I would
share the results of the changes that were asked for.

I fixed the complex_multiply_builtin_code and complex_divide_builtin_code
functions to assert that the mode is within the range of complex floating
point modes.  I have tried to make the code a little bit clearer.

I have cleaned up the tests to eliminate the target powerpc in the tests.  I
have eliminated the -mpower8-vector option.  I have changed the scan-assembler
lines just to look for __divtc3 or __multc3, and not depend on the format of the
'bl' call to those functions.  I have kept the -Wno-psabi option, because this
is needed to prevent spurious errors on systems with older libraries (like big
endian) that don't have IEEE 128-bit support.

2023-03-09   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	PR target/109067
	* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
	(init_float128_ieee): Delete code to switch complex multiply and divide
	for long double.
	(complex_multiply_builtin_code): New helper function.
	(complex_divide_builtin_code): Likewise.
	(rs6000_mangle_decl_assembler_name): Add support for mangling the name
	of complex 128-bit multiply and divide built-in functions.

gcc/testsuite/

	PR target/109067
	* gcc.target/powerpc/divic3-1.c: New test.
	* gcc.target/powerpc/divic3-2.c: Likewise.
	* gcc.target/powerpc/mulic3-1.c: Likewise.
	* gcc.target/powerpc/mulic3-2.c: Likewise.
---
 gcc/config/rs6000/rs6000.cc                 | 111 +++++++++++---------
 gcc/testsuite/gcc.target/powerpc/divic3-1.c |  21 ++++
 gcc/testsuite/gcc.target/powerpc/divic3-2.c |  25 +++++
 gcc/testsuite/gcc.target/powerpc/mulic3-1.c |  21 ++++
 gcc/testsuite/gcc.target/powerpc/mulic3-2.c |  25 +++++
 5 files changed, 156 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 8e0b0d022db..fa5f93a874f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11154,26 +11154,6 @@ init_float128_ibm (machine_mode mode)
     }
 }
 
-/* Create a decl for either complex long double multiply or complex long double
-   divide when long double is IEEE 128-bit floating point.  We can't use
-   __multc3 and __divtc3 because the original long double using IBM extended
-   double used those names.  The complex multiply/divide functions are encoded
-   as builtin functions with a complex result and 4 scalar inputs.  */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
-  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
-				      name, NULL_TREE);
-
-  set_builtin_decl (fncode, fndecl, true);
-
-  if (TARGET_DEBUG_BUILTIN)
-    fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
-  return;
-}
-
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
    arguments can be passed in a vector register.  The historical PowerPC
    implementation of IEEE 128-bit floating point used _q_<op> for the names, so
@@ -11185,32 +11165,6 @@ init_float128_ieee (machine_mode mode)
 {
   if (FLOAT128_VECTOR_P (mode))
     {
-      static bool complex_muldiv_init_p = false;
-
-      /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  If
-	 we have clone or target attributes, this will be called a second
-	 time.  We want to create the built-in function only once.  */
-     if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
-       {
-	 complex_muldiv_init_p = true;
-	 built_in_function fncode_mul =
-	   (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
-				- MIN_MODE_COMPLEX_FLOAT);
-	 built_in_function fncode_div =
-	   (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + TCmode
-				- MIN_MODE_COMPLEX_FLOAT);
-
-	 tree fntype = build_function_type_list (complex_long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 NULL_TREE);
-
-	 create_complex_muldiv ("__mulkc3", fncode_mul, fntype);
-	 create_complex_muldiv ("__divkc3", fncode_div, fntype);
-       }
-
       set_optab_libfunc (add_optab, mode, "__addkf3");
       set_optab_libfunc (sub_optab, mode, "__subkf3");
       set_optab_libfunc (neg_optab, mode, "__negkf2");
@@ -28228,6 +28182,27 @@ rs6000_starting_frame_offset (void)
   return RS6000_STARTING_FRAME_OFFSET;
 }
 \f
+/* Internal function to return the built-in function id for the complex
+   multiply operation for a given mode.  */
+
+static inline built_in_function
+complex_multiply_builtin_code (machine_mode mode)
+{
+  gcc_assert (IN_RANGE (mode, MIN_MODE_COMPLEX_FLOAT, MAX_MODE_COMPLEX_FLOAT));
+  int func = BUILT_IN_COMPLEX_MUL_MIN + mode - MIN_MODE_COMPLEX_FLOAT;
+  return (built_in_function) func;
+}
+
+/* Internal function to return the built-in function id for the complex divide
+   operation for a given mode.  */
+
+static inline built_in_function
+complex_divide_builtin_code (machine_mode mode)
+{
+  gcc_assert (IN_RANGE (mode, MIN_MODE_COMPLEX_FLOAT, MAX_MODE_COMPLEX_FLOAT));
+  int func = BUILT_IN_COMPLEX_DIV_MIN + mode - MIN_MODE_COMPLEX_FLOAT;
+  return (built_in_function) func;
+}
 
 /* On 64-bit Linux and Freebsd systems, possibly switch the long double library
    function names from <foo>l to <foo>f128 if the default long double type is
@@ -28246,11 +28221,53 @@ rs6000_starting_frame_offset (void)
    only do this transformation if the __float128 type is enabled.  This
    prevents us from doing the transformation on older 32-bit ports that might
    have enabled using IEEE 128-bit floating point as the default long double
-   type.  */
+   type.
+
+   We also use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the
+   function names used for complex multiply and divide to the appropriate
+   names.  */
 
 static tree
 rs6000_mangle_decl_assembler_name (tree decl, tree id)
 {
+  /* Handle complex multiply/divide.  For IEEE 128-bit, use __mulkc3 or
+     __divkc3 and for IBM 128-bit use __multc3 and __divtc3.  */
+  if (TARGET_FLOAT128_TYPE
+      && TREE_CODE (decl) == FUNCTION_DECL
+      && DECL_IS_UNDECLARED_BUILTIN (decl)
+      && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
+    {
+      built_in_function id = DECL_FUNCTION_CODE (decl);
+      const char *newname = NULL;
+
+      if (id == complex_multiply_builtin_code (KCmode))
+	newname = "__mulkc3";
+
+      else if (id == complex_multiply_builtin_code (ICmode))
+	newname = "__multc3";
+
+      else if (id == complex_multiply_builtin_code (TCmode))
+	newname = (TARGET_IEEEQUAD) ? "__mulkc3" : "__multc3";
+
+      else if (id == complex_divide_builtin_code (KCmode))
+	newname = "__divkc3";
+
+      else if (id == complex_divide_builtin_code (ICmode))
+	newname = "__divtc3";
+
+      else if (id == complex_divide_builtin_code (TCmode))
+	newname = (TARGET_IEEEQUAD) ? "__divkc3" : "__divtc3";
+
+      if (newname)
+	{
+	  if (TARGET_DEBUG_BUILTIN)
+	    fprintf (stderr, "Map complex mul/div => %s\n", newname);
+
+	  return get_identifier (newname);
+	}
+    }
+
+  /* Map long double built-in functions if long double is IEEE 128-bit.  */
   if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
       && TREE_CODE (decl) == FUNCTION_DECL
       && DECL_IS_UNDECLARED_BUILTIN (decl)
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-1.c b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
new file mode 100644
index 00000000000..e9759b85106
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.  */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+   double is IEEE 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "__divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-2.c b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
new file mode 100644
index 00000000000..819331e1296
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.
+
+   We also need to require that the default long double is 128-bits, otherwise
+   the TC/TF modes might not be available.  */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+   double is IBM 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "__divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-1.c b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
new file mode 100644
index 00000000000..d4045177371
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.  */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+   double is IEEE 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "__multc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-2.c b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
new file mode 100644
index 00000000000..97057d3b327
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.
+
+   We also need to require that the default long double is 128-bits, otherwise
+   the TC/TF modes might not be available.  */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+   double is IBM 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "__multc3" } } */
-- 
2.39.2


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com


* Re: [PATCH V4] Rework 128-bit complex multiply and divide.
  2023-03-10  1:40 [PATCH V4] Rework 128-bit complex multiply and divide Michael Meissner
@ 2023-03-17 19:35 ` Segher Boessenkool
  2023-03-20 17:43   ` Michael Meissner
  0 siblings, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2023-03-17 19:35 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Kewen.Lin, David Edelsohn,
	Peter Bergner, Will Schmidt

Hi!

On Thu, Mar 09, 2023 at 08:40:36PM -0500, Michael Meissner wrote:
> 	PR target/109067
> 	* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
> 	(init_float128_ieee): Delete code to switch complex multiply and divide
> 	for long double.
> 	(complex_multiply_builtin_code): New helper function.
> 	(complex_divide_builtin_code): Likewise.
> 	(rs6000_mangle_decl_assembler_name): Add support for mangling the name
> 	of complex 128-bit multiply and divide built-in functions.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
> +/* { dg-final { scan-assembler "__divtc3" } } */

/* { dg-final { scan-assembler {\m__divtc3\M} } } */

It might well be that we can use a sloppier regexp here, but why would
we do that?  It is a good thing to use the \m and \M constraint escapes
pretty much always.

Similar for the other three testcases of course.
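
To illustrate the difference between a plain substring search and a match
anchored with \m and \M, here is a small standalone sketch in C.  It only
approximates the Tcl semantics (a word character is taken to be an
alphanumeric or an underscore); the function contains_whole_word and the
sample assembler lines are hypothetical and not part of the testsuite.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Word characters, approximating Tcl's definition: alphanumerics and '_'.  */
static bool
is_word_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Return true if SYMBOL occurs in LINE with no word character directly
   before or after it, roughly what {\mSYMBOL\M} would match.  */
bool
contains_whole_word (const char *line, const char *symbol)
{
  size_t len = strlen (symbol);

  if (len == 0)
    return false;

  /* Examine every occurrence of SYMBOL, not only the first one.  */
  for (const char *p = strstr (line, symbol); p != NULL;
       p = strstr (p + 1, symbol))
    {
      bool start_ok = (p == line) || !is_word_char (p[-1]);
      bool end_ok = !is_word_char (p[len]);   /* '\0' is not a word char.  */

      if (start_ok && end_ok)
        return true;
    }

  return false;
}
```

A plain substring search would accept a line such as "bl __divtc3_compat",
while the anchored form rejects it and still accepts "bl __divtc3" and
"bl .__divtc3".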

This patch is okay for trunk, if you have tested it on all
configurations (powerpc-linux, powerpc64-linux, powerpc64le-linux with
and without default IEEE128 long double at least).  Thank you!

Does this need backports?


Segher


* Re: [PATCH V4] Rework 128-bit complex multiply and divide.
  2023-03-17 19:35 ` Segher Boessenkool
@ 2023-03-20 17:43   ` Michael Meissner
  2023-03-20 20:25     ` Michael Meissner
  2023-03-20 23:00     ` Segher Boessenkool
  0 siblings, 2 replies; 5+ messages in thread
From: Michael Meissner @ 2023-03-20 17:43 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, Kewen.Lin, David Edelsohn,
	Peter Bergner, Will Schmidt

On Fri, Mar 17, 2023 at 02:35:16PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Mar 09, 2023 at 08:40:36PM -0500, Michael Meissner wrote:
> > 	PR target/109067
> > 	* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
> > 	(init_float128_ieee): Delete code to switch complex multiply and divide
> > 	for long double.
> > 	(complex_multiply_builtin_code): New helper function.
> > 	(complex_divide_builtin_code): Likewise.
> > 	(rs6000_mangle_decl_assembler_name): Add support for mangling the name
> > 	of complex 128-bit multiply and divide built-in functions.
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
> > +/* { dg-final { scan-assembler "__divtc3" } } */
> 
> /* { dg-final { scan-assembler {\m__divtc3\M} } } */
> 
> It might well be that we can use a sloppier regexp here, but why would
> we do that?  It is a good thing to use the \m and \M constraint escapes
> pretty much always.

The last time I posted the patch, you said:

| > +/* { dg-final { scan-assembler "bl __divtc3" } } */
|
| This name depends on what object format and ABI is in use (some have an
| extra leading underscore, or a dot, or whatever).

So the patch was an attempt to match the other cases.

> Similar for the other three testcases of course.
> 
> This patch is okay for trunk, if you have tested it on all
> configurations (powerpc-linux, powerpc64-linux, powerpc64le-linux with
> and without default IEEE128 long double at least).  Thank you!
> 
> Does this need backports?

I think we will need backports for GCC 12.  The issue exists in GCC 11, but I
don't think that GCC 11 can really work on systems with IEEE long double, since
a lot of the stuff to really finish up the support was not in GCC 11.  I think
I tried dropping the patch into GCC 12, and it looks like something else may be
needed.  I will look into it.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com


* Re: [PATCH V4] Rework 128-bit complex multiply and divide.
  2023-03-20 17:43   ` Michael Meissner
@ 2023-03-20 20:25     ` Michael Meissner
  2023-03-20 23:00     ` Segher Boessenkool
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Meissner @ 2023-03-20 20:25 UTC (permalink / raw)
  To: Michael Meissner, Segher Boessenkool, gcc-patches, Kewen.Lin,
	David Edelsohn, Peter Bergner, Will Schmidt

On Mon, Mar 20, 2023 at 01:43:41PM -0400, Michael Meissner wrote:
> I think we will need backports for GCC 12.  The issue exists in GCC 11, but I
> don't think that GCC 11 can really work on systems with IEEE long double, since
> a lot of the stuff to really finish up the support was not in GCC 11.  I think
> I tried dropping the patch into GCC 12, and it looks like something else may be
> needed.  I will look into it.

The current patch applies to GCC 12 without changes, and it does fix the
problem.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com


* Re: [PATCH V4] Rework 128-bit complex multiply and divide.
  2023-03-20 17:43   ` Michael Meissner
  2023-03-20 20:25     ` Michael Meissner
@ 2023-03-20 23:00     ` Segher Boessenkool
  1 sibling, 0 replies; 5+ messages in thread
From: Segher Boessenkool @ 2023-03-20 23:00 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Kewen.Lin, David Edelsohn,
	Peter Bergner, Will Schmidt

On Mon, Mar 20, 2023 at 01:43:41PM -0400, Michael Meissner wrote:
> On Fri, Mar 17, 2023 at 02:35:16PM -0500, Segher Boessenkool wrote:
> > On Thu, Mar 09, 2023 at 08:40:36PM -0500, Michael Meissner wrote:
> > /* { dg-final { scan-assembler {\m__divtc3\M} } } */
> > 
> > It might well be that we can use a sloppier regexp here, but why would
> > we do that?  It is a good thing to use the \m and \M constraint escapes
> > pretty much always.
> 
> The last time I posted the patch, you said:
> 
> | > +/* { dg-final { scan-assembler "bl __divtc3" } } */
> |
> | This name depends on what object format and ABI is in use (some have an
> | extra leading underscore, or a dot, or whatever).
> 
> So the patch was an attempt to match the other cases.

I also said you could do {\mbl .?__divtc3\M} or similar, in cases where
you need to consider multiple ABIs.  Just searching for a substring is
always very suboptimal.

> > Does this need backports?
> 
> I think we will need backports for GCC 12.  The issue exists in GCC 11, but I
> don't think that GCC 11 can really work on systems with IEEE long double, since
> a lot of the stuff to really finish up the support was not in GCC 11.  I think
> I tried dropping the patch into GCC 12, and it looks like something else may be
> needed.  I will look into it.

Okay, thanks.  We'll see.


Segher
