[PATCH 2/2] Rework 128-bit complex multiply and divide.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Michael Meissner <meissner@linux.ibm.com>
To: Michael Meissner <meissner@linux.ibm.com>,
	gcc-patches@gcc.gnu.org,
	Segher Boessenkool <segher@kernel.crashing.org>,
	"Kewen.Lin" <linkw@linux.ibm.com>,
	David Edelsohn <dje.gcc@gmail.com>,
	Peter Bergner <bergner@linux.ibm.com>,
	Will Schmidt <will_schmidt@vnet.ibm.com>
Subject: [PATCH 2/2] Rework 128-bit complex multiply and divide.
Date: Fri, 3 Feb 2023 00:53:05 -0500	[thread overview]
Message-ID: <Y9yhQQdUaM+z6IYD@toto.the-meissners.org> (raw)
In-Reply-To: <Y9yfDC9bMaaB86ZP@toto.the-meissners.org>

This patch reworks how the complex multiply and divide built-in functions are
done.  Previously we created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit.  The old code also did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.

This patch was originally posted on December 13th, 2022:

| Date: Tue, 13 Dec 2022 01:21:06 -0500
| Subject: [PATCH V2] Rework 128-bit complex multiply and divide, PR target/107299
| Message-ID: <Y5gZ0o1nzCq9MmR9@toto.the-meissners.org>

In terms of history, I wrote the original code just as I was starting to test
GCC on systems where IEEE 128-bit long double was the default.  At the time, we
had not yet started mangling the built-in function names as a way to bridge
going from a system with 128-bit IBM long double to 128-bin IEEE long double.

The original code depends on there only being two 128-bit types invovled.  With
the next patch in this series, this assumption will no longer be true.  When
long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the
explicit __float128/_Float128 type and one for long double).

The problem is we cannot create two separate built-in functions that resolve to
the same name.  This is a requirement of add_builtin_function and the C front
end.  That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
only use 2 of them.

This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.

When I wrote these patches, I discovered that __ibm128 complex multiply and
divide had originally not been supported if long double is IEEE 128-bit as it
would generate calls to __mulic3 and __divic3.  I added tests in the testsuite
to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
case.

I had previously sent this patch out on November 1st.  Compared to that version,
this version no longer disables the special mapping when you are building
libgcc, as it turns out we don't need it.

I tested all 3 patchs for PR target/107299 on:

    1)	LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
    2)	LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
    3)	LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
    4)	BE Power8  using --with-cpu=power8  --with-long-double-format=ibm

Once all 3 patches have been applied, we can once again build GCC when long
double is IEEE 128-bit.  There were no other regressions with these patches.
Can I check these patches into the trunk?

Note, it is Friday February 3rd, 2023.  I will be on vacation Tuesday February
7th through February 14th.

2023-02-02   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	PR target/107299
	* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
	(init_float128_ieee): Delete code to switch complex multiply and divide
	for long double.
	(complex_multiply_builtin_code): New helper function.
	(complex_divide_builtin_code): Likewise.
	(rs6000_mangle_decl_assembler_name): Add support for mangling the name
	of complex 128-bit multiply and divide built-in functions.

gcc/testsuite/

	PR target/107299
	* gcc.target/powerpc/divic3-1.c: New test.
	* gcc.target/powerpc/divic3-2.c: Likewise.
	* gcc.target/powerpc/mulic3-1.c: Likewise.
	* gcc.target/powerpc/mulic3-2.c: Likewise.
---
 gcc/config/rs6000/rs6000.cc                 | 109 +++++++++++---------
 gcc/testsuite/gcc.target/powerpc/divic3-1.c |  18 ++++
 gcc/testsuite/gcc.target/powerpc/divic3-2.c |  17 +++
 gcc/testsuite/gcc.target/powerpc/mulic3-1.c |  18 ++++
 gcc/testsuite/gcc.target/powerpc/mulic3-2.c |  17 +++
 5 files changed, 132 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 16ca3a31757..7e76c37fdab 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11151,26 +11151,6 @@ init_float128_ibm (machine_mode mode)
     }
 }
 
-/* Create a decl for either complex long double multiply or complex long double
-   divide when long double is IEEE 128-bit floating point.  We can't use
-   __multc3 and __divtc3 because the original long double using IBM extended
-   double used those names.  The complex multiply/divide functions are encoded
-   as builtin functions with a complex result and 4 scalar inputs.  */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
-  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
-				      name, NULL_TREE);
-
-  set_builtin_decl (fncode, fndecl, true);
-
-  if (TARGET_DEBUG_BUILTIN)
-    fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
-  return;
-}
-
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
    arguments can be passed in a vector register.  The historical PowerPC
    implementation of IEEE 128-bit floating point used _q_<op> for the names, so
@@ -11182,32 +11162,6 @@ init_float128_ieee (machine_mode mode)
 {
   if (FLOAT128_VECTOR_P (mode))
     {
-      static bool complex_muldiv_init_p = false;
-
-      /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  If
-	 we have clone or target attributes, this will be called a second
-	 time.  We want to create the built-in function only once.  */
-     if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
-       {
-	 complex_muldiv_init_p = true;
-	 built_in_function fncode_mul =
-	   (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
-				- MIN_MODE_COMPLEX_FLOAT);
-	 built_in_function fncode_div =
-	   (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + TCmode
-				- MIN_MODE_COMPLEX_FLOAT);
-
-	 tree fntype = build_function_type_list (complex_long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 NULL_TREE);
-
-	 create_complex_muldiv ("__mulkc3", fncode_mul, fntype);
-	 create_complex_muldiv ("__divkc3", fncode_div, fntype);
-       }
-
       set_optab_libfunc (add_optab, mode, "__addkf3");
       set_optab_libfunc (sub_optab, mode, "__subkf3");
       set_optab_libfunc (neg_optab, mode, "__negkf2");
@@ -28225,6 +28179,25 @@ rs6000_starting_frame_offset (void)
   return RS6000_STARTING_FRAME_OFFSET;
 }
 \f
+/* Internal function to return the built-in function id for the complex
+   multiply operation for a given mode.  */
+
+static inline built_in_function
+complex_multiply_builtin_code (machine_mode mode)
+{
+  return (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + mode
+			      - MIN_MODE_COMPLEX_FLOAT);
+}
+
+/* Internal function to return the built-in function id for the complex divide
+   operation for a given mode.  */
+
+static inline built_in_function
+complex_divide_builtin_code (machine_mode mode)
+{
+  return (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + mode
+			      - MIN_MODE_COMPLEX_FLOAT);
+}
 
 /* On 64-bit Linux and Freebsd systems, possibly switch the long double library
    function names from <foo>l to <foo>f128 if the default long double type is
@@ -28243,11 +28216,53 @@ rs6000_starting_frame_offset (void)
    only do this transformation if the __float128 type is enabled.  This
    prevents us from doing the transformation on older 32-bit ports that might
    have enabled using IEEE 128-bit floating point as the default long double
-   type.  */
+   type.
+
+   We also use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the
+   function names used for complex multiply and divide to the appropriate
+   names.  */
 
 static tree
 rs6000_mangle_decl_assembler_name (tree decl, tree id)
 {
+  /* Handle complex multiply/divide.  For IEEE 128-bit, use __mulkc3 or
+     __divkc3 and for IBM 128-bit use __multc3 and __divtc3.  */
+  if (TARGET_FLOAT128_TYPE
+      && TREE_CODE (decl) == FUNCTION_DECL
+      && DECL_IS_UNDECLARED_BUILTIN (decl)
+      && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
+    {
+      built_in_function id = DECL_FUNCTION_CODE (decl);
+      const char *newname = NULL;
+
+      if (id == complex_multiply_builtin_code (KCmode))
+	newname = "__mulkc3";
+
+      else if (id == complex_multiply_builtin_code (ICmode))
+	newname = "__multc3";
+
+      else if (id == complex_multiply_builtin_code (TCmode))
+	newname = (TARGET_IEEEQUAD) ? "__mulkc3" : "__multc3";
+
+      else if (id == complex_divide_builtin_code (KCmode))
+	newname = "__divkc3";
+
+      else if (id == complex_divide_builtin_code (ICmode))
+	newname = "__divtc3";
+
+      else if (id == complex_divide_builtin_code (TCmode))
+	newname = (TARGET_IEEEQUAD) ? "__divkc3" : "__divtc3";
+
+      if (newname)
+	{
+	  if (TARGET_DEBUG_BUILTIN)
+	    fprintf (stderr, "Map complex mul/div => %s\n", newname);
+
+	  return get_identifier (newname);
+	}
+    }
+
+  /* Map long double built-in functions if long double is IEEE 128-bit.  */
   if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
       && TREE_CODE (decl) == FUNCTION_DECL
       && DECL_IS_UNDECLARED_BUILTIN (decl)
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-1.c b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
new file mode 100644
index 00000000000..1cc6b1be904
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+   double is IEEE 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-2.c b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
new file mode 100644
index 00000000000..8ff342e0116
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+   double is IBM 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-1.c b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
new file mode 100644
index 00000000000..4cd773c4b06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+   double is IEEE 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-2.c b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
new file mode 100644
index 00000000000..36fe8bc3061
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+   double is IBM 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
-- 
2.39.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com

next prev parent reply	other threads:[~2023-02-03  5:53 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-03  5:43 [PATCH 0/2] Repost of patches for solving the build on Fedora 36 problem Michael Meissner
2023-02-03  5:49 ` [PATCH 1/2] PR target/107299: Fix build issue when long double is IEEE 128-bit Michael Meissner
2023-02-22 10:37   ` Kewen.Lin
2023-02-22 18:11     ` Michael Meissner
2023-03-02 22:45   ` Ping: " Michael Meissner
2023-03-03 20:56   ` Segher Boessenkool
2023-02-03  5:53 ` Michael Meissner [this message]
2023-02-22 10:13   ` [PATCH 2/2] Rework 128-bit complex multiply and divide Kewen.Lin
2023-02-22 18:01     ` Michael Meissner
2023-03-02 22:46   ` Ping: " Michael Meissner
2023-03-03 21:35   ` Segher Boessenkool
2023-03-09 16:11     ` Michael Meissner
2023-03-09 22:16       ` Segher Boessenkool
2023-03-09 23:23         ` Michael Meissner
2023-03-09 16:42     ` Michael Meissner
2023-02-03  7:42 ` [PATCH 0/2] Repost of patches for solving the build on Fedora 36 problem Richard Biener
2023-02-06 18:28   ` Peter Bergner
2023-02-07 14:22     ` Richard Biener
2023-02-07 14:31       ` Jakub Jelinek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9yhQQdUaM+z6IYD@toto.the-meissners.org \
    --to=meissner@linux.ibm.com \
    --cc=bergner@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=linkw@linux.ibm.com \
    --cc=segher@kernel.crashing.org \
    --cc=will_schmidt@vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).