From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=KCUi=7C=linux.ibm.com=meissner@sourceware.org>
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1])
	by sourceware.org (Postfix) with ESMTPS id C6A283858C5E
	for <gcc-patches@gcc.gnu.org>; Fri, 10 Mar 2023 01:40:50 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C6A283858C5E
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com
Received: from pps.filterd (m0098396.ppops.net [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 32A0oGjM004559;
	Fri, 10 Mar 2023 01:40:48 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=date : from : to :
 subject : message-id : mime-version : content-type; s=pp1;
 bh=hpfCqgyyZ5R7NT6VIiOWDoDpr5waYkpSRmu4gUldsDI=;
 b=KHh4q7E5IAG9YacGPRCnB53m5iiJI0VaWXJElnfdnCOQAQteDb+zXV8WXVzvuFevNX3q
 io9pkeAFJAD2cW9AIBhwX6cIQuZnP9JZKf3sh+YiWtrlLdKov55EbZ3fYwN2eBykhvm7
 +waKMv8RXAXZQmbBXeK2CDxC53hfmKZDOC/r8AtwL5MuEWPdP9dh8kHREWgLv4wFXMPG
 Jugvh5urEgIeLNfC8UHf1g2N77ZmP766iHDyLtVGxcyQcFffUxLWHngsE6YRTrH/se2V
 nMgcHgZS1NaVUN9W0TWsvQGKTXPCPgLGl9cu71mNEgm/7Huqd7Ze1I3tzJyzKatHvJEO 6A== 
Received: from pps.reinject (localhost [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3p6rdhjja6-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Fri, 10 Mar 2023 01:40:48 +0000
Received: from m0098396.ppops.net (m0098396.ppops.net [127.0.0.1])
	by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 32A1MFcj020924;
	Fri, 10 Mar 2023 01:40:41 GMT
Received: from ppma05wdc.us.ibm.com (1b.90.2fa9.ip4.static.sl-reverse.com [169.47.144.27])
	by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3p6rdhjj9s-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Fri, 10 Mar 2023 01:40:41 +0000
Received: from pps.filterd (ppma05wdc.us.ibm.com [127.0.0.1])
	by ppma05wdc.us.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 32A16fAp014457;
	Fri, 10 Mar 2023 01:40:40 GMT
Received: from smtprelay03.wdc07v.mail.ibm.com ([9.208.129.113])
	by ppma05wdc.us.ibm.com (PPS) with ESMTPS id 3p6fhhvcsn-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Fri, 10 Mar 2023 01:40:40 +0000
Received: from smtpav04.dal12v.mail.ibm.com (smtpav04.dal12v.mail.ibm.com [10.241.53.103])
	by smtprelay03.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 32A1ecLI23069428
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
	Fri, 10 Mar 2023 01:40:38 GMT
Received: from smtpav04.dal12v.mail.ibm.com (unknown [127.0.0.1])
	by IMSVA (Postfix) with ESMTP id 26FF558062;
	Fri, 10 Mar 2023 01:40:38 +0000 (GMT)
Received: from smtpav04.dal12v.mail.ibm.com (unknown [127.0.0.1])
	by IMSVA (Postfix) with ESMTP id A1DDE5805E;
	Fri, 10 Mar 2023 01:40:37 +0000 (GMT)
Received: from toto.the-meissners.org (unknown [9.65.233.34])
	by smtpav04.dal12v.mail.ibm.com (Postfix) with ESMTPS;
	Fri, 10 Mar 2023 01:40:37 +0000 (GMT)
Date: Thu, 9 Mar 2023 20:40:36 -0500
From: Michael Meissner <meissner@linux.ibm.com>
To: gcc-patches@gcc.gnu.org, Michael Meissner <meissner@linux.ibm.com>,
        Segher Boessenkool <segher@kernel.crashing.org>,
        "Kewen.Lin" <linkw@linux.ibm.com>, David Edelsohn <dje.gcc@gmail.com>,
        Peter Bergner <bergner@linux.ibm.com>,
        Will Schmidt <will_schmidt@vnet.ibm.com>
Subject: [PATCH V4] Rework 128-bit complex multiply and divide.
Message-ID: <ZAqKlNLJl4jMFGVa@toto.the-meissners.org>
Mail-Followup-To: Michael Meissner <meissner@linux.ibm.com>,
	gcc-patches@gcc.gnu.org,
	Segher Boessenkool <segher@kernel.crashing.org>,
	"Kewen.Lin" <linkw@linux.ibm.com>,
	David Edelsohn <dje.gcc@gmail.com>,
	Peter Bergner <bergner@linux.ibm.com>,
	Will Schmidt <will_schmidt@vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: 1qNocGp5h_SlAa2kuoGA5UuuT8sbMZCQ
X-Proofpoint-ORIG-GUID: Sz2jxKO7k1FH9YGNjUWjh-HZrllHmP4_
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22
 definitions=2023-03-09_14,2023-03-09_01,2023-02-09_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 phishscore=0
 lowpriorityscore=0 bulkscore=0 suspectscore=0 mlxlogscore=999
 priorityscore=1501 clxscore=1015 adultscore=0 mlxscore=0 spamscore=0
 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2212070000 definitions=main-2303100007
X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_MANYTO,KAM_SHORT,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

This patch reworks how the complex multiply and divide built-in functions are
done.  Previously GCC created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit.  However, it did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.

This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.

Originally, the patch was part of a larger patch set and the comments reflected
this.  I have removed the comments referring to the other patches.  While this
patch was originally developed as part of those other patches, it is a stand
alone patch.

I have tried to take the comments in the last patch review in this patch.
Note, I will be away from the computer from March 10 through the 13th.  So I
would not be checking in the patches until I get back.  But I thought I would
share the results of the changes that were asked for.

I fixed the complex_multiply_builtin_code and complex_divide_builtin_code
functions to have an assert tht the mode is within the proper modes.  I have
tried to make the code a little bit clearer.

I have cleaned up the tests to eliminate the target powerpc in the tests.  I
have elimited the -mpower8-vector option.  I have changed the scan assembler
lines jut to look for __divtc3 or __multc3, and not depend on the format of the
'bl' call to those functions.  I have kept the -Wno-psabi option, because this
is needed to prevent spurious errors on systems with older libraries (like big
endian) that don't have IEEE 128-bit support.

2023-03-09   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	PR target/109067
	* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
	(init_float128_ieee): Delete code to switch complex multiply and divide
	for long double.
	(complex_multiply_builtin_code): New helper function.
	(complex_divide_builtin_code): Likewise.
	(rs6000_mangle_decl_assembler_name): Add support for mangling the name
	of complex 128-bit multiply and divide built-in functions.

gcc/testsuite/

	PR target/109067
	* gcc.target/powerpc/divic3-1.c: New test.
	* gcc.target/powerpc/divic3-2.c: Likewise.
	* gcc.target/powerpc/mulic3-1.c: Likewise.
	* gcc.target/powerpc/mulic3-2.c: Likewise.
---
 gcc/config/rs6000/rs6000.cc                 | 111 +++++++++++---------
 gcc/testsuite/gcc.target/powerpc/divic3-1.c |  21 ++++
 gcc/testsuite/gcc.target/powerpc/divic3-2.c |  25 +++++
 gcc/testsuite/gcc.target/powerpc/mulic3-1.c |  21 ++++
 gcc/testsuite/gcc.target/powerpc/mulic3-2.c |  25 +++++
 5 files changed, 156 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 8e0b0d022db..fa5f93a874f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11154,26 +11154,6 @@ init_float128_ibm (machine_mode mode)
     }
 }
 
-/* Create a decl for either complex long double multiply or complex long double
-   divide when long double is IEEE 128-bit floating point.  We can't use
-   __multc3 and __divtc3 because the original long double using IBM extended
-   double used those names.  The complex multiply/divide functions are encoded
-   as builtin functions with a complex result and 4 scalar inputs.  */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
-  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
-				      name, NULL_TREE);
-
-  set_builtin_decl (fncode, fndecl, true);
-
-  if (TARGET_DEBUG_BUILTIN)
-    fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
-  return;
-}
-
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
    arguments can be passed in a vector register.  The historical PowerPC
    implementation of IEEE 128-bit floating point used _q_<op> for the names, so
@@ -11185,32 +11165,6 @@ init_float128_ieee (machine_mode mode)
 {
   if (FLOAT128_VECTOR_P (mode))
     {
-      static bool complex_muldiv_init_p = false;
-
-      /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  If
-	 we have clone or target attributes, this will be called a second
-	 time.  We want to create the built-in function only once.  */
-     if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
-       {
-	 complex_muldiv_init_p = true;
-	 built_in_function fncode_mul =
-	   (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
-				- MIN_MODE_COMPLEX_FLOAT);
-	 built_in_function fncode_div =
-	   (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + TCmode
-				- MIN_MODE_COMPLEX_FLOAT);
-
-	 tree fntype = build_function_type_list (complex_long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 long_double_type_node,
-						 NULL_TREE);
-
-	 create_complex_muldiv ("__mulkc3", fncode_mul, fntype);
-	 create_complex_muldiv ("__divkc3", fncode_div, fntype);
-       }
-
       set_optab_libfunc (add_optab, mode, "__addkf3");
       set_optab_libfunc (sub_optab, mode, "__subkf3");
       set_optab_libfunc (neg_optab, mode, "__negkf2");
@@ -28228,6 +28182,27 @@ rs6000_starting_frame_offset (void)
   return RS6000_STARTING_FRAME_OFFSET;
 }
 
+/* Internal function to return the built-in function id for the complex
+   multiply operation for a given mode.  */
+
+static inline built_in_function
+complex_multiply_builtin_code (machine_mode mode)
+{
+  gcc_assert (IN_RANGE (mode, MIN_MODE_COMPLEX_FLOAT, MAX_MODE_COMPLEX_FLOAT));
+  int func = BUILT_IN_COMPLEX_MUL_MIN + mode - MIN_MODE_COMPLEX_FLOAT;
+  return (built_in_function) func;
+}
+
+/* Internal function to return the built-in function id for the complex divide
+   operation for a given mode.  */
+
+static inline built_in_function
+complex_divide_builtin_code (machine_mode mode)
+{
+  gcc_assert (IN_RANGE (mode, MIN_MODE_COMPLEX_FLOAT, MAX_MODE_COMPLEX_FLOAT));
+  int func = BUILT_IN_COMPLEX_DIV_MIN + mode - MIN_MODE_COMPLEX_FLOAT;
+  return (built_in_function) func;
+}
 
 /* On 64-bit Linux and Freebsd systems, possibly switch the long double library
    function names from <foo>l to <foo>f128 if the default long double type is
@@ -28246,11 +28221,53 @@ rs6000_starting_frame_offset (void)
    only do this transformation if the __float128 type is enabled.  This
    prevents us from doing the transformation on older 32-bit ports that might
    have enabled using IEEE 128-bit floating point as the default long double
-   type.  */
+   type.
+
+   We also use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the
+   function names used for complex multiply and divide to the appropriate
+   names.  */
 
 static tree
 rs6000_mangle_decl_assembler_name (tree decl, tree id)
 {
+  /* Handle complex multiply/divide.  For IEEE 128-bit, use __mulkc3 or
+     __divkc3 and for IBM 128-bit use __multc3 and __divtc3.  */
+  if (TARGET_FLOAT128_TYPE
+      && TREE_CODE (decl) == FUNCTION_DECL
+      && DECL_IS_UNDECLARED_BUILTIN (decl)
+      && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
+    {
+      built_in_function id = DECL_FUNCTION_CODE (decl);
+      const char *newname = NULL;
+
+      if (id == complex_multiply_builtin_code (KCmode))
+	newname = "__mulkc3";
+
+      else if (id == complex_multiply_builtin_code (ICmode))
+	newname = "__multc3";
+
+      else if (id == complex_multiply_builtin_code (TCmode))
+	newname = (TARGET_IEEEQUAD) ? "__mulkc3" : "__multc3";
+
+      else if (id == complex_divide_builtin_code (KCmode))
+	newname = "__divkc3";
+
+      else if (id == complex_divide_builtin_code (ICmode))
+	newname = "__divtc3";
+
+      else if (id == complex_divide_builtin_code (TCmode))
+	newname = (TARGET_IEEEQUAD) ? "__divkc3" : "__divtc3";
+
+      if (newname)
+	{
+	  if (TARGET_DEBUG_BUILTIN)
+	    fprintf (stderr, "Map complex mul/div => %s\n", newname);
+
+	  return get_identifier (newname);
+	}
+    }
+
+  /* Map long double built-in functions if long double is IEEE 128-bit.  */
   if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
       && TREE_CODE (decl) == FUNCTION_DECL
       && DECL_IS_UNDECLARED_BUILTIN (decl)
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-1.c b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
new file mode 100644
index 00000000000..e9759b85106
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.  */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+   double is IEEE 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "__divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-2.c b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
new file mode 100644
index 00000000000..819331e1296
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/divic3-2.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.
+
+   We also need to require that the default long double is 128-bits, otherwise
+   the TC/TF modes might not be available.  */
+
+/* Check that complex divide generates the right call for __ibm128 when long
+   double is IBM 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "__divtc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-1.c b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
new file mode 100644
index 00000000000..d4045177371
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.  */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+   double is IEEE 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "__multc3" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-2.c b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
new file mode 100644
index 00000000000..97057d3b327
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* When GCC is configured with an older library that does not support IEEE
+   128-bit, it issues a warning if you change the long double type. We use
+   -Wno-psabi to silence this warning.  Since this is a code generation test,
+   it does not matter if the library has full IEEE 128-bit support.
+
+   We also need to require that the default long double is 128-bits, otherwise
+   the TC/TF modes might not be available.  */
+
+/* Check that complex multiply generates the right call for __ibm128 when long
+   double is IBM 128-bit floating point.  */
+
+typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__)));
+
+void
+multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "__multc3" } } */
-- 
2.39.2


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com