From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 9F9E83832363; Tue, 6 Dec 2022 09:37:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9F9E83832363 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2B678Cb8024284; Tue, 6 Dec 2022 09:37:06 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : references : cc : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=O6Vf8toDkwHu9eokNIoox+kYVjeU3KImLdWsdPXvhMs=; b=r/3R3O6/2w17Dmw1mQzKMVlpihLCEe0L5Gsh9TrjheGwGh2+lNtRzZAJcZ0wbHOpaVKY VvBPvFOrw+5WnTHj5kdtTDXMPDqeqdVRESCbgVGj7ir5NrTX6O4ihJT/kYj7sMcxFBxj JjSEAnmDatw36/raASk3fbkPNaeFsJQG5/wR9ayxv3iYCJm8/NUfN93vUza/3TPNPy4E FKr0stpdrmQNPm1DFdVH7t48GkNhfmyk4oSQyVLg7jRPN9txwiuEIgi/oqtKZYCITGt6 EEQWA7BKjpkPxh43j5AND7+WklevgdstH4WIr2j3Uc+Si2pozIaOzvG15mvAgLa9twB1 Hw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3m8g4kemgc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 06 Dec 2022 09:37:05 +0000 Received: from m0098417.ppops.net (m0098417.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 2B67rZZA008934; Tue, 6 Dec 2022 09:37:05 GMT Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3m8g4kemfr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 06 Dec 2022 09:37:05 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 2B69FxE1008325; Tue, 6 Dec 2022 09:37:03 GMT Received: from smtprelay01.fra02v.mail.ibm.com ([9.218.2.227]) by ppma03ams.nl.ibm.com (PPS) with ESMTPS id 3m9m5y14dv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 06 Dec 2022 09:37:02 +0000 Received: from d06av24.portsmouth.uk.ibm.com ([9.149.105.60]) by smtprelay01.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2B69axo036962618 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 6 Dec 2022 09:36:59 GMT Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 136C142041; Tue, 6 Dec 2022 09:36:59 +0000 (GMT) Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 053AD4203F; Tue, 6 Dec 2022 09:36:56 +0000 (GMT) Received: from [9.197.224.206] (unknown [9.197.224.206]) by d06av24.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 6 Dec 2022 09:36:55 +0000 (GMT) Message-ID: <997752a6-8cd4-abc5-d6e3-2e75eaa37d57@linux.ibm.com> Date: Tue, 6 Dec 2022 17:36:54 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 Subject: Re: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299 Content-Language: en-US To: Michael Meissner References: Cc: gcc-patches@gcc.gnu.org, Segher Boessenkool , David Edelsohn , Peter Bergner , Will Schmidt , William Seurer From: "Kewen.Lin" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: _6BWnm17yTKkgEO8X9b_dlWaDPBUpUbE X-Proofpoint-ORIG-GUID: 1igiEZXVuFTRmR_rcH1RKa14D0kV2nlj X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-06_05,2022-12-05_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 mlxlogscore=999 lowpriorityscore=0 adultscore=0 phishscore=0 priorityscore=1501 impostorscore=0 spamscore=0 malwarescore=0 clxscore=1011 suspectscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2212060078 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_NUMSUBJECT,KAM_SHORT,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Mike, Thanks for fixing this! on 2022/11/2 10:40, Michael Meissner wrote: > This function reworks how the complex multiply and divide built-in functions are > done. Previously we created built-in declarations for doing long double complex > multiply and divide when long double is IEEE 128-bit. The old code also did not > support __ibm128 complex multiply and divide if long double is IEEE 128-bit. > > In terms of history, I wrote the original code just as I was starting to test > GCC on systems where IEEE 128-bit long double was the default. At the time, we > had not yet started mangling the built-in function names as a way to bridge > going from a system with 128-bit IBM long double to 128-bin IEEE long double. ~~~ bit > > The original code depends on there only being two 128-bit types invovled. With ~~~~~~ involved. > the next patch in this series, this assumption will no longer be true. When > long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the > explicit __float128/_Float128 type and one for long double). > > The problem is we cannot create two separate built-in functions that resolve to > the same name. This is a requirement of add_builtin_function and the C front > end. That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can > only use 2 of them. > > This code does not create the built-in declaration with the changed name. > Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name > before it is written out to the assembler file like it now does for all of the > other long double built-in functions. > > We need to disable using this mapping when we are building libgcc, specifically > when it is building the floating point 128-bit multiply and divide functions. > The flag that is used when libgcc is built (-fbuilding-libcc) is only available > in the C/C++ front ends. We need to remember that we are building libgcc in the > rs6000-c.cc support to be able to use this later to decided whether to mangle > the decl assembler name or not. IIUC, for the building of floating point 128-bit multiply and divide functions, the mapping seems still to work fine. Using the multiply as example here, when compiling _multc3.o, it's with -mabi=ibmlongdouble, it maps the name with "tc" which is consistent with what we need. While compiling _mulkc3*.o, we would check the macro __LONG_DOUBLE_IEEE128__ and use either KCmode or TCmode, either of the mapping result would be "kc". Could you help to elaborate why we need to disable it during libgcc building? BR, Kewen > > When I wrote these patches, I discovered that __ibm128 complex multiply and > divide had originally not been supported if long double is IEEE 128-bit as it > would generate calls to __mulic3 and __divic3. I added tests in the testsuite > to verify that the correct name (i.e. __multc3 and __divtc3) is used in this > case. > > I tested all 3 patchs for PR target/107299 on: > > 1) LE Power10 using --with-cpu=power10 --with-long-double-format=ieee > 2) LE Power10 using --with-cpu=power10 --with-long-double-format=ibm > 3) LE Power9 using --with-cpu=power9 --with-long-double-format=ibm > 4) BE Power8 using --with-cpu=power8 --with-long-double-format=ibm > > Once all 3 patches have been applied, we can once again build GCC when long > double is IEEE 128-bit. There were no other regressions with these patches. > Can I check these patches into the trunk? > > 2022-11-01 Michael Meissner > > gcc/ > > PR target/107299 > * config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Set > building_libgcc. > * config/rs6000/rs6000.cc (create_complex_muldiv): Delete. > (init_float128_ieee): Delete code to switch complex multiply and divide > for long double. > (complex_multiply_builtin_code): New helper function. > (complex_divide_builtin_code): Likewise. > (rs6000_mangle_decl_assembler_name): Add support for mangling the name > of complex 128-bit multiply and divide built-in functions. > * config/rs6000/rs6000.opt (building_libgcc): New target variable. > > gcc/testsuite/ > > PR target/107299 > * gcc.target/powerpc/divic3-1.c: New test. > * gcc.target/powerpc/divic3-2.c: Likewise. > * gcc.target/powerpc/mulic3-1.c: Likewise. > * gcc.target/powerpc/mulic3-2.c: Likewise. > --- > gcc/config/rs6000/rs6000-c.cc | 8 ++ > gcc/config/rs6000/rs6000.cc | 110 +++++++++++--------- > gcc/config/rs6000/rs6000.opt | 4 + > gcc/testsuite/gcc.target/powerpc/divic3-1.c | 18 ++++ > gcc/testsuite/gcc.target/powerpc/divic3-2.c | 17 +++ > gcc/testsuite/gcc.target/powerpc/mulic3-1.c | 18 ++++ > gcc/testsuite/gcc.target/powerpc/mulic3-2.c | 17 +++ > 7 files changed, 145 insertions(+), 47 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c > create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c > create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c > create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c > > diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc > index 56609462629..5c2f3bcee9f 100644 > --- a/gcc/config/rs6000/rs6000-c.cc > +++ b/gcc/config/rs6000/rs6000-c.cc > @@ -780,6 +780,14 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile) > || DEFAULT_ABI == ABI_ELFv2 > || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm)) > builtin_define ("__STRUCT_PARM_ALIGN__=16"); > + > + /* Store whether or not we are building libgcc. This is needed to disable > + generating the alternate names for 128-bit complex multiply and divide. > + We need to disable generating __multc3, __divtc3, __mulkc3, and __divkc3 > + when we are building those functions in libgcc. The variable > + flag_building_libgcc is only available for the C family of front-ends. > + We set this variable here to disable generating the alternate names. */ > + building_libgcc = flag_building_libgcc; > } > > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index a85d7630b41..cfb6227e27b 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -11123,26 +11123,6 @@ init_float128_ibm (machine_mode mode) > } > } > > -/* Create a decl for either complex long double multiply or complex long double > - divide when long double is IEEE 128-bit floating point. We can't use > - __multc3 and __divtc3 because the original long double using IBM extended > - double used those names. The complex multiply/divide functions are encoded > - as builtin functions with a complex result and 4 scalar inputs. */ > - > -static void > -create_complex_muldiv (const char *name, built_in_function fncode, tree fntype) > -{ > - tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL, > - name, NULL_TREE); > - > - set_builtin_decl (fncode, fndecl, true); > - > - if (TARGET_DEBUG_BUILTIN) > - fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode); > - > - return; > -} > - > /* Set up IEEE 128-bit floating point routines. Use different names if the > arguments can be passed in a vector register. The historical PowerPC > implementation of IEEE 128-bit floating point used _q_ for the names, so > @@ -11154,32 +11134,6 @@ init_float128_ieee (machine_mode mode) > { > if (FLOAT128_VECTOR_P (mode)) > { > - static bool complex_muldiv_init_p = false; > - > - /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble. If > - we have clone or target attributes, this will be called a second > - time. We want to create the built-in function only once. */ > - if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p) > - { > - complex_muldiv_init_p = true; > - built_in_function fncode_mul = > - (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode > - - MIN_MODE_COMPLEX_FLOAT); > - built_in_function fncode_div = > - (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + TCmode > - - MIN_MODE_COMPLEX_FLOAT); > - > - tree fntype = build_function_type_list (complex_long_double_type_node, > - long_double_type_node, > - long_double_type_node, > - long_double_type_node, > - long_double_type_node, > - NULL_TREE); > - > - create_complex_muldiv ("__mulkc3", fncode_mul, fntype); > - create_complex_muldiv ("__divkc3", fncode_div, fntype); > - } > - > set_optab_libfunc (add_optab, mode, "__addkf3"); > set_optab_libfunc (sub_optab, mode, "__subkf3"); > set_optab_libfunc (neg_optab, mode, "__negkf2"); > @@ -28142,6 +28096,25 @@ rs6000_starting_frame_offset (void) > return RS6000_STARTING_FRAME_OFFSET; > } > > +/* Internal function to return the built-in function id for the complex > + multiply operation for a given mode. */ > + > +static inline built_in_function > +complex_multiply_builtin_code (machine_mode mode) > +{ > + return (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + mode > + - MIN_MODE_COMPLEX_FLOAT); > +} > + > +/* Internal function to return the built-in function id for the complex divide > + operation for a given mode. */ > + > +static inline built_in_function > +complex_divide_builtin_code (machine_mode mode) > +{ > + return (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + mode > + - MIN_MODE_COMPLEX_FLOAT); > +} > > /* On 64-bit Linux and Freebsd systems, possibly switch the long double library > function names from l to f128 if the default long double type is > @@ -28160,11 +28133,54 @@ rs6000_starting_frame_offset (void) > only do this transformation if the __float128 type is enabled. This > prevents us from doing the transformation on older 32-bit ports that might > have enabled using IEEE 128-bit floating point as the default long double > - type. */ > + type. > + > + We also use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the > + function names used for complex multiply and divide to the appropriate > + names. */ > > static tree > rs6000_mangle_decl_assembler_name (tree decl, tree id) > { > + /* Handle complex multiply/divide. For IEEE 128-bit, use __mulkc3 or > + __divkc3 and for IBM 128-bit use __multc3 and __divtc3. */ > + if (TARGET_FLOAT128_TYPE > + && !building_libgcc > + && TREE_CODE (decl) == FUNCTION_DECL > + && DECL_IS_UNDECLARED_BUILTIN (decl) > + && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) > + { > + built_in_function id = DECL_FUNCTION_CODE (decl); > + const char *newname = NULL; > + > + if (id == complex_multiply_builtin_code (KCmode)) > + newname = "__mulkc3"; > + > + else if (id == complex_multiply_builtin_code (ICmode)) > + newname = "__multc3"; > + > + else if (id == complex_multiply_builtin_code (TCmode)) > + newname = (TARGET_IEEEQUAD) ? "__mulkc3" : "__multc3"; > + > + else if (id == complex_divide_builtin_code (KCmode)) > + newname = "__divkc3"; > + > + else if (id == complex_divide_builtin_code (ICmode)) > + newname = "__divtc3"; > + > + else if (id == complex_divide_builtin_code (TCmode)) > + newname = (TARGET_IEEEQUAD) ? "__divkc3" : "__divtc3"; > + > + if (newname) > + { > + if (TARGET_DEBUG_BUILTIN) > + fprintf (stderr, "Map complex mul/div => %s\n", newname); > + > + return get_identifier (newname); > + } > + } > + > + /* Map long double built-in functions if long double is IEEE 128-bit. */ > if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 > && TREE_CODE (decl) == FUNCTION_DECL > && DECL_IS_UNDECLARED_BUILTIN (decl) > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt > index b63a5d443af..e2de03dda92 100644 > --- a/gcc/config/rs6000/rs6000.opt > +++ b/gcc/config/rs6000/rs6000.opt > @@ -100,6 +100,10 @@ unsigned int rs6000_recip_control > TargetVariable > unsigned int rs6000_debug > > +;; Whether we are building libgcc or not. > +TargetVariable > +bool building_libgcc = false > + > ;; Whether to enable the -mfloat128 stuff without necessarily enabling the > ;; __float128 keyword. > TargetSave > diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-1.c b/gcc/testsuite/gcc.target/powerpc/divic3-1.c > new file mode 100644 > index 00000000000..1cc6b1be904 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c > @@ -0,0 +1,18 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target powerpc_p8vector_ok } */ > +/* { dg-require-effective-target longdouble128 } */ > +/* { dg-require-effective-target ppc_float128_sw } */ > +/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */ > + > +/* Check that complex divide generates the right call for __ibm128 when long > + double is IEEE 128-bit floating point. */ > + > +typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__))); > + > +void > +divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r) > +{ > + *p = *q / *r; > +} > + > +/* { dg-final { scan-assembler "bl __divtc3" } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/divic3-2.c b/gcc/testsuite/gcc.target/powerpc/divic3-2.c > new file mode 100644 > index 00000000000..8ff342e0116 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/divic3-2.c > @@ -0,0 +1,17 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target powerpc_p8vector_ok } */ > +/* { dg-require-effective-target longdouble128 } */ > +/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */ > + > +/* Check that complex divide generates the right call for __ibm128 when long > + double is IBM 128-bit floating point. */ > + > +typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__))); > + > +void > +divide (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r) > +{ > + *p = *q / *r; > +} > + > +/* { dg-final { scan-assembler "bl __divtc3" } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-1.c b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c > new file mode 100644 > index 00000000000..4cd773c4b06 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/mulic3-1.c > @@ -0,0 +1,18 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target powerpc_p8vector_ok } */ > +/* { dg-require-effective-target longdouble128 } */ > +/* { dg-require-effective-target ppc_float128_sw } */ > +/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */ > + > +/* Check that complex multiply generates the right call for __ibm128 when long > + double is IEEE 128-bit floating point. */ > + > +typedef _Complex long double c_ibm128_t __attribute__((mode(__IC__))); > + > +void > +multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r) > +{ > + *p = *q * *r; > +} > + > +/* { dg-final { scan-assembler "bl __multc3" } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/mulic3-2.c b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c > new file mode 100644 > index 00000000000..36fe8bc3061 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/mulic3-2.c > @@ -0,0 +1,17 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target powerpc_p8vector_ok } */ > +/* { dg-require-effective-target longdouble128 } */ > +/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */ > + > +/* Check that complex multiply generates the right call for __ibm128 when long > + double is IBM 128-bit floating point. */ > + > +typedef _Complex long double c_ibm128_t __attribute__((mode(__TC__))); > + > +void > +multiply (c_ibm128_t *p, c_ibm128_t *q, c_ibm128_t *r) > +{ > + *p = *q * *r; > +} > + > +/* { dg-final { scan-assembler "bl __multc3" } } */