From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 6ADE238558A7 for ; Mon, 5 Jun 2023 08:45:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6ADE238558A7 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0353723.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3558OGNN005806; Mon, 5 Jun 2023 08:45:12 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=/obQXGC6l8ForPOqkWQLqOHxEGOr17dvyf0re+ymCuI=; b=fcQbR77fgWD/1YLtjKj8do3h8196qzMK9SjryIug7XNsQWtgaJEeSea5G4nUBwLkehU/ GkQdi9zPtUM1+zeRkFbbqfRLMoXBn2HDhK4P03iNrmP/L/0waIgDj4YgsSsnq+UZhUYb SRe2sH7mx1OkVlfqNY5jdbpq0tLe8ihbvoxsLTji4vqxd2xaXQMyOU4f8uWDN9TRte2q wvmz7Qi/E7RF4tJwO34E0FxGOrJoFWSqihnLJ5S8hMTjoyoSfWd4Xh6+V8NBMkFSv6dq gl4uMvP6us6qvkmzJ+piXlVoUapPjfdafkXJ5XnrNpkf+zn8uKqYF9llUosG8SL1UaKP zg== Received: from ppma02fra.de.ibm.com (47.49.7a9f.ip4.static.sl-reverse.com [159.122.73.71]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3r1c5u8eyt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 05 Jun 2023 08:45:11 +0000 Received: from pps.filterd (ppma02fra.de.ibm.com [127.0.0.1]) by ppma02fra.de.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 3555pfEB022635; Mon, 5 Jun 2023 08:45:10 GMT Received: from smtprelay01.fra02v.mail.ibm.com ([9.218.2.227]) by ppma02fra.de.ibm.com (PPS) with ESMTPS id 3qyxdf8ypr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 05 Jun 2023 08:45:09 +0000 Received: from smtpav01.fra02v.mail.ibm.com (smtpav01.fra02v.mail.ibm.com [10.20.54.100]) by smtprelay01.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 3558j7pd37683710 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 5 Jun 2023 08:45:07 GMT Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 78F0720040; Mon, 5 Jun 2023 08:45:07 +0000 (GMT) Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2FA3920043; Mon, 5 Jun 2023 08:45:05 +0000 (GMT) Received: from [9.177.27.14] (unknown [9.177.27.14]) by smtpav01.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 5 Jun 2023 08:45:04 +0000 (GMT) Message-ID: Date: Mon, 5 Jun 2023 16:45:03 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 Subject: Re: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values Content-Language: en-US To: Carl Love Cc: Peter Bergner , Segher Boessenkool , gcc-patches@gcc.gnu.org References: From: "Kewen.Lin" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: nDo4Bi3hPrFdM1BAmG4FS2aT1zWD4tr8 X-Proofpoint-ORIG-GUID: nDo4Bi3hPrFdM1BAmG4FS2aT1zWD4tr8 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.176.26 definitions=2023-06-03_08,2023-06-02_02,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 mlxscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 suspectscore=0 clxscore=1015 impostorscore=0 mlxlogscore=999 priorityscore=1501 spamscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2304280000 definitions=main-2306050077 X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Carl, on 2023/5/2 23:52, Carl Love via Gcc-patches wrote: > GCC maintainers: > > The following patch adds three buitins for inserting and extracting the > exponent and significand for an IEEE 128-bit floating point values. > The builtins are valid for Power 9 and Power 10. We already have: unsigned long long int scalar_extract_exp (__ieee128 source); unsigned __int128 scalar_extract_sig (__ieee128 source); ieee_128 scalar_insert_exp (unsigned __int128 significand, unsigned long long int exponent); ieee_128 scalar_insert_exp (ieee_128 significand, unsigned long long int exponent); you need to say something about the requirements or the justification for adding more, for this patch itself, some comments are inline below, thanks! > > The patch has been tested on both Power 9 and Power 10. > > Please let me know if this patch is acceptable for mainline. Thanks. > > Carl > > > ---------------------- > From a20cc81f98cce1140fc95775a7c25b55d1ca7cba Mon Sep 17 00:00:00 2001 > From: Carl Love > Date: Wed, 12 Apr 2023 17:46:37 -0400 > Subject: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values > > Add support for the following builtins: > > __vector unsigned long long int __builtin_extractf128_exp (__ieee128); Could you make the name similar to the existing one? The existing one unsigned long long int scalar_extract_exp (__ieee128 source); has nothing like f128 on its name, this variant is just to change the return type to vector type, how about scalar_extract_exp_to_vec? > __vector unsigned __int128 __builtin_extractf128_sig (__ieee128); Ditto. > __ieee128 __builtin_insertf128_exp (__vector unsigned __int128, > __vector unsigned long long); This one can just overload the existing scalar_insert_exp? > > gcc/ > * config/rs6000/rs6000-buildin.def (__builtin_extractf128_exp, > __builtin_extractf128_sig, __builtin_insertf128_exp): Add new > builtin definitions. > * config/rs6000.md (extractf128_exp_, insertf128_exp_, > extractf128_sig_): Add define_expand for new builtins. > (xsxexpqp_f128_, xsxsigqp_f128_, siexpqpf_f128_): > Add define_insn for new builtins. > * doc/extend.texi (__builtin_extractf128_exp, __builtin_extractf128_sig, > __builtin_insertf128_exp): Add documentation for new builtins. > > gcc/testsuite/ > * gcc.target/powerpc/bfp/extract-exp-ieee128.c: New test case. > * gcc.target/powerpc/bfp/extract-sig-ieee128.c: New test case. > * gcc.target/powerpc/bfp/insert-exp-ieee128.c: New test case. > --- > gcc/config/rs6000/rs6000-builtins.def | 9 +++ > gcc/config/rs6000/vsx.md | 66 ++++++++++++++++++- > gcc/doc/extend.texi | 28 ++++++++ > .../powerpc/bfp/extract-exp-ieee128.c | 49 ++++++++++++++ > .../powerpc/bfp/extract-sig-ieee128.c | 56 ++++++++++++++++ > .../powerpc/bfp/insert-exp-ieee128.c | 58 ++++++++++++++++ > 6 files changed, 265 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c > create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c > create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c > > diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def > index 638d0bc72ca..3247a7f7673 100644 > --- a/gcc/config/rs6000/rs6000-builtins.def > +++ b/gcc/config/rs6000/rs6000-builtins.def > @@ -2876,6 +2876,15 @@ > pure vsc __builtin_vsx_xl_len_r (void *, signed long); > XL_LEN_R xl_len_r {} > > + vull __builtin_extractf128_exp (_Float128); > + EEXPKF extractf128_exp_kf {} > + > + vuq __builtin_extractf128_sig (_Float128); > + ESIGKF extractf128_sig_kf {} > + > + _Float128 __builtin_insertf128_exp (vuq, vull); > + IEXPKF_VULL insertf128_exp_kf {} > + Put them to be near its related ones like __builtin_vsx_scalar_extract_expq etc. > > ; Builtins requiring hardware support for IEEE-128 floating-point. > [ieee128-hw] > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md > index 7d845df5c2d..2a9f875ba57 100644 > --- a/gcc/config/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -369,7 +369,10 @@ > UNSPEC_XXSPLTI32DX > UNSPEC_XXBLEND > UNSPEC_XXPERMX > - ]) > + UNSPEC_EXTRACTEXPIEEE > + UNSPEC_EXTRACTSIGIEEE > + UNSPEC_INSERTEXPIEEE These are not necessary, just use the existing UNSPEC_VSX_SXEXPDP etc. > +]) > > (define_int_iterator XVCVBF16 [UNSPEC_VSX_XVCVSPBF16 > UNSPEC_VSX_XVCVBF16SPN]) > @@ -4155,6 +4158,38 @@ > "vinsrx %0,%1,%2" > [(set_attr "type" "vecsimple")]) > > +(define_expand "extractf128_exp_" > + [(set (match_operand:V2DI 0 "altivec_register_operand") > + (unspec:IEEE128 [(match_operand:IEEE128 1 "altivec_register_operand")] > + UNSPEC_EXTRACTEXPIEEE))] > +"TARGET_P9_VECTOR" > +{ > + emit_insn (gen_xsxexpqp_f128_ (operands[0], operands[1])); > + DONE; > +}) > + > +(define_expand "insertf128_exp_" > + [(set (match_operand:IEEE128 0 "altivec_register_operand") > + (unspec:IEEE128 [(match_operand:V1TI 1 "altivec_register_operand") > + (match_operand:V2DI 2 "altivec_register_operand")] > + UNSPEC_INSERTEXPIEEE))] > +"TARGET_P9_VECTOR" > +{ > + emit_insn (gen_xsiexpqpf_f128_ (operands[0], operands[1], > + operands[2])); > + DONE; > +}) > + > +(define_expand "extractf128_sig_" > + [(set (match_operand:V2DI 0 "altivec_register_operand") > + (unspec:IEEE128 [(match_operand:IEEE128 1 "altivec_register_operand")] > + UNSPEC_EXTRACTSIGIEEE))] > +"TARGET_P9_VECTOR" > +{ > + emit_insn (gen_xsxsigqp_f128_ (operands[0], operands[1])); > + DONE; > +}) These define_expands are not necessary, the called gen functions from the define_insns can be directly used as bif pattern in rs6000-builtins.def. > + > (define_expand "vreplace_elt_" > [(set (match_operand:REPLACE_ELT 0 "register_operand") > (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 "register_operand") > @@ -5016,6 +5051,15 @@ > "xsxexpqp %0,%1" > [(set_attr "type" "vecmove")]) > > +;; VSX Scalar to Vector Extract Exponent IEEE 128-bit floating point format > +(define_insn "xsxexpqp_f128_" > + [(set (match_operand:V2DI 0 "altivec_register_operand" "=v") > + (unspec:V2DI [(match_operand:IEEE128 1 "altivec_register_operand" "v")] > + UNSPEC_VSX_SXEXPDP))] > + "TARGET_P9_VECTOR" > + "xsxexpqp %0,%1" > + [(set_attr "type" "vecmove")]) I think you can just merge this into the existing xsxexpqp_ by extending with one mode like xsxexpqp__ for unspec:xxx. xxx can be one mode iterator for DI and V2DI. > + > ;; VSX Scalar Extract Exponent Double-Precision > (define_insn "xsxexpdp" > [(set (match_operand:DI 0 "register_operand" "=r") > @@ -5034,6 +5078,15 @@ > "xsxsigqp %0,%1" > [(set_attr "type" "vecmove")]) > > +;; VSX Scalar to Vector Extract Significand IEEE 128-bit floating point format > +(define_insn "xsxsigqp_f128_" > + [(set (match_operand:V2DI 0 "altivec_register_operand" "=v") > + (unspec:V2DI [(match_operand:IEEE128 1 "altivec_register_operand" "v")] > + UNSPEC_VSX_SXSIG))] > + "TARGET_P9_VECTOR" > + "xsxsigqp %0,%1" > + [(set_attr "type" "vecmove")]) Ditto, the mode V2DI looks unexpected, V1TI instead? > + > ;; VSX Scalar Extract Significand Double-Precision > (define_insn "xsxsigdp" > [(set (match_operand:DI 0 "register_operand" "=r") > @@ -5054,6 +5107,17 @@ > "xsiexpqp %0,%1,%2" > [(set_attr "type" "vecmove")]) > > +;; VSX Insert Exponent IEEE 128-bit Floating point format > +(define_insn "xsiexpqpf_f128_" > + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") > + (unspec:IEEE128 > + [(match_operand:V1TI 1 "altivec_register_operand" "v") > + (match_operand:V2DI 2 "altivec_register_operand" "v")] > + UNSPEC_VSX_SIEXPQP))] > + "TARGET_P9_VECTOR" > + "xsiexpqp %0,%1,%2" > + [(set_attr "type" "vecmove")]> + > ;; VSX Scalar Insert Exponent Quad-Precision > (define_insn "xsiexpqp_" > [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index e426a2eb7d8..456e5a03d69 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -18511,6 +18511,34 @@ the FPSCR. The instruction is a lower latency version of the @code{mffs} > instruction. If the @code{mffsl} instruction is not available, then the > builtin uses the older @code{mffs} instruction to read the FPSCR. > > +@smallexample > +@exdent vector unsigned long long int > +@exdent __builtin_extractf128_exp (__ieee128); > +@end smallexample > + > +The exponent from the IEEE 128-bit floating point value is returned in the > +lower bits of vector element 1 on Little Endian systems. > + > +@smallexample > +@exdent vector unsigned __int128 > +@exdent __builtin_extractf128_sig (__ieee128); > +@end smallexample > + > +The significand from the IEEE 128-bit floating point value is returned in the > +vector element 0 bits [111:0]. Bit 112 is set to 0 if the input value was zero > +or Denormal, 1 otherwise. Bits [127:113] are set to zero. > + > +@smallexample > +@exdent _ieee128 > +@exdent __builtin_insertf128_exp (vector unsigned __int128, > +vector unsigned long long int); > +@end smallexample > + > +The first argument contains the significand in bits [111:0] for the IEEE > +128-bi floating point tresult and the sign bit [127] for the result. The > +second argument contains the exponenet bits for the result in bits [14:0] of > +vector element 1. Just put the new prototypes after the existing ones and extend the current related documentation if needed. > + > @node Basic PowerPC Built-in Functions Available on ISA 3.1 > @subsubsection Basic PowerPC Built-in Functions Available on ISA 3.1 > > diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c b/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c > new file mode 100644 > index 00000000000..47e2c43962f > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c > @@ -0,0 +1,49 @@ > +/* { dg-do run { target { powerpc*-*-* } } } */> +/* { dg-require-effective-target power10_ok } */ It needs _hw for run and it doesn't require power10, so p9vector_hw. > +/* { dg-options "-mdejagnu-cpu=power9" } */ You can refer to the conditions for the existing test cases on ieee128 extract_{exp,sig}, insert_..., the underlying insns of these newly introduced are the same, the condition should be the same. BR, Kewen > + > +#include > +#include > + > +#if DEBUG > +#include > +#endif > + > +vector unsigned long long int > +get_exponents (__ieee128 *p) > +{ > + __ieee128 source = *p; > + > + return __builtin_extractf128_exp (source); > +} > + > +int > +main () > +{ > + vector unsigned long long int result, exp_result; > + union conv128_t > + { > + __ieee128 val_ieee128; > + __int128 val_int128; > + } source; > + > + exp_result[0] = 0x0ULL; > + exp_result[1] = 0x1234ULL; > + source.val_int128 = 0x923456789ABCDEF0ULL; > + source.val_int128 = (source.val_int128 << 64) | 0x123456789ABCDEFULL; > + > + result = get_exponents (&source.val_ieee128); > + > + if ((result[0] != exp_result[0]) || (result[1] != exp_result[1])) > +#if DEBUG > + { > + printf("result[0] = 0x%llx; exp_result[0] = 0x%llx\n", > + result[0], exp_result[0]); > + printf("result[1] = 0x%llx; exp_result[1] = 0x%llx\n", > + result[1], exp_result[1]); > + } > +#else > + abort(); > +#endif > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c b/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c > new file mode 100644 > index 00000000000..4bd50ca2281 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c > @@ -0,0 +1,56 @@ > +/* { dg-do run { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power9" } */ > + > +#include > +#include > + > +#if DEBUG > +#include > +#endif > + > +vector unsigned __int128 > +get_significand (__ieee128 *p) > +{ > + __ieee128 source = *p; > + > + return __builtin_extractf128_sig (source); > +} > + > +int > +main () > +{ > + #define NOT_ZERO_OR_DENORMAL 0x1000000000000 > + > + union conv128_t > + { > + __ieee128 val_ieee128; > + unsigned long long int val_ull[2]; > + unsigned __int128 val_uint128; > + __vector unsigned __int128 val_vuint128; > + } source, result, exp_result; > + > + /* Result is not zero or denormal. */ > + exp_result.val_ull[1] = 0x00056789ABCDEF0ULL | NOT_ZERO_OR_DENORMAL; > + exp_result.val_ull[0] = 0x123456789ABCDEFULL; > + source.val_uint128 = 0x923456789ABCDEF0ULL; > + source.val_uint128 = (source.val_uint128 << 64) | 0x123456789ABCDEFULL; > + > + /* Note, bits[0:14] are set to 0, bit[15] is 0 if the input was zero or > + Denormal, 1 otherwise. */ > + result.val_vuint128 = get_significand (&source.val_ieee128); > + > + if ((result.val_ull[0] != exp_result.val_ull[0]) > + || (result.val_ull[1] != exp_result.val_ull[1])) > +#if DEBUG > + { > + printf("result[0] = 0x%llx; exp_result[0] = 0x%llx\n", > + result.val_ull[0], exp_result.val_ull[0]); > + printf("result[1] = 0x%llx; exp_result[1] = 0x%llx\n", > + result.val_ull[1], exp_result.val_ull[1]); > + } > +#else > + abort(); > +#endif > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c b/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c > new file mode 100644 > index 00000000000..5bd33ed5b7d > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c > @@ -0,0 +1,58 @@ > +/* { dg-do run { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power9" } */ > + > +#include > +#include > + > +#ifdef DEBUG > +#include > +#endif > + > +__ieee128 > +insert_exponent (__vector unsigned __int128 *significand_p, > + __vector unsigned long long int *exponent_p) > +{ > + __vector unsigned __int128 significand = *significand_p; > + __vector unsigned long long int exponent = *exponent_p; > + > + return __builtin_insertf128_exp (significand, exponent); > +} > + > + > +int > +main () > +{ > + union conv128_t > + { > + __ieee128 val_ieee128; > + __vector unsigned __int128 val_vint128; > + __vector unsigned long long int val_vull; > + } result, exp_result, significand; > + > + __vector unsigned long long int exponent; > + > + significand.val_vull[0] = 0xFEDCBA9876543210ULL; > + significand.val_vull[1] = 0x7FFF12345678ABCDULL; /* positive value */ > + > + exponent[0] = 0x5678; > + exponent[1] = 0x1234; > + > + exp_result.val_vull[0] = 0xFEDCBA9876543210ULL; > + exp_result.val_vull[1] = 0x123412345678ABCDULL; > + > + result.val_ieee128 = insert_exponent(&significand.val_vint128, &exponent); > + > + if (result.val_ieee128 != exp_result.val_ieee128) > +#ifdef DEBUG > + { > + printf("result.val_vull[0] = 0x%llx, exp_result.val_vull[0] = 0x%llx\n", > + result.val_vull[0], exp_result.val_vull[0]); > + printf("result.val_vull[1] = 0x%llx, exp_result.val_vull[1] = 0x%llx\n", > + result.val_vull[1], exp_result.val_vull[1]); > + } > +#else > + abort (); > +#endif > + > +}