Re: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Carl Love <cel@us.ibm.com>
Cc: Peter Bergner <bergner@linux.ibm.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values
Date: Mon, 5 Jun 2023 16:45:03 +0800	[thread overview]
Message-ID: <bdefd37a-8911-7a53-49eb-f6c1f433f224@linux.ibm.com> (raw)
In-Reply-To: <d481896de5fbe840039ad944da9bdd1ae69a78cc.camel@us.ibm.com>

Hi Carl,

on 2023/5/2 23:52, Carl Love via Gcc-patches wrote:
> GCC maintainers:
> 
> The following patch adds three buitins for inserting and extracting the
> exponent and significand for an IEEE 128-bit floating point values. 
> The builtins are valid for Power 9 and Power 10.  

We already have:

unsigned long long int scalar_extract_exp (__ieee128 source);
unsigned __int128 scalar_extract_sig (__ieee128 source);
ieee_128 scalar_insert_exp (unsigned __int128 significand,
                            unsigned long long int exponent);
ieee_128 scalar_insert_exp (ieee_128 significand, unsigned long long int exponent);

you need to say something about the requirements or the justification for
adding more, for this patch itself, some comments are inline below, thanks!

> 
> The patch has been tested on both Power 9 and Power 10.
> 
> Please let me know if this patch is acceptable for mainline.  Thanks.
> 
>             Carl 
> 
> 
> ----------------------
> From a20cc81f98cce1140fc95775a7c25b55d1ca7cba Mon Sep 17 00:00:00 2001
> From: Carl Love <cel@us.ibm.com>
> Date: Wed, 12 Apr 2023 17:46:37 -0400
> Subject: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values
> 
> Add support for the following builtins:
> 
>  __vector unsigned long long int __builtin_extractf128_exp (__ieee128);

Could you make the name similar to the existing one?  The existing one
  
  unsigned long long int scalar_extract_exp (__ieee128 source);

has nothing like f128 on its name, this variant is just to change the
return type to vector type, how about scalar_extract_exp_to_vec?

>  __vector unsigned __int128 __builtin_extractf128_sig (__ieee128);

Ditto.

>  __ieee128 __builtin_insertf128_exp (__vector unsigned __int128,
> 			             __vector unsigned long long);

This one can just overload the existing scalar_insert_exp?

> 
> gcc/
> 	* config/rs6000/rs6000-buildin.def (__builtin_extractf128_exp,
> 	 __builtin_extractf128_sig, __builtin_insertf128_exp): Add new
> 	builtin definitions.
> 	* config/rs6000.md (extractf128_exp_<mode>, insertf128_exp_<mode>,
> 	extractf128_sig_<mode>): Add define_expand for new builtins.
> 	(xsxexpqp_f128_<mode>, xsxsigqp_f128_<mode>, siexpqpf_f128_<mode>):
> 	Add define_insn for new builtins.
> 	* doc/extend.texi (__builtin_extractf128_exp, __builtin_extractf128_sig,
> 	__builtin_insertf128_exp): Add documentation for new builtins.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/bfp/extract-exp-ieee128.c: New test case.
> 	* gcc.target/powerpc/bfp/extract-sig-ieee128.c: New test case.
> 	* gcc.target/powerpc/bfp/insert-exp-ieee128.c: New test case.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |  9 +++
>  gcc/config/rs6000/vsx.md                      | 66 ++++++++++++++++++-
>  gcc/doc/extend.texi                           | 28 ++++++++
>  .../powerpc/bfp/extract-exp-ieee128.c         | 49 ++++++++++++++
>  .../powerpc/bfp/extract-sig-ieee128.c         | 56 ++++++++++++++++
>  .../powerpc/bfp/insert-exp-ieee128.c          | 58 ++++++++++++++++
>  6 files changed, 265 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..3247a7f7673 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2876,6 +2876,15 @@
>    pure vsc __builtin_vsx_xl_len_r (void *, signed long);
>      XL_LEN_R xl_len_r {}
>  
> +  vull __builtin_extractf128_exp (_Float128);
> +    EEXPKF extractf128_exp_kf {}
> +
> +  vuq __builtin_extractf128_sig (_Float128);
> +    ESIGKF extractf128_sig_kf {}
> +
> +  _Float128 __builtin_insertf128_exp (vuq, vull);
> +    IEXPKF_VULL insertf128_exp_kf {}
> +

Put them to be near its related ones like __builtin_vsx_scalar_extract_expq
etc.

>  
>  ; Builtins requiring hardware support for IEEE-128 floating-point.
>  [ieee128-hw]
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 7d845df5c2d..2a9f875ba57 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -369,7 +369,10 @@
>     UNSPEC_XXSPLTI32DX
>     UNSPEC_XXBLEND
>     UNSPEC_XXPERMX
> -  ])
> +   UNSPEC_EXTRACTEXPIEEE
> +   UNSPEC_EXTRACTSIGIEEE
> +   UNSPEC_INSERTEXPIEEE

These are not necessary, just use the existing UNSPEC_VSX_SXEXPDP etc.

> +])
>  
>  (define_int_iterator XVCVBF16	[UNSPEC_VSX_XVCVSPBF16
>  				 UNSPEC_VSX_XVCVBF16SPN])
> @@ -4155,6 +4158,38 @@
>   "vins<wd>rx %0,%1,%2"
>   [(set_attr "type" "vecsimple")])
>  
> +(define_expand "extractf128_exp_<mode>"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand")
> +  (unspec:IEEE128 [(match_operand:IEEE128 1 "altivec_register_operand")]
> +		  UNSPEC_EXTRACTEXPIEEE))]
> +"TARGET_P9_VECTOR"
> +{
> +  emit_insn (gen_xsxexpqp_f128_<mode> (operands[0], operands[1]));
> +  DONE;
> +})
> +
> +(define_expand "insertf128_exp_<mode>"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand")
> +  (unspec:IEEE128 [(match_operand:V1TI 1 "altivec_register_operand")
> +		   (match_operand:V2DI 2 "altivec_register_operand")]
> +		  UNSPEC_INSERTEXPIEEE))]
> +"TARGET_P9_VECTOR"
> +{
> +  emit_insn (gen_xsiexpqpf_f128_<mode> (operands[0], operands[1],
> +				        operands[2]));
> +  DONE;
> +})
> +
> +(define_expand "extractf128_sig_<mode>"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand")
> +  (unspec:IEEE128 [(match_operand:IEEE128 1 "altivec_register_operand")]
> +		  UNSPEC_EXTRACTSIGIEEE))]
> +"TARGET_P9_VECTOR"
> +{
> +  emit_insn (gen_xsxsigqp_f128_<mode> (operands[0], operands[1]));
> +  DONE;
> +})

These define_expands are not necessary, the called gen functions from the
define_insns can be directly used as bif pattern in rs6000-builtins.def.

> +
>  (define_expand "vreplace_elt_<mode>"
>    [(set (match_operand:REPLACE_ELT 0 "register_operand")
>    (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 "register_operand")
> @@ -5016,6 +5051,15 @@
>    "xsxexpqp %0,%1"
>    [(set_attr "type" "vecmove")])
>  
> +;; VSX Scalar to Vector Extract Exponent IEEE 128-bit floating point format
> +(define_insn "xsxexpqp_f128_<mode>"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> +	(unspec:V2DI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> +	 UNSPEC_VSX_SXEXPDP))]
> +  "TARGET_P9_VECTOR"
> +  "xsxexpqp %0,%1"
> +  [(set_attr "type" "vecmove")])

I think you can just merge this into the existing xsxexpqp_<mode> by extending
with one mode like xsxexpqp_<IEEE128:mode>_<xxx:mode> for unspec:xxx.

xxx can be one mode iterator for DI and V2DI.

> +
>  ;; VSX Scalar Extract Exponent Double-Precision
>  (define_insn "xsxexpdp"
>    [(set (match_operand:DI 0 "register_operand" "=r")
> @@ -5034,6 +5078,15 @@
>    "xsxsigqp %0,%1"
>    [(set_attr "type" "vecmove")])
>  
> +;; VSX Scalar to Vector Extract Significand IEEE 128-bit floating point format
> +(define_insn "xsxsigqp_f128_<mode>"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> +	(unspec:V2DI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> +	 UNSPEC_VSX_SXSIG))]
> +  "TARGET_P9_VECTOR"
> +  "xsxsigqp %0,%1"
> +  [(set_attr "type" "vecmove")])

Ditto, the mode V2DI looks unexpected, V1TI instead?

> +
>  ;; VSX Scalar Extract Significand Double-Precision
>  (define_insn "xsxsigdp"
>    [(set (match_operand:DI 0 "register_operand" "=r")
> @@ -5054,6 +5107,17 @@
>    "xsiexpqp %0,%1,%2"
>    [(set_attr "type" "vecmove")])
>  
> +;; VSX Insert Exponent IEEE 128-bit Floating point format
> +(define_insn "xsiexpqpf_f128_<mode>"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(unspec:IEEE128
> +	 [(match_operand:V1TI 1 "altivec_register_operand" "v")
> +	  (match_operand:V2DI 2 "altivec_register_operand" "v")]
> +	 UNSPEC_VSX_SIEXPQP))]
> +  "TARGET_P9_VECTOR"
> +  "xsiexpqp %0,%1,%2"
> +  [(set_attr "type" "vecmove")]> +
>  ;; VSX Scalar Insert Exponent Quad-Precision
>  (define_insn "xsiexpqp_<mode>"
>    [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e426a2eb7d8..456e5a03d69 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18511,6 +18511,34 @@ the FPSCR.  The instruction is a lower latency version of the @code{mffs}
>  instruction.  If the @code{mffsl} instruction is not available, then the
>  builtin uses the older @code{mffs} instruction to read the FPSCR.
>  
> +@smallexample
> +@exdent vector unsigned long long int
> +@exdent __builtin_extractf128_exp (__ieee128);
> +@end smallexample
> +
> +The exponent from the IEEE 128-bit floating point value is returned in the
> +lower bits of vector element 1 on Little Endian systems.
> +
> +@smallexample
> +@exdent vector unsigned __int128
> +@exdent __builtin_extractf128_sig (__ieee128);
> +@end smallexample
> +
> +The significand from the IEEE 128-bit floating point value is returned in the
> +vector element 0 bits [111:0]. Bit 112 is set to 0 if the input value was zero
> +or Denormal, 1 otherwise.  Bits [127:113] are set to zero.
> +
> +@smallexample
> +@exdent _ieee128
> +@exdent __builtin_insertf128_exp (vector unsigned __int128,
> +vector unsigned long long int);
> +@end smallexample
> +
> +The first argument contains the significand in bits [111:0] for the IEEE
> +128-bi floating point tresult and the sign bit [127] for the result.  The
> +second argument contains the exponenet bits for the result in bits [14:0] of
> +vector element 1.

Just put the new prototypes after the existing ones and extend the current related
documentation if needed.

> +
>  @node Basic PowerPC Built-in Functions Available on ISA 3.1
>  @subsubsection Basic PowerPC Built-in Functions Available on ISA 3.1
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c b/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c
> new file mode 100644
> index 00000000000..47e2c43962f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c
> @@ -0,0 +1,49 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */> +/* { dg-require-effective-target power10_ok } */

It needs _hw for run and it doesn't require power10, so p9vector_hw.

> +/* { dg-options "-mdejagnu-cpu=power9" } */

You can refer to the conditions for the existing test cases on ieee128
extract_{exp,sig}, insert_..., the underlying insns of these newly
introduced are the same, the condition should be the same.

BR,
Kewen

> +
> +#include <altivec.h>
> +#include <stdlib.h>
> +
> +#if DEBUG
> +#include <stdio.h>
> +#endif
> +
> +vector unsigned long long int
> +get_exponents (__ieee128 *p)
> +{
> +  __ieee128 source = *p;
> +
> +  return __builtin_extractf128_exp (source);
> +}
> +
> +int
> +main ()
> +{
> +  vector unsigned long long int result, exp_result;
> +  union conv128_t
> +   {
> +     __ieee128 val_ieee128;
> +     __int128  val_int128;
> +  } source;
> +  
> +  exp_result[0] = 0x0ULL;
> +  exp_result[1] = 0x1234ULL;
> +  source.val_int128 = 0x923456789ABCDEF0ULL;
> +  source.val_int128 = (source.val_int128 << 64) | 0x123456789ABCDEFULL;
> +
> +  result = get_exponents (&source.val_ieee128);
> +
> +  if ((result[0] != exp_result[0]) || (result[1] != exp_result[1]))
> +#if DEBUG
> +    {
> +      printf("result[0] = 0x%llx; exp_result[0] = 0x%llx\n",
> +	     result[0], exp_result[0]);
> +      printf("result[1] = 0x%llx; exp_result[1] = 0x%llx\n",
> +	     result[1], exp_result[1]);
> +    }
> +#else
> +    abort();
> +#endif
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c b/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c
> new file mode 100644
> index 00000000000..4bd50ca2281
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c
> @@ -0,0 +1,56 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power9" } */
> +
> +#include <altivec.h>
> +#include <stdlib.h>
> +
> +#if DEBUG
> +#include <stdio.h>
> +#endif
> +
> +vector unsigned __int128
> +get_significand (__ieee128 *p)
> +{
> +  __ieee128 source = *p;
> +
> +  return __builtin_extractf128_sig (source);
> +}
> +
> +int
> +main ()
> +{
> +  #define NOT_ZERO_OR_DENORMAL  0x1000000000000
> +
> +  union conv128_t
> +   {
> +     __ieee128 val_ieee128;
> +     unsigned long long int val_ull[2];
> +     unsigned __int128  val_uint128;
> +     __vector unsigned __int128  val_vuint128;
> +  } source, result, exp_result;
> +  
> +  /* Result is not zero or denormal.  */
> +  exp_result.val_ull[1] = 0x00056789ABCDEF0ULL | NOT_ZERO_OR_DENORMAL;
> +  exp_result.val_ull[0] = 0x123456789ABCDEFULL;
> +  source.val_uint128 = 0x923456789ABCDEF0ULL;
> +  source.val_uint128 = (source.val_uint128 << 64) | 0x123456789ABCDEFULL;
> +
> +  /* Note, bits[0:14] are set to 0, bit[15] is 0 if the input was zero or
> +     Denormal, 1 otherwise.  */
> +  result.val_vuint128 = get_significand (&source.val_ieee128);
> +
> +  if ((result.val_ull[0] != exp_result.val_ull[0])
> +      || (result.val_ull[1] != exp_result.val_ull[1]))
> +#if DEBUG
> +    {
> +      printf("result[0] = 0x%llx; exp_result[0] = 0x%llx\n",
> +	     result.val_ull[0], exp_result.val_ull[0]);
> +      printf("result[1] = 0x%llx; exp_result[1] = 0x%llx\n",
> +	     result.val_ull[1], exp_result.val_ull[1]);
> +    }
> +#else
> +    abort();
> +#endif
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c b/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c
> new file mode 100644
> index 00000000000..5bd33ed5b7d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c
> @@ -0,0 +1,58 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power9" } */
> +
> +#include <altivec.h>
> +#include <stdlib.h>
> +
> +#ifdef DEBUG
> +#include <stdio.h>
> +#endif
> +
> +__ieee128
> +insert_exponent (__vector unsigned __int128 *significand_p,
> +		 __vector unsigned long long int *exponent_p)
> +{
> +  __vector unsigned __int128 significand = *significand_p;
> +  __vector unsigned long long int exponent = *exponent_p;
> +
> +  return __builtin_insertf128_exp (significand, exponent);
> +}
> +
> +
> +int
> +main ()
> +{
> +  union conv128_t
> +   {
> +     __ieee128 val_ieee128;
> +     __vector unsigned __int128 val_vint128;
> +     __vector unsigned long long int  val_vull;
> +  } result, exp_result, significand;
> +
> +  __vector unsigned long long int exponent;
> +
> +  significand.val_vull[0] = 0xFEDCBA9876543210ULL;
> +  significand.val_vull[1] = 0x7FFF12345678ABCDULL;  /* positive value */
> +
> +  exponent[0] = 0x5678;
> +  exponent[1] = 0x1234;
> +
> +  exp_result.val_vull[0] = 0xFEDCBA9876543210ULL;
> +  exp_result.val_vull[1] = 0x123412345678ABCDULL;
> +
> +  result.val_ieee128 = insert_exponent(&significand.val_vint128, &exponent);
> +  
> +  if (result.val_ieee128 != exp_result.val_ieee128)
> +#ifdef DEBUG
> +    {
> +      printf("result.val_vull[0] = 0x%llx, exp_result.val_vull[0] = 0x%llx\n",
> +	     result.val_vull[0], exp_result.val_vull[0]);
> +      printf("result.val_vull[1] = 0x%llx, exp_result.val_vull[1] = 0x%llx\n",
> +	     result.val_vull[1], exp_result.val_vull[1]);
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +}

next prev parent reply	other threads:[~2023-06-05  8:45 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-02 15:52 Carl Love
2023-06-05  8:45 ` Kewen.Lin [this message]
2023-06-06 19:54   ` [PATCH ver 2] " Carl Love
2023-06-06 19:54   ` [PATCH] " Carl Love
2023-06-07  9:36     ` Kewen.Lin
2023-06-08 15:21       ` Carl Love
2023-06-08 15:21       ` [PATCH ver 3] " Carl Love
2023-06-13  3:10         ` Kewen.Lin
2023-06-14 16:17           ` Carl Love

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bdefd37a-8911-7a53-49eb-f6c1f433f224@linux.ibm.com \
    --to=linkw@linux.ibm.com \
    --cc=bergner@linux.ibm.com \
    --cc=cel@us.ibm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).