public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH ver 4] rs6000: Add builtins for IEEE 128-bit floating point values
@ 2023-06-14 20:37 Carl Love
  2023-06-15  6:23 ` Kewen.Lin
  0 siblings, 1 reply; 3+ messages in thread
From: Carl Love @ 2023-06-14 20:37 UTC (permalink / raw)
  To: gcc-patches; +Cc: Kewen.Lin, Segher Boessenkool, Peter Bergner, dje.gcc, cel

Kewen, GCC maintainers:

Version 4, added missing cases for new xxexpqp, xsxexpdp and xsxsigqp
cases to rs6000_expand_builtin.  Merged the new define_insn definitions
with the existing definitions.  Renamed the builtins by removing the
__builtin_ prefix from the names.  Fixed the documentation for the
builtins.  Updated the test files to check the desired instructions
were generated.  Retested patch on Power 10 with no regressions.

Version 3, was able to get the overloaded version of scalar_insert_exp
to work and the change to xsxexpqp_f128_<mode> define instruction to
work with the suggestions from Kewen.  

Version 2, I have addressed the various comments from Kewen.  I had
issues with adding an additional overloaded version of
scalar_insert_exp with vector arguments.  The overload infrastructure
didn't work with a mix of scalar and vector arguments.  I did rename
the __builtin_insertf128_exp to __builtin_vsx_scalar_insert_exp_qp make
it similar to the existing builtin.  I also wasn't able to get the
suggested merge of xsxexpqp_f128_<mode> with xsxexpqp_<mode> to work so
I left the two simpler definitiions.

The patch add three new builtins to extract the significand and
exponent of an IEEE float 128-bit value where the builtin argument is a
vector.  Additionally, a builtin to insert the exponent into an IEEE
float 128-bit vector argument is added.  These builtins were requested
since there is no clean and optimal way to transfer between a vector
and a scalar IEEE 128 bit value.

The patch has been tested on Power 10 with no regressions.  Please let
me know if the patch is acceptable or not.  Thanks.

               Carl


----------------------------------------
rs6000: Add builtins for IEEE 128-bit floating point values

Add support for the following builtins:

 __vector unsigned long long int scalar_extract_exp_to_vec (__ieee128);
 __vector unsigned __int128 scalar_extract_sig_to_vec (__ieee128);
 __ieee128 scalar_insert_exp (__vector unsigned __int128,
 			      __vector unsigned long long);

These builtins were requesed since there is no clean and performant way to
transfer a value from a vector type and scalar type, despite the fact
that they both reside in vector registers.

gcc/
	* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin):
	Rename CCDE_FOR_xsxexpqp_tf to CODE_FOR_xsxexpqp_tf_di.
	Rename CODE_FOR_xsxexpqp_kf to CODE_FOR_xsxexpqp_kf_di.
	(CODE_FOR_xsxexpqp_kf_v2di, CODE_FOR_xsxsigqp_kf_v1ti,
	CODE_FOR_xsiexpqp_kf_v2di	): Add case statements.
	* config/rs6000/rs6000-buildin.def (__builtin_extractf128_exp,
	 __builtin_extractf128_sig, __builtin_insertf128_exp): Add new
	builtin definitions.
	Rename xsxexpqp_kf, xsxsigqp_kf, xxsiexpqp_kf to xsexpqp_kf_di,
	xsxsigqp_kf_ti, xsiexpqp_kf_di respectively.
	* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
	Add else if for MODE_VECTOR_INT. Update comments.
	* config/rs6000/rs6000-overload.def
	(__builtin_vec_scalar_insert_exp): Add new overload definition with
	vector arguments.
	(scalar_extract_exp_to_vec, scalar_extract_sig_to_vec): New
	odverloaded definitions.
	* config/vsx.md (VSEEQP_DI, VSESQP_TI): New mode iterators.
	(VSEEQP_DI_base): New mode attribute definition.
	Rename xsxexpqp_<mode> to sxexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>.
	Rename xsxsigqp_<mode> to xsxsigqp_<IEEE128:mode>_<VSESQP_TI:mode>.
	Rename xsiexpqp_<mode> to xsiexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>.
	(xsxsigqp_f128_<mode>, xsiexpqpf_f128_<mode>): Add define_insn for
	new builtins.
	* doc/extend.texi (__builtin_extractf128_exp,
	__builtin_extractf128_sig): Add documentation for new builtins.
	(scalar_insert_exp): Add new overloaded builtin definition.

gcc/testsuite/
	* gcc.target/powerpc/bfp/extract-exp-1.c: New test case.
	* gcc.target/powerpc/bfp/extract-sig-1.c: New test case.
	* gcc.target/powerpc/bfp/insert-exp-1.c: New test case.
---
 gcc/config/rs6000/rs6000-builtin.cc           | 21 +++--
 gcc/config/rs6000/rs6000-builtins.def         | 15 ++-
 gcc/config/rs6000/rs6000-c.cc                 | 10 +-
 gcc/config/rs6000/rs6000-overload.def         | 10 ++
 gcc/config/rs6000/vsx.md                      | 26 +++--
 gcc/doc/extend.texi                           | 21 ++++-
 .../gcc.target/powerpc/bfp/extract-exp-1.c    | 53 +++++++++++
 .../gcc.target/powerpc/bfp/extract-sig-1.c    | 60 ++++++++++++
 .../gcc.target/powerpc/bfp/insert-exp-1.c     | 94 +++++++++++++++++++
 9 files changed, 284 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-1.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 534698e7d3e..a8f291c6a72 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -3326,17 +3326,26 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
       case CODE_FOR_fmakf4_odd:
 	icode = CODE_FOR_fmatf4_odd;
 	break;
-      case CODE_FOR_xsxexpqp_kf:
-	icode = CODE_FOR_xsxexpqp_tf;
+      case CODE_FOR_xsxexpqp_kf_di:
+	icode = CODE_FOR_xsxexpqp_tf_di;
 	break;
-      case CODE_FOR_xsxsigqp_kf:
-	icode = CODE_FOR_xsxsigqp_tf;
+      case CODE_FOR_xsxexpqp_kf_v2di:
+	icode = CODE_FOR_xsxexpqp_tf_v2di;
+	break;
+      case CODE_FOR_xsxsigqp_kf_ti:
+	icode = CODE_FOR_xsxsigqp_tf_ti;
+	break;
+      case CODE_FOR_xsxsigqp_kf_v1ti:
+	icode = CODE_FOR_xsxsigqp_tf_v1ti;
 	break;
       case CODE_FOR_xststdcnegqp_kf:
 	icode = CODE_FOR_xststdcnegqp_tf;
 	break;
-      case CODE_FOR_xsiexpqp_kf:
-	icode = CODE_FOR_xsiexpqp_tf;
+      case CODE_FOR_xsiexpqp_kf_di:
+	icode = CODE_FOR_xsiexpqp_tf_di;
+	break;
+      case CODE_FOR_xsiexpqp_kf_v2di:
+	icode = CODE_FOR_xsiexpqp_tf_v2di;
 	break;
       case CODE_FOR_xsiexpqpf_kf:
 	icode = CODE_FOR_xsiexpqpf_tf;
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 638d0bc72ca..6623cb8195d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2901,20 +2901,29 @@
   fpmath double __builtin_truncf128_round_to_odd (_Float128);
     TRUNCF128_ODD trunckfdf2_odd {}
 
+  vull __builtin_scalar_extract_exp_to_vec (_Float128);
+    EEXPKF xsxexpqp_kf_v2di {}
+
+  vuq __builtin_scalar_extract_sig_to_vec (_Float128);
+    ESIGKF xsxsigqp_kf_v1ti {}
+
   const signed long long __builtin_vsx_scalar_extract_expq (_Float128);
-    VSEEQP xsxexpqp_kf {}
+    VSEEQP xsxexpqp_kf_di {}
 
   const signed __int128 __builtin_vsx_scalar_extract_sigq (_Float128);
-    VSESQP xsxsigqp_kf {}
+    VSESQP xsxsigqp_kf_ti {}
 
   const _Float128 __builtin_vsx_scalar_insert_exp_q (unsigned __int128, \
                                                      unsigned long long);
-    VSIEQP xsiexpqp_kf {}
+    VSIEQP xsiexpqp_kf_di {}
 
   const _Float128 __builtin_vsx_scalar_insert_exp_qp (_Float128, \
                                                       unsigned long long);
     VSIEQPF xsiexpqpf_kf {}
 
+  const _Float128 __builtin_vsx_scalar_insert_exp_vqp (vuq, vull);
+    VSIEQPV xsiexpqp_kf_v2di {}
+
   const signed int __builtin_vsx_scalar_test_data_class_qp (_Float128, \
                                                             const int<7>);
     VSTDCQP xststdcqp_kf {}
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 8555174d36e..11060f697db 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -1929,11 +1929,15 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
 	   128-bit variant of built-in function.  */
 	if (GET_MODE_PRECISION (arg1_mode) > 64)
 	  {
-	    /* If first argument is of float variety, choose variant
-	       that expects __ieee128 argument.  Otherwise, expect
-	       __int128 argument.  */
+	    /* If first argument is of float variety, choose the variant that
+	       expects __ieee128 argument.  If the first argument is vector
+	       int, choose the variant that expects vector unsigned
+	       __int128 argument.  Otherwise, expect scalar __int128 argument.
+	    */
 	    if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
 	      instance_code = RS6000_BIF_VSIEQPF;
+	    else if (GET_MODE_CLASS (arg1_mode) == MODE_VECTOR_INT)
+	      instance_code = RS6000_BIF_VSIEQPV;
 	    else
 	      instance_code = RS6000_BIF_VSIEQP;
 	  }
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index c582490c084..1c503a4b3d3 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4515,6 +4515,16 @@
     VSIEQP
   _Float128 __builtin_vec_scalar_insert_exp (_Float128, unsigned long long);
     VSIEQPF
+  _Float128 __builtin_vec_scalar_insert_exp (vuq, vull);
+    VSIEQPV
+
+[VEC_VSEEV, scalar_extract_exp_to_vec, __builtin_scalar_extract_exp_to_vector]
+  vull __builtin_scalar_extract_exp_to_vector (_Float128);
+    EEXPKF
+
+[VEC_VSESV, scalar_extract_sig_to_vec, __builtin_scalar_extract_sig_to_vector]
+  vuq __builtin_scalar_extract_sig_to_vector (_Float128);
+    ESIGKF
 
 [VEC_VSTDC, scalar_test_data_class, __builtin_vec_scalar_test_data_class]
   unsigned int __builtin_vec_scalar_test_data_class (float, const int);
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 7d845df5c2d..bb19b5ef170 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -396,6 +396,10 @@
 			   V4SF
 			   V2DF
 			   V2DI])
+(define_mode_iterator VSEEQP_DI  [V2DI DI])
+(define_mode_iterator VSESQP_TI  [V1TI TI])
+(define_mode_attr VSEEQP_DI_base [(V2DI "V1TI")
+				  (DI   "TI")])
 
 (define_mode_attr VM3_char [(V2DI "d")
 			   (V4SI "w")
@@ -5008,9 +5012,10 @@
 ;; ISA 3.0 Binary Floating-Point Support
 
 ;; VSX Scalar Extract Exponent Quad-Precision
-(define_insn "xsxexpqp_<mode>"
-  [(set (match_operand:DI 0 "altivec_register_operand" "=v")
-	(unspec:DI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
+(define_insn "xsxexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>"
+  [(set (match_operand:VSEEQP_DI 0 "altivec_register_operand" "=v")
+	(unspec:VSEEQP_DI
+	  [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
 	 UNSPEC_VSX_SXEXPDP))]
   "TARGET_P9_VECTOR"
   "xsxexpqp %0,%1"
@@ -5026,9 +5031,10 @@
   [(set_attr "type" "integer")])
 
 ;; VSX Scalar Extract Significand Quad-Precision
-(define_insn "xsxsigqp_<mode>"
-  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
-	(unspec:TI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
+(define_insn "xsxsigqp_<IEEE128:mode>_<VSESQP_TI:mode>"
+  [(set (match_operand:VSESQP_TI 0 "altivec_register_operand" "=v")
+	(unspec:VSESQP_TI [(match_operand:IEEE128 1
+			    "altivec_register_operand" "v")]
 	 UNSPEC_VSX_SXSIG))]
   "TARGET_P9_VECTOR"
   "xsxsigqp %0,%1"
@@ -5055,10 +5061,12 @@
   [(set_attr "type" "vecmove")])
 
 ;; VSX Scalar Insert Exponent Quad-Precision
-(define_insn "xsiexpqp_<mode>"
+(define_insn "xsiexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
-	(unspec:IEEE128 [(match_operand:TI 1 "altivec_register_operand" "v")
-			 (match_operand:DI 2 "altivec_register_operand" "v")]
+	(unspec:IEEE128 [(match_operand:<VSEEQP_DI_base> 1
+			  "altivec_register_operand" "v")
+			 (match_operand:VSEEQP_DI 2
+			  "altivec_register_operand" "v")]
 	 UNSPEC_VSX_SIEXPQP))]
   "TARGET_P9_VECTOR"
   "xsiexpqp %0,%1,%2"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e426a2eb7d8..6da7ae9d94c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -19724,6 +19724,10 @@ double scalar_insert_exp (double significand, unsigned long long int exponent);
 ieee_128 scalar_insert_exp (unsigned __int128 significand,
                             unsigned long long int exponent);
 ieee_128 scalar_insert_exp (ieee_128 significand, unsigned long long int exponent);
+vector ieee_128 scalar_insert_exp (vector unsigned __int128,
+                                   vector unsigned long long);
+vector unsigned long long scalar_extract_exp_to_vec (ieee_128);
+vector unsigned __int128  scalar_extract_sig_to_vec (ieee_128);
 
 int scalar_cmp_exp_gt (double arg1, double arg2);
 int scalar_cmp_exp_lt (double arg1, double arg2);
@@ -19771,11 +19775,18 @@ of the result are composed of the least significant 11 bits of the
 
 When supplied with a 128-bit first argument, the
 @code{scalar_insert_exp} built-in function returns a quad-precision
-ieee floating point value.  The sign bit of the result is copied from
-the most significant bit of the @code{significand} argument.
-The significand and exponent components of the result are composed of
-the least significant 15 bits of the @code{exponent} argument and the
-least significant 112 bits of the @code{significand} argument respectively.
+ieee floating point value if the two arguments were scalar.  If the two
+arguments are vectors, the retun value is a vector ieee floating point value.
+The sign bit of the result is copied from the most significant bit of the
+@code{significand} argument.  The significand and exponent components of the
+result are composed of the least significant 15 bits of the @code{exponent}
+argument and the least significant 112 bits of the @code{significand} argument
+respectively.
+
+The @code{scalar_extract_exp_to_vec},
+and @code{scalar_extract_sig_to_vec} are similar to
+@code{scalar_extract_exp}, @code{scalar_extract_sig} except they return
+a vector result of type unsigned long long and unsigned __int128 respectively.
 
 The @code{scalar_cmp_exp_gt}, @code{scalar_cmp_exp_lt},
 @code{scalar_cmp_exp_eq}, and @code{scalar_cmp_exp_unordered} built-in
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-1.c b/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-1.c
new file mode 100644
index 00000000000..9ebf7902ea9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-1.c
@@ -0,0 +1,53 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p9vector_hw } */
+/* { dg-options "-mdejagnu-cpu=power9 -save-temps" } */
+
+#include <altivec.h>
+#include <stdlib.h>
+
+#if DEBUG
+#include <stdio.h>
+#endif
+
+vector unsigned long long int
+get_exponents (__ieee128 *p)
+{
+  __ieee128 source = *p;
+
+  return scalar_extract_exp_to_vec (source);
+}
+
+int
+main ()
+{
+  vector unsigned long long int result, exp_result;
+  union conv128_t
+   {
+     __ieee128 val_ieee128;
+     __int128  val_int128;
+  } source;
+  
+  exp_result[0] = 0x0ULL;
+  exp_result[1] = 0x1234ULL;
+  source.val_int128 = 0x923456789ABCDEF0ULL;
+  source.val_int128 = (source.val_int128 << 64) | 0x123456789ABCDEFULL;
+
+  result = get_exponents (&source.val_ieee128);
+
+  if ((result[0] != exp_result[0]) || (result[1] != exp_result[1]))
+#if DEBUG
+    {
+      printf("result[0] = 0x%llx; exp_result[0] = 0x%llx\n",
+	     result[0], exp_result[0]);
+      printf("result[1] = 0x%llx; exp_result[1] = 0x%llx\n",
+	     result[1], exp_result[1]);
+    }
+#else
+    abort();
+#endif
+  return 0;
+}
+
+/* check that the expected extract exponent instruction is generated. */
+/* { dg-final { scan-assembler-times {\mxsxexpqp\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-1.c b/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-1.c
new file mode 100644
index 00000000000..ffcab1b5ac8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-1.c
@@ -0,0 +1,60 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p9vector_hw } */
+/* { dg-options "-mdejagnu-cpu=power9 -save-temps" } */
+
+#include <altivec.h>
+#include <stdlib.h>
+
+#if DEBUG
+#include <stdio.h>
+#endif
+
+vector unsigned __int128
+get_significand (__ieee128 *p)
+{
+  __ieee128 source = *p;
+
+  return scalar_extract_sig_to_vec(source);
+}
+
+int
+main ()
+{
+  #define NOT_ZERO_OR_DENORMAL  0x1000000000000
+
+  union conv128_t
+   {
+     __ieee128 val_ieee128;
+     unsigned long long int val_ull[2];
+     unsigned __int128  val_uint128;
+     vector unsigned __int128  val_vuint128;
+  } source, result, exp_result;
+  
+  /* Result is not zero or denormal.  */
+  exp_result.val_ull[1] = 0x00056789ABCDEF0ULL | NOT_ZERO_OR_DENORMAL;
+  exp_result.val_ull[0] = 0x123456789ABCDEFULL;
+  source.val_uint128 = 0x923456789ABCDEF0ULL;
+  source.val_uint128 = (source.val_uint128 << 64) | 0x123456789ABCDEFULL;
+
+  /* Note, bits[0:14] are set to 0, bit[15] is 0 if the input was zero or
+     Denormal, 1 otherwise.  */
+  result.val_vuint128 = get_significand (&source.val_ieee128);
+
+  if ((result.val_ull[0] != exp_result.val_ull[0])
+      || (result.val_ull[1] != exp_result.val_ull[1]))
+#if DEBUG
+    {
+      printf("result[0] = 0x%llx; exp_result[0] = 0x%llx\n",
+	     result.val_ull[0], exp_result.val_ull[0]);
+      printf("result[1] = 0x%llx; exp_result[1] = 0x%llx\n",
+	     result.val_ull[1], exp_result.val_ull[1]);
+    }
+#else
+    abort();
+#endif
+  return 0;
+}
+
+/* check that the expected extract significand instruction is generated. */
+/* { dg-final { scan-assembler-times {\mxsxsigqp\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-1.c b/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-1.c
new file mode 100644
index 00000000000..66d06028c60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-1.c
@@ -0,0 +1,94 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p9vector_hw } */
+/* { dg-options "-mdejagnu-cpu=power9 -save-temps" } */
+
+#include <altivec.h>
+#include <stdlib.h>
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+__ieee128
+insert_exponent (vector unsigned __int128 *significand_p,
+		 vector unsigned long long int *exponent_p)
+{
+  vector unsigned __int128 significand = *significand_p;
+  vector unsigned long long int exponent = *exponent_p;
+
+  return scalar_insert_exp (significand, exponent);
+}
+
+__ieee128
+insert_exponent2 (unsigned __int128 significand,
+		 unsigned long long int exponent)
+{
+   return scalar_insert_exp (significand, exponent);
+}
+
+int
+main ()
+{
+  __ieee128 val_ieee128, result_ieee128, exp_result_ieee128;
+  unsigned __int128 val_int128;
+  unsigned long long int  val_ull;
+  union conv128_t
+   {
+     __ieee128 val_ieee128;
+     vector unsigned __int128 val_vint128;
+     vector unsigned long long int  val_vull;
+  } result, exp_result, significand;
+
+  vector unsigned long long int exponent;
+
+  /* Scalar argument test */
+  val_ieee128 = 0xFEDCBA9876543210ULL;
+  val_ull = 0x5678;
+  exp_result.val_vull[1] = 0x5678000000000000ULL;
+  exp_result.val_vull[0] = 0xfedcba9876543210;  
+  result_ieee128 = insert_exponent2 (val_ieee128, val_ull);
+
+  if (result_ieee128 != exp_result.val_ieee128)
+#ifdef DEBUG
+    {
+      result.val_ieee128 = result_ieee128;
+      printf("Scalar argument ERROR:\n");
+      printf(" val_ieee128 = 0x%llx %llx\n",
+	     result.val_vull[1], result.val_vull[0]);
+      printf(" val_ieee128 = 0x%llx %llx\n",
+	     exp_result.val_vull[1], exp_result.val_vull[0]);
+    }
+#else
+    abort ();
+#endif
+
+  /* Vector argument test */
+  significand.val_vull[0] = 0xFEDCBA9876543210ULL;
+  significand.val_vull[1] = 0x7FFF12345678ABCDULL;  /* positive value */
+
+  exponent[0] = 0x5678;
+  exponent[1] = 0x1234;
+
+  exp_result.val_vull[0] = 0xFEDCBA9876543210ULL;
+  exp_result.val_vull[1] = 0x123412345678ABCDULL;
+
+  result.val_ieee128 = insert_exponent(&significand.val_vint128, &exponent);
+  
+  if (result.val_ieee128 != exp_result.val_ieee128)
+#ifdef DEBUG
+    {
+      printf("Vector argument ERROR:\n");
+      printf(" result = 0x%llx %llx\n",
+	     result.val_vull[1], result.val_vull[0]);
+      printf(" exp_result = 0x%llx %llx\n",
+	     exp_result.val_vull[1], exp_result.val_vull[0]);
+    }
+#else
+    abort ();
+#endif
+
+}
+
+/* check that the expected insert exponent instruction is generated. */
+/* { dg-final { scan-assembler-times {\mxsiexpqp\M} 2 } } */
-- 
2.37.2



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH ver 4] rs6000: Add builtins for IEEE 128-bit floating point values
  2023-06-14 20:37 [PATCH ver 4] rs6000: Add builtins for IEEE 128-bit floating point values Carl Love
@ 2023-06-15  6:23 ` Kewen.Lin
  2023-06-16 17:57   ` Carl Love
  0 siblings, 1 reply; 3+ messages in thread
From: Kewen.Lin @ 2023-06-15  6:23 UTC (permalink / raw)
  To: Carl Love; +Cc: Segher Boessenkool, Peter Bergner, dje.gcc, gcc-patches

Hi Carl,

on 2023/6/15 04:37, Carl Love wrote:
> Kewen, GCC maintainers:
> 
> Version 4, added missing cases for new xxexpqp, xsxexpdp and xsxsigqp
> cases to rs6000_expand_builtin.  Merged the new define_insn definitions
> with the existing definitions.  Renamed the builtins by removing the
> __builtin_ prefix from the names.  Fixed the documentation for the
> builtins.  Updated the test files to check the desired instructions
> were generated.  Retested patch on Power 10 with no regressions.
> 
> Version 3, was able to get the overloaded version of scalar_insert_exp
> to work and the change to xsxexpqp_f128_<mode> define instruction to
> work with the suggestions from Kewen.  
> 
> Version 2, I have addressed the various comments from Kewen.  I had
> issues with adding an additional overloaded version of
> scalar_insert_exp with vector arguments.  The overload infrastructure
> didn't work with a mix of scalar and vector arguments.  I did rename
> the __builtin_insertf128_exp to __builtin_vsx_scalar_insert_exp_qp make
> it similar to the existing builtin.  I also wasn't able to get the
> suggested merge of xsxexpqp_f128_<mode> with xsxexpqp_<mode> to work so
> I left the two simpler definitiions.
> 
> The patch add three new builtins to extract the significand and
> exponent of an IEEE float 128-bit value where the builtin argument is a
> vector.  Additionally, a builtin to insert the exponent into an IEEE
> float 128-bit vector argument is added.  These builtins were requested
> since there is no clean and optimal way to transfer between a vector
> and a scalar IEEE 128 bit value.
> 
> The patch has been tested on Power 10 with no regressions.  Please let
> me know if the patch is acceptable or not.  Thanks.

I'd suggest you to test this on P9 BE as well to ensure the test case
to work well on BE too.

> 
>                Carl
> 
> 
> ----------------------------------------
> rs6000: Add builtins for IEEE 128-bit floating point values
> 
> Add support for the following builtins:
> 
>  __vector unsigned long long int scalar_extract_exp_to_vec (__ieee128);
>  __vector unsigned __int128 scalar_extract_sig_to_vec (__ieee128);
>  __ieee128 scalar_insert_exp (__vector unsigned __int128,
>  			      __vector unsigned long long);
> 
> These builtins were requesed since there is no clean and performant way to

s/requesed/requested/

> transfer a value from a vector type and scalar type, despite the fact

Describe it oppositely?  As the related existing bifs returns scalar type,
the users want them in vector type, so it's "from scalar type to vector
type"?

> that they both reside in vector registers.

the fact is the related hardware insns have vsx registers destination.

> 
> gcc/
> 	* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin):
> 	Rename CCDE_FOR_xsxexpqp_tf to CODE_FOR_xsxexpqp_tf_di.
> 	Rename CODE_FOR_xsxexpqp_kf to CODE_FOR_xsxexpqp_kf_di.
> 	(CODE_FOR_xsxexpqp_kf_v2di, CODE_FOR_xsxsigqp_kf_v1ti,
> 	CODE_FOR_xsiexpqp_kf_v2di	): Add case statements.

unnecessary tab.

> 	* config/rs6000/rs6000-buildin.def (__builtin_extractf128_exp,
> 	 __builtin_extractf128_sig, __builtin_insertf128_exp): Add new
> 	builtin definitions.
> 	Rename xsxexpqp_kf, xsxsigqp_kf, xxsiexpqp_kf to xsexpqp_kf_di,

typo, xxsiexpqp_kf => xsiexpqp_kf

> 	xsxsigqp_kf_ti, xsiexpqp_kf_di respectively.
> 	* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
> 	Add else if for MODE_VECTOR_INT. Update comments.

May be better with "Update RS6000_OVLD_VEC_VSIE handling for MODE_VECTOR_INT
which is used for newly added overloaded instance"?

> 	* config/rs6000/rs6000-overload.def
> 	(__builtin_vec_scalar_insert_exp): Add new overload definition with
> 	vector arguments.
> 	(scalar_extract_exp_to_vec, scalar_extract_sig_to_vec): New
> 	odverloaded definitions.

s/odverloaded/overloaded/

> 	* config/vsx.md (VSEEQP_DI, VSESQP_TI): New mode iterators.
> 	(VSEEQP_DI_base): New mode attribute definition.
> 	Rename xsxexpqp_<mode> to sxexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>.
> 	Rename xsxsigqp_<mode> to xsxsigqp_<IEEE128:mode>_<VSESQP_TI:mode>.
> 	Rename xsiexpqp_<mode> to xsiexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>.
> 	(xsxsigqp_f128_<mode>, xsiexpqpf_f128_<mode>): Add define_insn for
> 	new builtins.
> 	* doc/extend.texi (__builtin_extractf128_exp,
> 	__builtin_extractf128_sig): Add documentation for new builtins.
> 	(scalar_insert_exp): Add new overloaded builtin definition.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/bfp/extract-exp-1.c: New test case.
> 	* gcc.target/powerpc/bfp/extract-sig-1.c: New test case.
> 	* gcc.target/powerpc/bfp/insert-exp-1.c: New test case.

May be better just with the existing base name and increasing number:

  extract-exp-1.c => scalar-extract-exp-8.c
  extract-sig-1.c => scalar-extract-sig-8.c
  insert-exp-1.c  => scalar-insert-exp-16.c

> ---
>  gcc/config/rs6000/rs6000-builtin.cc           | 21 +++--
>  gcc/config/rs6000/rs6000-builtins.def         | 15 ++-
>  gcc/config/rs6000/rs6000-c.cc                 | 10 +-
>  gcc/config/rs6000/rs6000-overload.def         | 10 ++
>  gcc/config/rs6000/vsx.md                      | 26 +++--
>  gcc/doc/extend.texi                           | 21 ++++-
>  .../gcc.target/powerpc/bfp/extract-exp-1.c    | 53 +++++++++++
>  .../gcc.target/powerpc/bfp/extract-sig-1.c    | 60 ++++++++++++
>  .../gcc.target/powerpc/bfp/insert-exp-1.c     | 94 +++++++++++++++++++
>  9 files changed, 284 insertions(+), 26 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-1.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
> index 534698e7d3e..a8f291c6a72 100644
> --- a/gcc/config/rs6000/rs6000-builtin.cc
> +++ b/gcc/config/rs6000/rs6000-builtin.cc
> @@ -3326,17 +3326,26 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
>        case CODE_FOR_fmakf4_odd:
>  	icode = CODE_FOR_fmatf4_odd;
>  	break;
> -      case CODE_FOR_xsxexpqp_kf:
> -	icode = CODE_FOR_xsxexpqp_tf;
> +      case CODE_FOR_xsxexpqp_kf_di:
> +	icode = CODE_FOR_xsxexpqp_tf_di;
>  	break;
> -      case CODE_FOR_xsxsigqp_kf:
> -	icode = CODE_FOR_xsxsigqp_tf;
> +      case CODE_FOR_xsxexpqp_kf_v2di:
> +	icode = CODE_FOR_xsxexpqp_tf_v2di;
> +	break;
> +      case CODE_FOR_xsxsigqp_kf_ti:
> +	icode = CODE_FOR_xsxsigqp_tf_ti;
> +	break;
> +      case CODE_FOR_xsxsigqp_kf_v1ti:
> +	icode = CODE_FOR_xsxsigqp_tf_v1ti;
>  	break;
>        case CODE_FOR_xststdcnegqp_kf:
>  	icode = CODE_FOR_xststdcnegqp_tf;
>  	break;
> -      case CODE_FOR_xsiexpqp_kf:
> -	icode = CODE_FOR_xsiexpqp_tf;
> +      case CODE_FOR_xsiexpqp_kf_di:
> +	icode = CODE_FOR_xsiexpqp_tf_di;
> +	break;
> +      case CODE_FOR_xsiexpqp_kf_v2di:
> +	icode = CODE_FOR_xsiexpqp_tf_v2di;
>  	break;
>        case CODE_FOR_xsiexpqpf_kf:
>  	icode = CODE_FOR_xsiexpqpf_tf;
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..6623cb8195d 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2901,20 +2901,29 @@
>    fpmath double __builtin_truncf128_round_to_odd (_Float128);
>      TRUNCF128_ODD trunckfdf2_odd {}
>  
> +  vull __builtin_scalar_extract_exp_to_vec (_Float128);
> +    EEXPKF xsxexpqp_kf_v2di {}

Since you have added one entry in overload.def for this, I think
it's better to name this as __builtin_**vsx**_scalar_extract_exp_to_vec
just like the others.  And sorry for nit-picking, the related bif
has the id VSEEQP, could we use VSEEQPV for the id here?

And move this entry just after its related __builtin_vsx_scalar_extract_expq.

> +
> +  vuq __builtin_scalar_extract_sig_to_vec (_Float128);
> +    ESIGKF xsxsigqp_kf_v1ti {}
> +

Similar comments like above applied for this one.


>    const signed long long __builtin_vsx_scalar_extract_expq (_Float128);
> -    VSEEQP xsxexpqp_kf {}
> +    VSEEQP xsxexpqp_kf_di {}
>  
>    const signed __int128 __builtin_vsx_scalar_extract_sigq (_Float128);
> -    VSESQP xsxsigqp_kf {}
> +    VSESQP xsxsigqp_kf_ti {}
>  
>    const _Float128 __builtin_vsx_scalar_insert_exp_q (unsigned __int128, \
>                                                       unsigned long long);
> -    VSIEQP xsiexpqp_kf {}
> +    VSIEQP xsiexpqp_kf_di {}
>  
>    const _Float128 __builtin_vsx_scalar_insert_exp_qp (_Float128, \
>                                                        unsigned long long);
>      VSIEQPF xsiexpqpf_kf {}
>  
> +  const _Float128 __builtin_vsx_scalar_insert_exp_vqp (vuq, vull);
> +    VSIEQPV xsiexpqp_kf_v2di {}
> +
>    const signed int __builtin_vsx_scalar_test_data_class_qp (_Float128, \
>                                                              const int<7>);
>      VSTDCQP xststdcqp_kf {}
> diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
> index 8555174d36e..11060f697db 100644
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -1929,11 +1929,15 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
>  	   128-bit variant of built-in function.  */
>  	if (GET_MODE_PRECISION (arg1_mode) > 64)
>  	  {
> -	    /* If first argument is of float variety, choose variant
> -	       that expects __ieee128 argument.  Otherwise, expect
> -	       __int128 argument.  */
> +	    /* If first argument is of float variety, choose the variant that
> +	       expects __ieee128 argument.  If the first argument is vector
> +	       int, choose the variant that expects vector unsigned
> +	       __int128 argument.  Otherwise, expect scalar __int128 argument.
> +	    */
>  	    if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
>  	      instance_code = RS6000_BIF_VSIEQPF;
> +	    else if (GET_MODE_CLASS (arg1_mode) == MODE_VECTOR_INT)
> +	      instance_code = RS6000_BIF_VSIEQPV;
>  	    else
>  	      instance_code = RS6000_BIF_VSIEQP;
>  	  }
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index c582490c084..1c503a4b3d3 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -4515,6 +4515,16 @@
>      VSIEQP
>    _Float128 __builtin_vec_scalar_insert_exp (_Float128, unsigned long long);
>      VSIEQPF
> +  _Float128 __builtin_vec_scalar_insert_exp (vuq, vull);
> +    VSIEQPV
> +
> +[VEC_VSEEV, scalar_extract_exp_to_vec, __builtin_scalar_extract_exp_to_vector]

Thanks for adding this, to make it more consistent with the existing others,
I think the latter name would be better with
__builtin_**vec**_scalar_extract_exp_to_vector.

> +  vull __builtin_scalar_extract_exp_to_vector (_Float128);
> +    EEXPKF

Need to update this bif id accordingly if the one in rs6000-buildin.def changes.

> +
> +[VEC_VSESV, scalar_extract_sig_to_vec, __builtin_scalar_extract_sig_to_vector]
> +  vuq __builtin_scalar_extract_sig_to_vector (_Float128);
> +    ESIGKF
>  

Ditto.

>  [VEC_VSTDC, scalar_test_data_class, __builtin_vec_scalar_test_data_class]
>    unsigned int __builtin_vec_scalar_test_data_class (float, const int);
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 7d845df5c2d..bb19b5ef170 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -396,6 +396,10 @@
>  			   V4SF
>  			   V2DF
>  			   V2DI])
> +(define_mode_iterator VSEEQP_DI  [V2DI DI])

Since you hoist this to one centralized place, I guess it may be better to name
it more general, like: V2DI_DI.

> +(define_mode_iterator VSESQP_TI  [V1TI TI])

VEC_TI matches the requirement here.

> +(define_mode_attr VSEEQP_DI_base [(V2DI "V1TI")
> +				  (DI   "TI")])

As above, may be better with name DI_to_TI.

>  
>  (define_mode_attr VM3_char [(V2DI "d")
>  			   (V4SI "w")
> @@ -5008,9 +5012,10 @@
>  ;; ISA 3.0 Binary Floating-Point Support
>  
>  ;; VSX Scalar Extract Exponent Quad-Precision
> -(define_insn "xsxexpqp_<mode>"
> -  [(set (match_operand:DI 0 "altivec_register_operand" "=v")
> -	(unspec:DI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> +(define_insn "xsxexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>"
> +  [(set (match_operand:VSEEQP_DI 0 "altivec_register_operand" "=v")
> +	(unspec:VSEEQP_DI
> +	  [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
>  	 UNSPEC_VSX_SXEXPDP))]
>    "TARGET_P9_VECTOR"
>    "xsxexpqp %0,%1"
> @@ -5026,9 +5031,10 @@
>    [(set_attr "type" "integer")])
>  
>  ;; VSX Scalar Extract Significand Quad-Precision
> -(define_insn "xsxsigqp_<mode>"
> -  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
> -	(unspec:TI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> +(define_insn "xsxsigqp_<IEEE128:mode>_<VSESQP_TI:mode>"
> +  [(set (match_operand:VSESQP_TI 0 "altivec_register_operand" "=v")
> +	(unspec:VSESQP_TI [(match_operand:IEEE128 1
> +			    "altivec_register_operand" "v")]
>  	 UNSPEC_VSX_SXSIG))]
>    "TARGET_P9_VECTOR"
>    "xsxsigqp %0,%1"
> @@ -5055,10 +5061,12 @@
>    [(set_attr "type" "vecmove")])
>  
>  ;; VSX Scalar Insert Exponent Quad-Precision
> -(define_insn "xsiexpqp_<mode>"
> +(define_insn "xsiexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>"
>    [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> -	(unspec:IEEE128 [(match_operand:TI 1 "altivec_register_operand" "v")
> -			 (match_operand:DI 2 "altivec_register_operand" "v")]
> +	(unspec:IEEE128 [(match_operand:<VSEEQP_DI_base> 1
> +			  "altivec_register_operand" "v")
> +			 (match_operand:VSEEQP_DI 2
> +			  "altivec_register_operand" "v")]
>  	 UNSPEC_VSX_SIEXPQP))]
>    "TARGET_P9_VECTOR"
>    "xsiexpqp %0,%1,%2"
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e426a2eb7d8..6da7ae9d94c 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -19724,6 +19724,10 @@ double scalar_insert_exp (double significand, unsigned long long int exponent);
>  ieee_128 scalar_insert_exp (unsigned __int128 significand,
>                              unsigned long long int exponent);
>  ieee_128 scalar_insert_exp (ieee_128 significand, unsigned long long int exponent);
> +vector ieee_128 scalar_insert_exp (vector unsigned __int128,
> +                                   vector unsigned long long);
> +vector unsigned long long scalar_extract_exp_to_vec (ieee_128);
> +vector unsigned __int128  scalar_extract_sig_to_vec (ieee_128);
>  
>  int scalar_cmp_exp_gt (double arg1, double arg2);
>  int scalar_cmp_exp_lt (double arg1, double arg2);
> @@ -19771,11 +19775,18 @@ of the result are composed of the least significant 11 bits of the
>  
>  When supplied with a 128-bit first argument, the
>  @code{scalar_insert_exp} built-in function returns a quad-precision
> -ieee floating point value.  The sign bit of the result is copied from
> -the most significant bit of the @code{significand} argument.
> -The significand and exponent components of the result are composed of
> -the least significant 15 bits of the @code{exponent} argument and the
> -least significant 112 bits of the @code{significand} argument respectively.
> +ieee floating point value if the two arguments were scalar.  If the two
> +arguments are vectors, the retun value is a vector ieee floating point value.

s/retun/return/

> +The sign bit of the result is copied from the most significant bit of the
> +@code{significand} argument.  The significand and exponent components of the
> +result are composed of the least significant 15 bits of the @code{exponent}
> +argument and the least significant 112 bits of the @code{significand} argument
> +respectively.

The @code{exponent} now is a vector and has two elements, you should note
which element matters here, element zero on big-endian while element one
on little-endian will be used.

Since this document quoted both @code{significand} and @code{significand},
the above prototype should have the argument name as well.  :)

+vector ieee_128 scalar_insert_exp (vector unsigned __int128 significand,
+                                   vector unsigned long long exponent);


The others look good to me, thanks!

BR,
Kewen


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH ver 4] rs6000: Add builtins for IEEE 128-bit floating point values
  2023-06-15  6:23 ` Kewen.Lin
@ 2023-06-16 17:57   ` Carl Love
  0 siblings, 0 replies; 3+ messages in thread
From: Carl Love @ 2023-06-16 17:57 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: Segher Boessenkool, Peter Bergner, dje.gcc, gcc-patches

On Thu, 2023-06-15 at 14:23 +0800, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2023/6/15 04:37, Carl Love wrote:
> > Kewen, GCC maintainers:
> > 
> > Version 4, added missing cases for new xxexpqp, xsxexpdp and
> > xsxsigqp
> > cases to rs6000_expand_builtin.  Merged the new define_insn
> > definitions
> > with the existing definitions.  Renamed the builtins by removing
> > the
> > __builtin_ prefix from the names.  Fixed the documentation for the
> > builtins.  Updated the test files to check the desired instructions
> > were generated.  Retested patch on Power 10 with no regressions.
> > 
> > Version 3, was able to get the overloaded version of
> > scalar_insert_exp
> > to work and the change to xsxexpqp_f128_<mode> define instruction
> > to
> > work with the suggestions from Kewen.  
> > 
> > Version 2, I have addressed the various comments from Kewen.  I had
> > issues with adding an additional overloaded version of
> > scalar_insert_exp with vector arguments.  The overload
> > infrastructure
> > didn't work with a mix of scalar and vector arguments.  I did
> > rename
> > the __builtin_insertf128_exp to __builtin_vsx_scalar_insert_exp_qp
> > make
> > it similar to the existing builtin.  I also wasn't able to get the
> > suggested merge of xsxexpqp_f128_<mode> with xsxexpqp_<mode> to
> > work so
> > I left the two simpler definitiions.
> > 
> > The patch add three new builtins to extract the significand and
> > exponent of an IEEE float 128-bit value where the builtin argument
> > is a
> > vector.  Additionally, a builtin to insert the exponent into an
> > IEEE
> > float 128-bit vector argument is added.  These builtins were
> > requested
> > since there is no clean and optimal way to transfer between a
> > vector
> > and a scalar IEEE 128 bit value.
> > 
> > The patch has been tested on Power 10 with no regressions.  Please
> > let
> > me know if the patch is acceptable or not.  Thanks.
> 
> I'd suggest you to test this on P9 BE as well to ensure the test case
> to work well on BE too.

Tested on P9 BE.  Updated test cases for the correct expected BE and LE
results.

> 
> >                Carl
> > 
> > 
> > ----------------------------------------
> > rs6000: Add builtins for IEEE 128-bit floating point values
> > 
> > Add support for the following builtins:
> > 
> >  __vector unsigned long long int scalar_extract_exp_to_vec
> > (__ieee128);
> >  __vector unsigned __int128 scalar_extract_sig_to_vec (__ieee128);
> >  __ieee128 scalar_insert_exp (__vector unsigned __int128,
> >  			      __vector unsigned long long);
> > 
> > These builtins were requesed since there is no clean and performant
> > way to
> 
> s/requesed/requested/

Fixed.

> 
> > transfer a value from a vector type and scalar type, despite the
> > fact
> 
> Describe it oppositely?  As the related existing bifs returns scalar
> type,
> the users want them in vector type, so it's "from scalar type to
> vector
> type"?

Updated the description.

> 
> > that they both reside in vector registers.
> 
> the fact is the related hardware insns have vsx registers
> destination.
> 
> > gcc/
> > 	* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin):
> > 	Rename CCDE_FOR_xsxexpqp_tf to CODE_FOR_xsxexpqp_tf_di.
> > 	Rename CODE_FOR_xsxexpqp_kf to CODE_FOR_xsxexpqp_kf_di.
> > 	(CODE_FOR_xsxexpqp_kf_v2di, CODE_FOR_xsxsigqp_kf_v1ti,
> > 	CODE_FOR_xsiexpqp_kf_v2di	): Add case statements.
> 
> unnecessary tab.

Fixed.

> 
> > 	* config/rs6000/rs6000-buildin.def (__builtin_extractf128_exp,
> > 	 __builtin_extractf128_sig, __builtin_insertf128_exp): Add new
> > 	builtin definitions.
> > 	Rename xsxexpqp_kf, xsxsigqp_kf, xxsiexpqp_kf to xsexpqp_kf_di,
> 
> typo, xxsiexpqp_kf => xsiexpqp_kf
> 
> > 	xsxsigqp_kf_ti, xsiexpqp_kf_di respectively.
> > 	* config/rs6000/rs6000-c.cc
> > (altivec_resolve_overloaded_builtin):
> > 	Add else if for MODE_VECTOR_INT. Update comments.
> 
> May be better with "Update RS6000_OVLD_VEC_VSIE handling for
> MODE_VECTOR_INT
> which is used for newly added overloaded instance"?

Changed.

> 
> > 	* config/rs6000/rs6000-overload.def
> > 	(__builtin_vec_scalar_insert_exp): Add new overload definition
> > with
> > 	vector arguments.
> > 	(scalar_extract_exp_to_vec, scalar_extract_sig_to_vec): New
> > 	odverloaded definitions.
> 
> s/odverloaded/overloaded/

Fixed.

> 
> > 	* config/vsx.md (VSEEQP_DI, VSESQP_TI): New mode iterators.
> > 	(VSEEQP_DI_base): New mode attribute definition.
> > 	Rename xsxexpqp_<mode> to
> > sxexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>.
> > 	Rename xsxsigqp_<mode> to
> > xsxsigqp_<IEEE128:mode>_<VSESQP_TI:mode>.
> > 	Rename xsiexpqp_<mode> to
> > xsiexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>.
> > 	(xsxsigqp_f128_<mode>, xsiexpqpf_f128_<mode>): Add define_insn
> > for
> > 	new builtins.
> > 	* doc/extend.texi (__builtin_extractf128_exp,
> > 	__builtin_extractf128_sig): Add documentation for new builtins.
> > 	(scalar_insert_exp): Add new overloaded builtin definition.
> > 
> > gcc/testsuite/
> > 	* gcc.target/powerpc/bfp/extract-exp-1.c: New test case.
> > 	* gcc.target/powerpc/bfp/extract-sig-1.c: New test case.
> > 	* gcc.target/powerpc/bfp/insert-exp-1.c: New test case.
> 
> May be better just with the existing base name and increasing number:

Renamed test files.

> 
>   extract-exp-1.c => scalar-extract-exp-8.c
>   extract-sig-1.c => scalar-extract-sig-8.c
>   insert-exp-1.c  => scalar-insert-exp-16.c
> 
> > ---
> >  gcc/config/rs6000/rs6000-builtin.cc           | 21 +++--
> >  gcc/config/rs6000/rs6000-builtins.def         | 15 ++-
> >  gcc/config/rs6000/rs6000-c.cc                 | 10 +-
> >  gcc/config/rs6000/rs6000-overload.def         | 10 ++
> >  gcc/config/rs6000/vsx.md                      | 26 +++--
> >  gcc/doc/extend.texi                           | 21 ++++-
> >  .../gcc.target/powerpc/bfp/extract-exp-1.c    | 53 +++++++++++
> >  .../gcc.target/powerpc/bfp/extract-sig-1.c    | 60 ++++++++++++
> >  .../gcc.target/powerpc/bfp/insert-exp-1.c     | 94
> > +++++++++++++++++++
> >  9 files changed, 284 insertions(+), 26 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-
> > exp-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-
> > sig-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/insert-
> > exp-1.c
> > 
> > diff --git a/gcc/config/rs6000/rs6000-builtin.cc
> > b/gcc/config/rs6000/rs6000-builtin.cc
> > index 534698e7d3e..a8f291c6a72 100644
> > --- a/gcc/config/rs6000/rs6000-builtin.cc
> > +++ b/gcc/config/rs6000/rs6000-builtin.cc
> > @@ -3326,17 +3326,26 @@ rs6000_expand_builtin (tree exp, rtx
> > target, rtx /* subtarget */,
> >        case CODE_FOR_fmakf4_odd:
> >  	icode = CODE_FOR_fmatf4_odd;
> >  	break;
> > -      case CODE_FOR_xsxexpqp_kf:
> > -	icode = CODE_FOR_xsxexpqp_tf;
> > +      case CODE_FOR_xsxexpqp_kf_di:
> > +	icode = CODE_FOR_xsxexpqp_tf_di;
> >  	break;
> > -      case CODE_FOR_xsxsigqp_kf:
> > -	icode = CODE_FOR_xsxsigqp_tf;
> > +      case CODE_FOR_xsxexpqp_kf_v2di:
> > +	icode = CODE_FOR_xsxexpqp_tf_v2di;
> > +	break;
> > +      case CODE_FOR_xsxsigqp_kf_ti:
> > +	icode = CODE_FOR_xsxsigqp_tf_ti;
> > +	break;
> > +      case CODE_FOR_xsxsigqp_kf_v1ti:
> > +	icode = CODE_FOR_xsxsigqp_tf_v1ti;
> >  	break;
> >        case CODE_FOR_xststdcnegqp_kf:
> >  	icode = CODE_FOR_xststdcnegqp_tf;
> >  	break;
> > -      case CODE_FOR_xsiexpqp_kf:
> > -	icode = CODE_FOR_xsiexpqp_tf;
> > +      case CODE_FOR_xsiexpqp_kf_di:
> > +	icode = CODE_FOR_xsiexpqp_tf_di;
> > +	break;
> > +      case CODE_FOR_xsiexpqp_kf_v2di:
> > +	icode = CODE_FOR_xsiexpqp_tf_v2di;
> >  	break;
> >        case CODE_FOR_xsiexpqpf_kf:
> >  	icode = CODE_FOR_xsiexpqpf_tf;
> > diff --git a/gcc/config/rs6000/rs6000-builtins.def
> > b/gcc/config/rs6000/rs6000-builtins.def
> > index 638d0bc72ca..6623cb8195d 100644
> > --- a/gcc/config/rs6000/rs6000-builtins.def
> > +++ b/gcc/config/rs6000/rs6000-builtins.def
> > @@ -2901,20 +2901,29 @@
> >    fpmath double __builtin_truncf128_round_to_odd (_Float128);
> >      TRUNCF128_ODD trunckfdf2_odd {}
> >  
> > +  vull __builtin_scalar_extract_exp_to_vec (_Float128);
> > +    EEXPKF xsxexpqp_kf_v2di {}
> 
> Since you have added one entry in overload.def for this, I think
> it's better to name this as
> __builtin_**vsx**_scalar_extract_exp_to_vec
> just like the others.  And sorry for nit-picking, the related bif
> has the id VSEEQP, could we use VSEEQPV for the id here?
> 
> And move this entry just after its related
> __builtin_vsx_scalar_extract_expq.
> 

Renamed and moved as requested.

> > +
> > +  vuq __builtin_scalar_extract_sig_to_vec (_Float128);
> > +    ESIGKF xsxsigqp_kf_v1ti {}
> > +
> 
> Similar comments like above applied for this one.

Also fixed.

> 
> 
> >    const signed long long __builtin_vsx_scalar_extract_expq
> > (_Float128);
> > -    VSEEQP xsxexpqp_kf {}
> > +    VSEEQP xsxexpqp_kf_di {}
> >  
> >    const signed __int128 __builtin_vsx_scalar_extract_sigq
> > (_Float128);
> > -    VSESQP xsxsigqp_kf {}
> > +    VSESQP xsxsigqp_kf_ti {}
> >  
> >    const _Float128 __builtin_vsx_scalar_insert_exp_q (unsigned
> > __int128, \
> >                                                       unsigned long
> > long);
> > -    VSIEQP xsiexpqp_kf {}
> > +    VSIEQP xsiexpqp_kf_di {}
> >  
> >    const _Float128 __builtin_vsx_scalar_insert_exp_qp (_Float128, \
> >                                                        unsigned
> > long long);
> >      VSIEQPF xsiexpqpf_kf {}
> >  
> > +  const _Float128 __builtin_vsx_scalar_insert_exp_vqp (vuq, vull);
> > +    VSIEQPV xsiexpqp_kf_v2di {}
> > +
> >    const signed int __builtin_vsx_scalar_test_data_class_qp
> > (_Float128, \
> >                                                              const
> > int<7>);
> >      VSTDCQP xststdcqp_kf {}
> > diff --git a/gcc/config/rs6000/rs6000-c.cc
> > b/gcc/config/rs6000/rs6000-c.cc
> > index 8555174d36e..11060f697db 100644
> > --- a/gcc/config/rs6000/rs6000-c.cc
> > +++ b/gcc/config/rs6000/rs6000-c.cc
> > @@ -1929,11 +1929,15 @@ altivec_resolve_overloaded_builtin
> > (location_t loc, tree fndecl,
> >  	   128-bit variant of built-in function.  */
> >  	if (GET_MODE_PRECISION (arg1_mode) > 64)
> >  	  {
> > -	    /* If first argument is of float variety, choose variant
> > -	       that expects __ieee128 argument.  Otherwise, expect
> > -	       __int128 argument.  */
> > +	    /* If first argument is of float variety, choose the
> > variant that
> > +	       expects __ieee128 argument.  If the first argument is
> > vector
> > +	       int, choose the variant that expects vector unsigned
> > +	       __int128 argument.  Otherwise, expect scalar __int128
> > argument.
> > +	    */
> >  	    if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
> >  	      instance_code = RS6000_BIF_VSIEQPF;
> > +	    else if (GET_MODE_CLASS (arg1_mode) == MODE_VECTOR_INT)
> > +	      instance_code = RS6000_BIF_VSIEQPV;
> >  	    else
> >  	      instance_code = RS6000_BIF_VSIEQP;
> >  	  }
> > diff --git a/gcc/config/rs6000/rs6000-overload.def
> > b/gcc/config/rs6000/rs6000-overload.def
> > index c582490c084..1c503a4b3d3 100644
> > --- a/gcc/config/rs6000/rs6000-overload.def
> > +++ b/gcc/config/rs6000/rs6000-overload.def
> > @@ -4515,6 +4515,16 @@
> >      VSIEQP
> >    _Float128 __builtin_vec_scalar_insert_exp (_Float128, unsigned
> > long long);
> >      VSIEQPF
> > +  _Float128 __builtin_vec_scalar_insert_exp (vuq, vull);
> > +    VSIEQPV
> > +
> > +[VEC_VSEEV, scalar_extract_exp_to_vec,
> > __builtin_scalar_extract_exp_to_vector]
> 
> Thanks for adding this, to make it more consistent with the existing
> others,
> I think the latter name would be better with
> __builtin_**vec**_scalar_extract_exp_to_vector.

Renamed. 

> 
> > +  vull __builtin_scalar_extract_exp_to_vector (_Float128);
> > +    EEXPKF
> 
> Need to update this bif id accordingly if the one in rs6000-
> buildin.def changes.

Changed. 

> 
> > +
> > +[VEC_VSESV, scalar_extract_sig_to_vec,
> > __builtin_scalar_extract_sig_to_vector]
> > +  vuq __builtin_scalar_extract_sig_to_vector (_Float128);
> > +    ESIGKF
> >  
> 
> Ditto.
> 

Fixed.


> >  [VEC_VSTDC, scalar_test_data_class,
> > __builtin_vec_scalar_test_data_class]
> >    unsigned int __builtin_vec_scalar_test_data_class (float, const
> > int);
> > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> > index 7d845df5c2d..bb19b5ef170 100644
> > --- a/gcc/config/rs6000/vsx.md
> > +++ b/gcc/config/rs6000/vsx.md
> > @@ -396,6 +396,10 @@
> >  			   V4SF
> >  			   V2DF
> >  			   V2DI])
> > +(define_mode_iterator VSEEQP_DI  [V2DI DI])
> 
> Since you hoist this to one centralized place, I guess it may be
> better to name
> it more general, like: V2DI_DI.

Changed name as suggested.

> 
> > +(define_mode_iterator VSESQP_TI  [V1TI TI])
> 
> VEC_TI matches the requirement here.

Missed that definition.  Replaced VSESQP_TI with VEC_TI.

> 
> > +(define_mode_attr VSEEQP_DI_base [(V2DI "V1TI")
> > +				  (DI   "TI")])
> 
> As above, may be better with name DI_to_TI.

Renamed.

> 
> >  
> >  (define_mode_attr VM3_char [(V2DI "d")
> >  			   (V4SI "w")
> > @@ -5008,9 +5012,10 @@
> >  ;; ISA 3.0 Binary Floating-Point Support
> >  
> >  ;; VSX Scalar Extract Exponent Quad-Precision
> > -(define_insn "xsxexpqp_<mode>"
> > -  [(set (match_operand:DI 0 "altivec_register_operand" "=v")
> > -	(unspec:DI [(match_operand:IEEE128 1 "altivec_register_operand"
> > "v")]
> > +(define_insn "xsxexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>"
> > +  [(set (match_operand:VSEEQP_DI 0 "altivec_register_operand"
> > "=v")
> > +	(unspec:VSEEQP_DI
> > +	  [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> >  	 UNSPEC_VSX_SXEXPDP))]
> >    "TARGET_P9_VECTOR"
> >    "xsxexpqp %0,%1"
> > @@ -5026,9 +5031,10 @@
> >    [(set_attr "type" "integer")])
> >  
> >  ;; VSX Scalar Extract Significand Quad-Precision
> > -(define_insn "xsxsigqp_<mode>"
> > -  [(set (match_operand:TI 0 "altivec_register_operand" "=v")
> > -	(unspec:TI [(match_operand:IEEE128 1 "altivec_register_operand"
> > "v")]
> > +(define_insn "xsxsigqp_<IEEE128:mode>_<VSESQP_TI:mode>"
> > +  [(set (match_operand:VSESQP_TI 0 "altivec_register_operand"
> > "=v")
> > +	(unspec:VSESQP_TI [(match_operand:IEEE128 1
> > +			    "altivec_register_operand" "v")]
> >  	 UNSPEC_VSX_SXSIG))]
> >    "TARGET_P9_VECTOR"
> >    "xsxsigqp %0,%1"
> > @@ -5055,10 +5061,12 @@
> >    [(set_attr "type" "vecmove")])
> >  
> >  ;; VSX Scalar Insert Exponent Quad-Precision
> > -(define_insn "xsiexpqp_<mode>"
> > +(define_insn "xsiexpqp_<IEEE128:mode>_<VSEEQP_DI:mode>"
> >    [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > -	(unspec:IEEE128 [(match_operand:TI 1 "altivec_register_operand"
> > "v")
> > -			 (match_operand:DI 2 "altivec_register_operand"
> > "v")]
> > +	(unspec:IEEE128 [(match_operand:<VSEEQP_DI_base> 1
> > +			  "altivec_register_operand" "v")
> > +			 (match_operand:VSEEQP_DI 2
> > +			  "altivec_register_operand" "v")]
> >  	 UNSPEC_VSX_SIEXPQP))]
> >    "TARGET_P9_VECTOR"
> >    "xsiexpqp %0,%1,%2"
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index e426a2eb7d8..6da7ae9d94c 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -19724,6 +19724,10 @@ double scalar_insert_exp (double
> > significand, unsigned long long int exponent);
> >  ieee_128 scalar_insert_exp (unsigned __int128 significand,
> >                              unsigned long long int exponent);
> >  ieee_128 scalar_insert_exp (ieee_128 significand, unsigned long
> > long int exponent);
> > +vector ieee_128 scalar_insert_exp (vector unsigned __int128,
> > +                                   vector unsigned long long);
> > +vector unsigned long long scalar_extract_exp_to_vec (ieee_128);
> > +vector unsigned __int128  scalar_extract_sig_to_vec (ieee_128);
> >  
> >  int scalar_cmp_exp_gt (double arg1, double arg2);
> >  int scalar_cmp_exp_lt (double arg1, double arg2);
> > @@ -19771,11 +19775,18 @@ of the result are composed of the least
> > significant 11 bits of the
> >  
> >  When supplied with a 128-bit first argument, the
> >  @code{scalar_insert_exp} built-in function returns a quad-
> > precision
> > -ieee floating point value.  The sign bit of the result is copied
> > from
> > -the most significant bit of the @code{significand} argument.
> > -The significand and exponent components of the result are composed
> > of
> > -the least significant 15 bits of the @code{exponent} argument and
> > the
> > -least significant 112 bits of the @code{significand} argument
> > respectively.
> > +ieee floating point value if the two arguments were scalar.  If
> > the two
> > +arguments are vectors, the retun value is a vector ieee floating
> > point value.
> 
> s/retun/return/

fixed.

> 
> > +The sign bit of the result is copied from the most significant bit
> > of the
> > +@code{significand} argument.  The significand and exponent
> > components of the
> > +result are composed of the least significant 15 bits of the
> > @code{exponent}
> > +argument and the least significant 112 bits of the
> > @code{significand} argument
> > +respectively.
> 
> The @code{exponent} now is a vector and has two elements, you should
> note
> which element matters here, element zero on big-endian while element
> one
> on little-endian will be used.
> 
> Since this document quoted both @code{significand} and
> @code{significand},
> the above prototype should have the argument name as well.  :)
> 
> +vector ieee_128 scalar_insert_exp (vector unsigned __int128
> significand,
> +                                   vector unsigned long long
> exponent);
> 

Ah, didn't notice that I missed adding the argument names.  Fixed.

               Carl 


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-06-16 17:57 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-14 20:37 [PATCH ver 4] rs6000: Add builtins for IEEE 128-bit floating point values Carl Love
2023-06-15  6:23 ` Kewen.Lin
2023-06-16 17:57   ` Carl Love

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).