[PATCH 0/13 ver4] rs6000, built-in cleanup patch series

public inbox for gcc-patches@gcc.gnu.org

* [PATCH 0/13 ver4] rs6000, built-in cleanup patch series
@ 2024-06-13 19:23 Carl Love
  2024-06-13 19:40 ` [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins Carl Love
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Carl Love @ 2024-06-13 19:23 UTC
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner, Carl Love

GCC maintainers:

I have addressed the comments on the five patches in the series that have not yet been approved.
The patches that have already been approved are 1, 3, 5, 6, 8, 9, 10, and 12.

The remaining patches all have fairly minor fixes requested, so I am posting only version 4 of those patches here.  The goal is to commit the entire series at once, since the patches are all related, so I am holding off on committing the approved patches.

Thank you for your time and feedback on these patches.  The entire patch series has been tested on Power 10 LE and Power 9 BE with no regression failures.

                       Carl 


* [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins.
  2024-06-13 19:23 [PATCH 0/13 ver4] rs6000, built-in cleanup patch series Carl Love
@ 2024-06-13 19:40 ` Carl Love
  2024-06-19  3:03   ` Kewen.Lin
  2024-06-13 19:40 ` [PATCH 4/13 ver4] rs6000, extend the current vec_{un,}signed{e,o} built-ins Carl Love
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Carl Love @ 2024-06-13 19:40 UTC
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner, Carl Love

GCC maintainers:

Per the comments on patch 0004 from version 3, the removal of the built-ins __builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws was moved to this patch.  The rest of the patch is unchanged from version 3.  There were no comments on this patch for version 3.

Please let me know if this patch is acceptable.  Thanks.

                            Carl 

-----------------------------------------------------

rs6000, Remove __builtin_vsx_xvcvspsxws,
 __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins.

The built-in __builtin_vsx_xvcvspsxws is a duplicate of the vec_signed
built-in that is documented in the PVIPR.  The __builtin_vsx_xvcvspsxws
built-in is not documented and there are no test cases for it.

The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
vec_unsigned, remove.

The __builtin_vsx_xvcvspuxws is redundant as it is covered by
vec_unsigned, remove.

This patch removes the redundant built-ins.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxws,
	__builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws):
	Remove built-in definitions.
---
 gcc/config/rs6000/rs6000-builtins.def | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 7c36976a089..8cf0b715898 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1697,9 +1697,6 @@
   const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
     XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}

-  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
-    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
-
   const vsi __builtin_vsx_xvcvdpuxws (vd);
     XVCVDPUXWS vsx_xvcvdpuxws {}

@@ -1709,15 +1706,9 @@
   const vsll __builtin_vsx_xvcvspsxds (vf);
     XVCVSPSXDS vsx_xvcvspsxds {}

-  const vsi __builtin_vsx_xvcvspsxws (vf);
-    XVCVSPSXWS vsx_fix_truncv4sfv4si2 {}
-
   const vsll __builtin_vsx_xvcvspuxds (vf);
     XVCVSPUXDS vsx_xvcvspuxds {}

-  const vsi __builtin_vsx_xvcvspuxws (vf);
-    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
-
   const vd __builtin_vsx_xvcvsxddp (vsll);
     XVCVSXDDP vsx_floatv2div2df2 {}

-- 
2.45.0


* [PATCH 4/13 ver4] rs6000, extend the current vec_{un,}signed{e,o} built-ins
  2024-06-13 19:23 [PATCH 0/13 ver4] rs6000, built-in cleanup patch series Carl Love
  2024-06-13 19:40 ` [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins Carl Love
@ 2024-06-13 19:40 ` Carl Love
  2024-06-19  3:03   ` Kewen.Lin
  2024-06-13 19:40 ` [PATCH 7/13 ver4] rs6000, add overloaded vec_sel with int128 arguments Carl Love
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Carl Love @ 2024-06-13 19:40 UTC
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner, Carl Love


GCC maintainers:

As noted, the removal of __builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws was moved to patch 2 in the series.  The patch has been updated per the comments from version 3.

Please let me know if this patch is acceptable for mainline.  

                             Carl 

------------------------------------------------------------------

rs6000, extend the current vec_{un,}signed{e,o} built-ins

The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
convert a vector of floats to signed/unsigned long long ints.  Extend the
existing vec_{un,}signed{e,o} built-ins to accept a vector of floats and
return the even/odd elements converted to signed/unsigned long long ints.

The define_expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf, and
vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
built-ins.

The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
now for internal use only. They are not documented and they do not
have testcases.

The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
vec_signed{e,o}, remove.

The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
vec_unsigned{e,o}, remove.

Add testcases and update documentation.
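
The even/odd selection described above can be illustrated with a portable scalar model.  This is only a sketch and not code from the patch: the names model_to_s64, model_to_u64, model_vec_signed_eo, model_vec_unsigned_eo and model_self_check are hypothetical, plain arrays stand in for the vector types, and out-of-range inputs are deliberately not modelled.  The sample values are the ones used in the new testcases.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for one signed lane: truncate toward zero.
   Inputs outside the int64_t range (and NaN) are not modelled.  */
static int64_t
model_to_s64 (float x)
{
  return (int64_t) x;
}

/* Hypothetical scalar stand-in for one unsigned lane.  Negative inputs map
   to 0, matching the expected values in the patch's testcase.  NaN also
   takes that branch here, which is an assumption of this sketch; inputs
   too large for uint64_t are not modelled.  */
static uint64_t
model_to_u64 (float x)
{
  return x > 0.0f ? (uint64_t) x : 0;
}

/* odd == 0 converts elements 0 and 2; odd == 1 converts elements 1 and 3.  */
static void
model_vec_signed_eo (const float in[4], int odd, int64_t out[2])
{
  out[0] = model_to_s64 (in[odd]);
  out[1] = model_to_s64 (in[2 + odd]);
}

static void
model_vec_unsigned_eo (const float in[4], int odd, uint64_t out[2])
{
  out[0] = model_to_u64 (in[odd]);
  out[1] = model_to_u64 (in[2 + odd]);
}

/* Self-check using the input and expected values from the testcase.  */
static void
model_self_check (void)
{
  const float in[4] = { 14.930f, 834.49f, -3.3f, -5.4f };
  int64_t s[2];
  uint64_t u[2];

  model_vec_signed_eo (in, 0, s);      /* models vec_signede */
  assert (s[0] == 14 && s[1] == -3);
  model_vec_signed_eo (in, 1, s);      /* models vec_signedo */
  assert (s[0] == 834 && s[1] == -5);
  model_vec_unsigned_eo (in, 0, u);    /* models vec_unsignede */
  assert (u[0] == 14 && u[1] == 0);
  model_vec_unsigned_eo (in, 1, u);    /* models vec_unsignedo */
  assert (u[0] == 834 && u[1] == 0);
}
```

Calling model_self_check () from a test driver exercises all four cases.  The model says nothing about how the hardware instruction picks its input words on big-endian versus little-endian targets; that is handled by the define_expands in the patch.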

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvdpsxws,
	__builtin_vsx_xvcvdpuxws): Removed.
	(__builtin_vsx_xvcvspsxds, __builtin_vsx_xvcvspuxds): Renamed
	__builtin_vsignede_v4sf, __builtin_vunsignede_v4sf respectively.
	(XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
	VEC_VUNSIGNEDE_V4SF respectively.
	(__builtin_vsignedo_v4sf, __builtin_vunsignedo_v4sf): New
	built-in definitions.
	* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
	vec_unsignede, vec_unsignedo): Add new overloaded specifications.
	* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
	vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
	* doc/extend.texi (vec_signedo, vec_signede): Add documentation
	for new overloaded built-ins.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/builtins-3-runnable.c
	(test_unsigned_int_result, test_ll_unsigned_int_result): Add
	new argument.
	(vec_signede, vec_signedo, vec_unsignede, vec_unsignedo): New
	tests for the overloaded built-ins.
---
 gcc/config/rs6000/rs6000-builtins.def         | 20 ++---
 gcc/config/rs6000/rs6000-overload.def         |  8 ++
 gcc/config/rs6000/vsx.md                      | 84 +++++++++++++++++++
 gcc/doc/extend.texi                           | 10 +++
 .../gcc.target/powerpc/builtins-3-runnable.c  | 49 +++++++++--
 5 files changed, 154 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 322d27b7a0d..29a9deb3410 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1688,26 +1688,26 @@
   const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
     XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
 
-  const vsi __builtin_vsx_xvcvdpsxws (vd);
-    XVCVDPSXWS vsx_xvcvdpsxws {}
-
   const vsll __builtin_vsx_xvcvdpuxds (vd);
     XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
 
   const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
     XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
 
-  const vsi __builtin_vsx_xvcvdpuxws (vd);
-    XVCVDPUXWS vsx_xvcvdpuxws {}
-
   const vd __builtin_vsx_xvcvspdp (vf);
     XVCVSPDP vsx_xvcvspdp {}
 
-  const vsll __builtin_vsx_xvcvspsxds (vf);
-    XVCVSPSXDS vsx_xvcvspsxds {}
+  const vsll __builtin_vsignede_v4sf (vf);
+    VEC_VSIGNEDE_V4SF vsignede_v4sf {}
+
+  const vsll __builtin_vsignedo_v4sf (vf);
+    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
+
+  const vull __builtin_vunsignede_v4sf (vf);
+    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
 
-  const vsll __builtin_vsx_xvcvspuxds (vf);
-    XVCVSPUXDS vsx_xvcvspuxds {}
+  const vull __builtin_vunsignedo_v4sf (vf);
+    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
 
   const vd __builtin_vsx_xvcvsxddp (vsll);
     XVCVSXDDP vsx_floatv2div2df2 {}
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 84bd9ae6554..4d857bb1af3 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3307,10 +3307,14 @@
 [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
   vsi __builtin_vec_vsignede (vd);
     VEC_VSIGNEDE_V2DF
+  vsll __builtin_vec_vsignede (vf);
+    VEC_VSIGNEDE_V4SF
 
 [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
   vsi __builtin_vec_vsignedo (vd);
     VEC_VSIGNEDO_V2DF
+  vsll __builtin_vec_vsignedo (vf);
+    VEC_VSIGNEDO_V4SF
 
 [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
   vsi __builtin_vec_signexti (vsc);
@@ -4433,10 +4437,14 @@
 [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
   vui __builtin_vec_vunsignede (vd);
     VEC_VUNSIGNEDE_V2DF
+  vull __builtin_vec_vunsignede (vf);
+    VEC_VUNSIGNEDE_V4SF
 
 [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
   vui __builtin_vec_vunsignedo (vd);
     VEC_VUNSIGNEDO_V2DF
+  vull __builtin_vec_vunsignedo (vf);
+    VEC_VUNSIGNEDO_V4SF
 
 [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
   vui __builtin_vec_extract_exp (vf);
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f135fa079bd..0187bdbf90e 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2704,6 +2704,90 @@
   DONE;
 })
 
+;; Convert float vector even elements to signed long long vector
+(define_expand "vsignede_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));
+  else
+    {
+      /* Shift left one word to put even word in correct location */
+      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+      rtx rtx_val = GEN_INT (4);
+      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
+    }
+
+  DONE;
+})
+
+;; Convert float vector odd elements to signed long long vector
+(define_expand "vsignedo_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      /* Shift left one word to put odd word in correct location */
+      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+      rtx rtx_val = GEN_INT (4);
+      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+      emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
+    }
+  else
+    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
+
+  DONE;
+})
+
+;; Convert even vector elements to unsigned long long vector
+(define_expand "vunsignede_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
+  else
+    {
+      /* Shift left one word to put even word in correct location */
+      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+      rtx rtx_val = GEN_INT (4);
+      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
+    }
+
+  DONE;
+})
+
+;; Convert odd vector elements to unsigned long long vector
+(define_expand "vunsignedo_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      /* Shift left one word to put odd word in correct location */
+      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+      rtx rtx_val = GEN_INT (4);
+      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
+    }
+  else
+    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
+
+  DONE;
+})
+
 ;; Generate float2 double
 ;; convert two double to float
 (define_expand "float2_v2df"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 799a36586dc..b1620274285 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -22625,6 +22625,16 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
 @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
 @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
 
+@smallexample
+vector signed long long vec_signedo (vector float);
+vector signed long long vec_signede (vector float);
+vector unsigned long long vec_unsignedo (vector float);
+vector unsigned long long vec_unsignede (vector float);
+@end smallexample
+
+The overloaded built-ins @code{vec_signedo} and @code{vec_signede} are
+additional extensions to the built-ins as documented in the PVIPR.
+
 @node PowerPC AltiVec Built-in Functions Available on ISA 2.07
 @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 2.07
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
index 5dcdfbee791..5c2481dd612 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
@@ -81,14 +81,15 @@ void test_unsigned_int_result(int check, vector unsigned int vec_result,
 }
 
 void test_ll_int_result(vector long long int vec_result,
-			vector long long int vec_expected)
+			vector long long int vec_expected,
+			char *string)
 {
 	int i;
 
 	for (i = 0; i < 2; i++)
 		if (vec_result[i] != vec_expected[i]) {
 #ifdef DEBUG
-			printf("Test_ll_int_result: ");
+			printf("Test_ll_int_result %s: ", string);
 			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
 			       i, vec_result[i], i, vec_expected[i]);
 #else
@@ -98,14 +99,15 @@ void test_ll_int_result(vector long long int vec_result,
 }
 
 void test_ll_unsigned_int_result(vector long long unsigned int vec_result,
-				 vector long long unsigned int vec_expected)
+				 vector long long unsigned int vec_expected,
+				 char *string)
 {
 	int i;
 
 	for (i = 0; i < 2; i++)
 		if (vec_result[i] != vec_expected[i]) {
 #ifdef DEBUG
-			printf("Test_ll_unsigned_int_result: ");
+			printf("Test_ll_unsigned_int_result %s: ", string);
 			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
 			       i, vec_result[i], i, vec_expected[i]);
 #else
@@ -292,7 +294,8 @@ int main()
 	vec_dble0 = (vector double){-124.930, 81234.49};
 	vec_ll_int_expected = (vector long long signed int){-124, 81234};
 	vec_ll_int_result = vec_signed (vec_dble0);
-	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected);
+	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
+			    "vec_signed");
 
 	/* Convert double precision vector float to vector int, even words */
 	vec_dble0 = (vector double){-124.930, 81234.49};
@@ -321,12 +324,44 @@ int main()
 	test_unsigned_int_result (ALL, vec_uns_int_result,
 				  vec_uns_int_expected);
 
+	/* Convert single precision vector float, even args, to vector
+	   signed long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_int_expected = (vector signed long long int){14, -3};
+	vec_ll_int_result = vec_signede (vec_flt0);
+	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
+			    "vec_signede");
+
+	/* Convert single precision vector float, odd args, to vector
+	   signed long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_int_expected = (vector signed long long int){834, -5};
+	vec_ll_int_result = vec_signedo (vec_flt0);
+	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
+			    "vec_signedo");
+
+	/* Convert single precision vector float, even args, to vector
+	   unsigned long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_uns_int_expected = (vector unsigned long long int){14, 0};
+	vec_ll_uns_int_result = vec_unsignede (vec_flt0);
+	test_ll_unsigned_int_result (vec_ll_uns_int_result,
+				     vec_ll_uns_int_expected, "vec_unsignede");
+
+	/* Convert single precision vector float, odd args, to vector
+	   unsigned long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_uns_int_expected = (vector unsigned long long int){834, 0};
+	vec_ll_uns_int_result = vec_unsignedo (vec_flt0);
+	test_ll_unsigned_int_result (vec_ll_uns_int_result,
+				     vec_ll_uns_int_expected, "vec_unsignedo");
+
 	/* Convert double precision float to long long unsigned int */
 	vec_dble0 = (vector double){124.930, 8134.49};
 	vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
 	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
 	test_ll_unsigned_int_result (vec_ll_uns_int_result,
-				     vec_ll_uns_int_expected);
+				     vec_ll_uns_int_expected, "vec_unsigned");
 
 	/* Convert double precision float to long long unsigned int. Negative
 	   arguments.  */
@@ -334,7 +369,7 @@ int main()
 	vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
 	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
 	test_ll_unsigned_int_result (vec_ll_uns_int_result,
-				     vec_ll_uns_int_expected);
+				     vec_ll_uns_int_expected, "vec_unsigned");
 
 	/* Convert double precision vector float to vector unsigned int,
 	   even words.  Negative arguments */
-- 
2.45.0



* [PATCH 7/13 ver4] rs6000, add overloaded vec_sel with int128 arguments
  2024-06-13 19:23 [PATCH 0/13 ver4] rs6000, built-in cleanup patch series Carl Love
  2024-06-13 19:40 ` [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins Carl Love
  2024-06-13 19:40 ` [PATCH 4/13 ver4] rs6000, extend the current vec_{un,}signed{e,o} built-ins Carl Love
@ 2024-06-13 19:40 ` Carl Love
  2024-06-19  3:03   ` Kewen.Lin
  2024-06-13 19:40 ` [PATCH 11/13 ver4] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
  2024-06-13 19:40 ` [PATCH 13/13 ver4] rs6000, remove vector set and vector init built-ins Carl Love
  4 siblings, 1 reply; 13+ messages in thread
From: Carl Love @ 2024-06-13 19:40 UTC
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner, Carl Love


GCC maintainers:

The patch has been updated per the comments from version 3.  Please let me know if the patch is acceptable for mainline.

                     Carl 

-----------------------------------------------------------------

rs6000, add overloaded vec_sel with int128 arguments

Extend the vec_sel built-in to take three signed/unsigned/bool int128
arguments and return a signed/unsigned/bool int128 result.

Extending the vec_sel built-in makes the existing built-ins
__builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
patch removes these built-ins.

The patch adds documentation and test cases for the new overloaded
vec_sel built-ins.
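
As a rough illustration of what the new overloads compute, vec_sel is a bitwise select: each result bit comes from the first argument where the mask bit is 0 and from the second argument where it is 1.  The sketch below is a portable model, not code from the patch; model_u128, model_vec_sel and model_sel_self_check are hypothetical names, and a pair of 64-bit halves stands in for the 128-bit vector types.  The sample values are taken from the new runnable test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value: two 64-bit halves.  */
typedef struct
{
  uint64_t hi;
  uint64_t lo;
} model_u128;

/* Bitwise select: take bits of A where C is 0 and bits of B where C is 1.  */
static model_u128
model_vec_sel (model_u128 a, model_u128 b, model_u128 c)
{
  model_u128 r;
  r.hi = (a.hi & ~c.hi) | (b.hi & c.hi);
  r.lo = (a.lo & ~c.lo) | (b.lo & c.lo);
  return r;
}

/* Self-check using the signed-argument inputs from the runnable test.
   The test initialises only the low 64 bits, so the high halves are 0.  */
static void
model_sel_self_check (void)
{
  model_u128 a = { 0, 0x123456789ABCDEF0ULL };
  model_u128 b = { 0, 0xFEDCBA9876543210ULL };
  model_u128 c_bool = { 0, 0x3333333333333333ULL };
  model_u128 c_uns = { 0, 0xBBBBBBBBBBBBBBBBULL };
  model_u128 r;

  r = model_vec_sel (a, b, c_bool);
  assert (r.hi == 0 && r.lo == 0x32147658ba9cfed0ULL);

  r = model_vec_sel (a, b, c_uns);
  assert (r.hi == 0 && r.lo == 0xba9cfed832147650ULL);
}
```

The model treats signed, unsigned and bool operands identically, since the select works on bit patterns; the overloads differ only in their argument and result types.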

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
	__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
	* config/rs6000/rs6000-overload.def (vec_sel): Add new
	overloaded definitions.
	* doc/extend.texi: Add documentation for new vec_sel instances.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/builtins-10-runnable.c: New runnable test
	file.
	* gcc.target/powerpc/builtins-10.c: New compile only test file.
---
 gcc/config/rs6000/rs6000-builtins.def         |   6 -
 gcc/config/rs6000/rs6000-overload.def         |  12 +
 gcc/doc/extend.texi                           |  20 ++
 .../gcc.target/powerpc/builtins-10-runnable.c | 220 ++++++++++++++++++
 .../gcc.target/powerpc/builtins-10.c          |  63 +++++
 5 files changed, 315 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-10.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index b90b3f34167..c969cd0f3f6 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1907,12 +1907,6 @@
   const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
     XXSEL_16QI_UNS vector_select_v16qi_uns {}
 
-  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
-    XXSEL_1TI vector_select_v1ti {}
-
-  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
-    XXSEL_1TI_UNS vector_select_v1ti_uns {}
-
   const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
     XXSEL_2DF vector_select_v2df {}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 4d857bb1af3..6cec1ad4f1a 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3274,6 +3274,18 @@
     VSEL_2DF  VSEL_2DF_B
   vd __builtin_vec_sel (vd, vd, vull);
     VSEL_2DF  VSEL_2DF_U
+  vsq __builtin_vec_sel (vsq, vsq, vbq);
+    VSEL_1TI  VSEL_1TI_B
+  vsq __builtin_vec_sel (vsq, vsq, vuq);
+    VSEL_1TI  VSEL_1TI_U
+  vuq __builtin_vec_sel (vuq, vuq, vbq);
+    VSEL_1TI_UNS  VSEL_1TI_UB
+  vuq __builtin_vec_sel (vuq, vuq, vuq);
+    VSEL_1TI_UNS  VSEL_1TI_UU
+  vbq __builtin_vec_sel (vbq, vbq, vbq);
+    VSEL_1TI_UNS  VSEL_1TI_BB
+  vbq __builtin_vec_sel (vbq, vbq, vuq);
+    VSEL_1TI_UNS  VSEL_1TI_BU
 ; The following variants are deprecated.
   vsll __builtin_vec_sel (vsll, vsll, vsll);
     VSEL_2DI_B  VSEL_2DI_S
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b1620274285..d7d8d149a43 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21420,6 +21420,26 @@ Additional built-in functions are available for the 64-bit PowerPC
 family of processors, for efficient use of 128-bit floating point
 (@code{__float128}) values.
 
+Vector select
+
+@smallexample
+vector signed __int128 vec_sel (vector signed __int128,
+               vector signed __int128, vector bool __int128);
+vector signed __int128 vec_sel (vector signed __int128,
+               vector signed __int128, vector unsigned __int128);
+vector unsigned __int128 vec_sel (vector unsigned __int128,
+               vector unsigned __int128, vector bool __int128);
+vector unsigned __int128 vec_sel (vector unsigned __int128,
+               vector unsigned __int128, vector unsigned __int128);
+vector bool __int128 vec_sel (vector bool __int128,
+               vector bool __int128, vector bool __int128);
+vector bool __int128 vec_sel (vector bool __int128,
+               vector bool __int128, vector unsigned __int128);
+@end smallexample
+
+The instance is an extension of the existing overloaded built-in @code{vec_sel}
+that is documented in the PVIPR.
+
 @node Basic PowerPC Built-in Functions Available on ISA 2.06
 @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.06
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c
new file mode 100644
index 00000000000..b7b4a95ea0e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c
@@ -0,0 +1,220 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-maltivec -O2 " } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+void print_i128 (unsigned __int128 val)
+{
+  printf(" 0x%016llx%016llx",
+         (unsigned long long)(val >> 64),
+         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
+}
+#endif
+
+extern void abort (void);
+
+union convert_union {
+  vector signed __int128    s128;
+  vector unsigned __int128  u128;
+  vector bool __int128  b128;
+  char  val[16];
+} convert;
+
+int check_u128_result(vector unsigned __int128 vresult_u128,
+		      vector unsigned __int128 expected_vresult_u128)
+{
+  /* Use a for loop to check each byte manually so the test case will run
+     with ISA 2.06.
+
+     Return 1 if they match, 0 otherwise.  */
+
+  int i;
+
+  union convert_union result;
+  union convert_union expected;
+
+  result.u128 = vresult_u128;
+  expected.u128 = expected_vresult_u128;
+
+  /* Check if each byte of the result and expected match. */
+  for (i = 0; i < 16; i++)
+    {
+      if (result.val[i] != expected.val[i])
+	return 0;
+    }
+  return 1;
+}
+
+int check_s128_result(vector signed __int128 vresult_s128,
+		      vector signed __int128 expected_vresult_s128)
+{
+  /* Convert the arguments to unsigned, then check equality.  */
+  union convert_union result;
+  union convert_union expected;
+
+  result.s128 = vresult_s128;
+  expected.s128 = expected_vresult_s128;
+
+  return check_u128_result (result.u128, expected.u128);
+}
+
+int check_b128_result(vector bool __int128 vresult_b128,
+		      vector bool __int128 expected_vresult_b128)
+{
+  /* Convert the arguments to unsigned, then check equality.  */
+  union convert_union result;
+  union convert_union expected;
+
+  result.b128 = vresult_b128;
+  expected.b128 = expected_vresult_b128;
+
+  return check_u128_result (result.u128, expected.u128);
+}
+
+
+int
+main (int argc, char *argv [])
+{
+  int i;
+  
+  vector signed __int128 src_va_s128;
+  vector signed __int128 src_vb_s128;
+  vector signed __int128 src_vc_s128;
+  vector signed __int128 vresult_s128;
+  vector signed __int128 expected_vresult_s128;
+
+  vector unsigned __int128 src_va_u128;
+  vector unsigned __int128 src_vb_u128;
+  vector unsigned __int128 src_vc_u128;
+  vector unsigned __int128 vresult_u128;
+  vector unsigned __int128 expected_vresult_u128;
+
+  vector bool __int128 src_va_b128;
+  vector bool __int128 src_vb_b128;
+  vector bool __int128 src_vc_b128;
+  vector bool __int128 vresult_b128;
+  vector bool __int128 expected_vresult_b128;
+
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vc_b128 = (vector bool   __int128) {0x3333333333333333};
+  src_vc_u128 = (vector unsigned __int128) {0xBBBBBBBBBBBBBBBB};
+
+  /* Signed arguments.  */
+  expected_vresult_s128 = (vector signed __int128) {0x32147658ba9cfed0};
+  vresult_s128 = vec_sel (src_va_s128, src_vb_s128, src_vc_b128);
+
+  if (!check_s128_result (vresult_s128, expected_vresult_s128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_s128, src_vb_s128, src_vc_b128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_s128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_s128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  expected_vresult_s128 = (vector signed __int128) {0xba9cfed832147650};
+  vresult_s128 = vec_sel (src_va_s128, src_vb_s128, src_vc_u128);
+
+  if (!check_s128_result (vresult_s128, expected_vresult_s128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_s128, src_vb_s128, src_vc_u128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_s128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_s128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
+  src_va_b128 = (vector bool __int128) {0xFFFFFFFF00000000};
+  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
+  src_vb_b128 = (vector bool __int128) {0xFFFF0000FFFF0000};
+  src_vc_u128 = (vector unsigned __int128) {0x5555555555555555};
+
+  /* Unsigned arguments.  */
+  expected_vresult_u128 = (vector unsigned __int128) {0x2147a9cf2147badc};
+  vresult_u128 = vec_sel (src_va_u128, src_vb_u128, src_vc_b128);
+
+  if (!check_u128_result (vresult_u128, expected_vresult_u128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_u128, src_vb_u128, src_vc_b128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_u128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_u128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  expected_vresult_u128 = (vector unsigned __int128) {0x307cfcf47439a9a};
+  vresult_u128 = vec_sel (src_va_u128, src_vb_u128, src_vc_u128);
+
+  if (!check_u128_result (vresult_u128, expected_vresult_u128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_u128, src_vb_u128, src_vc_u128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_u128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_u128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  /* Boolean arguments.  */
+  expected_vresult_b128 = (vector bool __int128) {0xffffcccc33330000};
+  vresult_b128 = vec_sel (src_va_b128, src_vb_b128, src_vc_b128);
+
+  if (!check_b128_result (vresult_b128, expected_vresult_b128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_b128, src_vb_b128, src_vc_b128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_b128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_b128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  expected_vresult_b128 = (vector bool __int128) {0xffffaaaa55550000};
+  vresult_b128 = vec_sel (src_va_b128, src_vb_b128, src_vc_u128);
+
+  if (!check_b128_result (vresult_b128, expected_vresult_b128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_b128, src_vb_b128, src_vc_u128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_b128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_b128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+    return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-10.c b/gcc/testsuite/gcc.target/powerpc/builtins-10.c
new file mode 100644
index 00000000000..eddc4c93b32
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-10.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-save-temps" } */
+/* { dg-final { scan-assembler-times "xxsel" 6 } } */
+
+#include <altivec.h>
+
+/* Signed args */
+vector signed __int128
+test_vec_sel_ssb (vector signed __int128 src_va_s128,
+		  vector signed __int128 src_vb_s128,
+		  vector bool __int128 src_vc_b128)
+{
+  return vec_sel (src_va_s128, src_vb_s128, src_vc_b128);
+}
+
+vector signed __int128
+test_vec_sel_ssu (vector signed __int128 src_va_s128,
+		  vector signed __int128 src_vb_s128,
+		  vector unsigned __int128 src_vc_u128)
+{
+  return vec_sel (src_va_s128, src_vb_s128, src_vc_u128);
+}
+
+/* Unsigned args */
+vector unsigned __int128
+test_vec_sel_uub (vector unsigned __int128 src_va_u128,
+		  vector unsigned __int128 src_vb_u128,
+		  vector bool __int128 src_vc_b128)
+{
+  return vec_sel (src_va_u128, src_vb_u128, src_vc_b128);
+}
+
+vector unsigned __int128
+test_vec_sel_uuu (vector unsigned __int128 src_va_u128,
+		  vector unsigned __int128 src_vb_u128,
+		  vector unsigned __int128 src_vc_u128)
+{
+  return vec_sel (src_va_u128, src_vb_u128, src_vc_u128);
+}
+
+/* Boolean args */
+vector bool __int128
+test_vec_sel_bbb (vector bool __int128 src_va_b128,
+		  vector bool __int128 src_vb_b128,
+		  vector bool __int128 src_vc_b128)
+{
+  return vec_sel (src_va_b128, src_vb_b128, src_vc_b128);
+}
+
+vector bool __int128
+test_vec_sel_bbu (vector bool __int128 src_va_b128,
+		  vector bool __int128 src_vb_b128,
+		  vector unsigned __int128 src_vc_u128)
+{
+  return vec_sel (src_va_b128, src_vb_b128, src_vc_u128);
+}
+
+/* Expected results:
+   vec_sel              xxsel    */
+
+/* { dg-final { scan-assembler-times "xxsel" 6 } } */
+
-- 
2.45.0



* [PATCH 11/13 ver4] rs6000, extend vec_xxpermdi built-in for __int128 args
  2024-06-13 19:23 [PATCH 0/13 ver4] rs6000, built-in cleanup patch series Carl Love
                   ` (2 preceding siblings ...)
  2024-06-13 19:40 ` [PATCH 7/13 ver4] rs6000, add overloaded vec_sel with int128 arguments Carl Love
@ 2024-06-13 19:40 ` Carl Love
  2024-06-19  3:03   ` Kewen.Lin
  2024-06-13 19:40 ` [PATCH 13/13 ver4] rs6000, remove vector set and vector init built-ins Carl Love
  4 siblings, 1 reply; 13+ messages in thread
From: Carl Love @ 2024-06-13 19:40 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner, Carl Love


GCC maintainers:

The patch has been updated per the comments from version 3.  Please let me know if the patch is acceptable for mainline.

Thanks.

                             Carl 

---------------------------------------------------------

rs6000, extend vec_xxpermdi built-in for __int128 args

Add new signed and unsigned overloaded instances for vec_xxpermdi:

   __int128 vec_xxpermdi (__int128, __int128, const int);
   __uint128 vec_xxpermdi (__uint128, __uint128, const int);

Update the documentation to include a reference to the new built-in
instances.

Add test cases for the new overloaded instances.
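
As an illustration only, the doubleword selection that the new test cases check can be sketched as a scalar model in portable C.  This is not GCC's implementation: the names dw_pair, model_xxpermdi, make_pair, high_half, low_half and check_model_against_test_values are hypothetical, and the assumption that bit 1 of the selector picks the element taken from the first argument while bit 0 picks the element taken from the second is inferred from the expected values in vec_perm-runnable-i128.c.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit doublewords in element order.  Assumption of this model:
   elem[0] is the high half on BE and the low half on LE.  */
typedef struct { uint64_t elem[2]; } dw_pair;

/* Hypothetical scalar model: bit 1 of sel picks the element taken from
   a, bit 0 of sel picks the element taken from b.  */
static dw_pair
model_xxpermdi (dw_pair a, dw_pair b, int sel)
{
  dw_pair r;
  r.elem[0] = a.elem[(sel >> 1) & 1];
  r.elem[1] = b.elem[sel & 1];
  return r;
}

/* Build a pair from the high and low halves of a 128-bit value.  */
static dw_pair
make_pair (uint64_t high, uint64_t low, int big_endian)
{
  dw_pair p;
  p.elem[0] = big_endian ? high : low;
  p.elem[1] = big_endian ? low : high;
  return p;
}

/* Return the high half of the 128-bit value held in p.  */
static uint64_t
high_half (dw_pair p, int big_endian)
{
  return big_endian ? p.elem[0] : p.elem[1];
}

/* Return the low half of the 128-bit value held in p.  */
static uint64_t
low_half (dw_pair p, int big_endian)
{
  return big_endian ? p.elem[1] : p.elem[0];
}

/* Compare the model with the signed expected values from the test.  */
void
check_model_against_test_values (void)
{
  const uint64_t a_hi = 0x123456789ABCDEF0ULL, a_lo = 0x22446688AACCEE00ULL;
  const uint64_t b_hi = 0xFEDCBA9876543210ULL, b_lo = 0x3333333333333333ULL;
  dw_pair r;

  /* BE, selector 0x1 and 0x2.  */
  r = model_xxpermdi (make_pair (a_hi, a_lo, 1), make_pair (b_hi, b_lo, 1), 1);
  assert (high_half (r, 1) == a_hi && low_half (r, 1) == b_lo);
  r = model_xxpermdi (make_pair (a_hi, a_lo, 1), make_pair (b_hi, b_lo, 1), 2);
  assert (high_half (r, 1) == a_lo && low_half (r, 1) == b_hi);

  /* LE, selector 0x1 and 0x2.  */
  r = model_xxpermdi (make_pair (a_hi, a_lo, 0), make_pair (b_hi, b_lo, 0), 1);
  assert (high_half (r, 0) == b_hi && low_half (r, 0) == a_lo);
  r = model_xxpermdi (make_pair (a_hi, a_lo, 0), make_pair (b_hi, b_lo, 0), 2);
  assert (high_half (r, 0) == b_lo && low_half (r, 0) == a_hi);
}
```

The model works on element order rather than on register layout, which is why the same selector yields different high/low halves on BE and LE and the test needs separate expected values for each.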

gcc/ChangeLog:
	* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
	overloaded built-in instances.
	* doc/extend.texi:  Add documentation for new overloaded built-in
	instances.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.
---
 gcc/config/rs6000/rs6000-overload.def         |   4 +
 gcc/doc/extend.texi                           |   4 +
 .../powerpc/vec_perm-runnable-i128.c          | 229 ++++++++++++++++++
 3 files changed, 237 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c

diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 6cec1ad4f1a..354f8fabe0f 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4936,6 +4936,10 @@
     XXPERMDI_2DI  XXPERMDI_VSLL
   vull __builtin_vsx_xxpermdi (vull, vull, const int);
     XXPERMDI_2DI  XXPERMDI_VULL
+  vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
+    XXPERMDI_1TI  XXPERMDI_1SQ
+  vuq __builtin_vsx_xxpermdi (vuq, vuq, const int);
+    XXPERMDI_1TI  XXPERMDI_1UQ
   vf __builtin_vsx_xxpermdi (vf, vf, const int);
     XXPERMDI_4SF  XXPERMDI_VF
   vd __builtin_vsx_xxpermdi (vd, vd, const int);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d7d8d149a43..9e45976436b 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -22610,6 +22610,10 @@ void vec_vsx_st (vector bool char, int, signed char *);
 
 vector double vec_xxpermdi (vector double, vector double, const int);
 vector float vec_xxpermdi (vector float, vector float, const int);
+vector __int128 vec_xxpermdi (vector signed __int128,
+                              vector signed __int128, const int);
+vector unsigned __int128 vec_xxpermdi (vector unsigned __int128,
+                                       vector unsigned __int128, const int);
 vector long long vec_xxpermdi (vector long long, vector long long, const int);
 vector unsigned long long vec_xxpermdi (vector unsigned long long,
                                         vector unsigned long long, const int);
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
new file mode 100644
index 00000000000..0e0d77bcb84
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
@@ -0,0 +1,229 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-maltivec -O2 " } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+void print_i128 (unsigned __int128 val)
+{
+  printf(" 0x%016llx%016llx",
+         (unsigned long long)(val >> 64),
+         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
+}
+#endif
+
+extern void abort (void);
+
+union convert_union {
+  vector signed __int128    s128;
+  vector unsigned __int128  u128;
+  char  val[16];
+} convert;
+
+int check_u128_result(vector unsigned __int128 vresult_u128,
+		      vector unsigned __int128 expected_vresult_u128)
+{
+  /* Use a for loop to check each byte manually so the test case will
+     run with ISA 2.06.
+
+     Return 1 if they match, 0 otherwise.  */
+
+  int i;
+
+  union convert_union result;
+  union convert_union expected;
+
+  result.u128 = vresult_u128;
+  expected.u128 = expected_vresult_u128;
+
+  /* Check if each byte of the result and expected match. */
+  for (i = 0; i < 16; i++)
+    {
+      if (result.val[i] != expected.val[i])
+	return 0;
+    }
+  return 1;
+}
+
+int check_s128_result(vector signed __int128 vresult_s128,
+		      vector signed __int128 expected_vresult_s128)
+{
+  /* Convert the arguments to unsigned, then check equality.  */
+  union convert_union result;
+  union convert_union expected;
+
+  result.s128 = vresult_s128;
+  expected.s128 = expected_vresult_s128;
+
+  return check_u128_result (result.u128, expected.u128);
+}
+
+
+int
+main (int argc, char *argv [])
+{
+  int i;
+  
+  vector signed __int128 src_va_s128;
+  vector signed __int128 src_vb_s128;
+  vector signed __int128 vresult_s128;
+  vector signed __int128 expected_vresult_s128;
+
+  vector unsigned __int128 src_va_u128;
+  vector unsigned __int128 src_vb_u128;
+  vector unsigned __int128 src_vc_u128;
+  vector unsigned __int128 vresult_u128;
+  vector unsigned __int128 expected_vresult_u128;
+
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = src_va_s128 << 64; 
+  src_va_s128 |= (vector signed __int128) {0x22446688AACCEE00};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vb_s128 = src_vb_s128 << 64;
+  src_vb_s128 |= (vector signed __int128) {0x3333333333333333};
+
+  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
+  src_va_u128 = src_va_u128 << 64;
+  src_va_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
+  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
+  src_vb_u128 = src_vb_u128 << 64;
+  src_vb_u128 |= (vector unsigned __int128) {0x5555555555555555};
+
+
+  /* Signed 128-bit arguments.  */
+  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x1);
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0x3333333333333333};
+#else
+  /* LE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0x22446688AACCEE00};
+#endif
+
+  if (!check_s128_result (vresult_s128, expected_vresult_s128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x1) result does not match expected output.\n");
+      printf ("  src_va_s128:     ");
+      print_i128 ((unsigned __int128) src_va_s128);
+      printf ("\n  src_vb_s128:     ");
+      print_i128 ((unsigned __int128) src_vb_s128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_s128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_s128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x2);
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0x22446688AACCEE00};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0xFEDCBA9876543210};
+#else
+  /* LE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0x3333333333333333};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0x123456789ABCDEF0};
+#endif
+
+  if (!check_s128_result (vresult_s128, expected_vresult_s128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x2) result does not match expected output.\n");
+      printf ("  src_va_s128:     ");
+      print_i128 ((unsigned __int128) src_va_s128);
+      printf ("\n  src_vb_s128:     ");
+      print_i128 ((unsigned __int128) src_vb_s128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_s128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_s128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  /* Unsigned 128-bit arguments.  */
+  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x1);
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0x5555555555555555};
+#else
+  /* LE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
+#endif
+
+  if (!check_u128_result (vresult_u128, expected_vresult_u128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x1) result does not match expected output.\n");
+      printf ("  src_va_u128:     ");
+      print_i128 ((unsigned __int128) src_va_u128);
+      printf ("\n  src_vb_u128:     ");
+      print_i128 ((unsigned __int128) src_vb_u128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_u128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_u128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  /* Unsigned 128-bit arguments.  */
+  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x2);
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0x1133557799BBDD00};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0xA987654FEDCB3210};
+#else
+  /* LE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0x5555555555555555};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0x13579ACE02468BDF};
+#endif
+  
+  if (!check_u128_result (vresult_u128, expected_vresult_u128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x2) result does not match expected output.\n");
+      printf ("  src_va_u128:     ");
+      print_i128 ((unsigned __int128) src_va_u128);
+      printf ("\n  src_vb_u128:     ");
+      print_i128 ((unsigned __int128) src_vb_u128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_u128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_u128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+    return 0;
+}
-- 
2.45.0



* Re: [PATCH 13/13 ver4] rs6000, remove vector set and vector init built-ins
  2024-06-13 19:23 [PATCH 0/13 ver4] rs6000, built-in cleanup patch series Carl Love
                   ` (3 preceding siblings ...)
  2024-06-13 19:40 ` [PATCH 11/13 ver4] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
@ 2024-06-13 19:40 ` Carl Love
  2024-06-19  3:04   ` Kewen.Lin
  4 siblings, 1 reply; 13+ messages in thread
From: Carl Love @ 2024-06-13 19:40 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner, Carl Love

GCC maintainers:

The patch has been updated per the feedback from version 3.  Please let me know if the patch is acceptable for mainline.

Thanks.

                      Carl 

----------------------------------------------------------------------------------

rs6000, remove vector set and vector init built-ins

The vector init built-ins:

  __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
  __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
  __builtin_vec_init_v2di, __builtin_vec_init_v2df,
  __builtin_vec_init_v1ti

perform the same operation as initializing the vector in C code.  For
example:

  result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
  result_v4si = {1, 2, 3, 4};

These two constructs were tested and verified to generate identical
assembly instructions, both with no optimization and with -O3.

The vector set built-ins:

  __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
  __builtin_vec_set_v4si, __builtin_vec_set_v4sf,
  __builtin_vec_set_v1ti, __builtin_vec_set_v2di,
  __builtin_vec_set_v2df

perform the same operation as setting a specific element in the vector in
C code.  For example:

  src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
  src_v4si[index] = int_val;

With no optimization the built-in actually generates more instructions
than the inline C code; with -O3 the generated code is identical.

All of the above built-ins that are removed do not have test cases and
are not documented.

Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
__builtin_vec_set_v2df are not removed as they are used in function
resolve_vec_insert() in file rs6000-c.cc.

The built-ins are removed as they don't provide any benefit over just
using C code.
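
For illustration, the plain C constructs referred to above can be sketched as follows.  To keep the sketch buildable on any GCC target, it assumes a generic vector_size type instead of the AltiVec "vector signed int" type; the typedef v4si and the helpers init_v4si, set_v4si and check_init_and_set are hypothetical names, not part of the patch.

```c
#include <assert.h>

/* Generic GCC vector of four ints; stands in for "vector signed int".  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Plain C replacement for the removed init built-in: a brace
   initializer fills the elements in order.  */
v4si
init_v4si (int a, int b, int c, int d)
{
  v4si result = {a, b, c, d};
  return result;
}

/* Plain C replacement for the removed set built-in: subscript the
   vector and assign.  The index must be in the range 0..3.  */
v4si
set_v4si (v4si src, int val, int index)
{
  src[index] = val;
  return src;
}

/* Small self-check of the two helpers.  */
void
check_init_and_set (void)
{
  v4si v = init_v4si (1, 2, 3, 4);
  assert (v[0] == 1 && v[1] == 2 && v[2] == 3 && v[3] == 4);

  v = set_v4si (v, 9, 2);
  assert (v[0] == 1 && v[1] == 2 && v[2] == 9 && v[3] == 4);
}
```

Whether a given target produces the same instructions for these constructs as for the built-ins depends on the optimization level, as described above; the sketch only shows the source-level equivalence.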

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
	__builtin_vec_init_v4sf, __builtin_vec_init_v4si,
	__builtin_vec_init_v8hi, __builtin_vec_init_v1ti,
	__builtin_vec_init_v2df, __builtin_vec_init_v2di,
	__builtin_vec_set_v16qi, __builtin_vec_set_v4sf,
	__builtin_vec_set_v4si, __builtin_vec_set_v8hi): Remove
	built-in definitions.
---
 gcc/config/rs6000/rs6000-builtins.def | 44 +++------------------------
 1 file changed, 4 insertions(+), 40 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 02aa04e5698..053dc0115d2 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1118,37 +1118,6 @@
   const signed short __builtin_vec_ext_v8hi (vss, signed int);
     VEC_EXT_V8HI nothing {extract}
 
-  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
-            signed char, signed char, signed char, signed char, signed char, \
-            signed char, signed char, signed char, signed char, signed char, \
-            signed char, signed char, signed char);
-    VEC_INIT_V16QI nothing {init}
-
-  const vf __builtin_vec_init_v4sf (float, float, float, float);
-    VEC_INIT_V4SF nothing {init}
-
-  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, \
-                                     signed int);
-    VEC_INIT_V4SI nothing {init}
-
-  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short,\
-             signed short, signed short, signed short, signed short, \
-             signed short);
-    VEC_INIT_V8HI nothing {init}
-
-  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
-    VEC_SET_V16QI nothing {set}
-
-  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
-    VEC_SET_V4SF nothing {set}
-
-  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
-    VEC_SET_V4SI nothing {set}
-
-  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
-    VEC_SET_V8HI nothing {set}
-
-
 ; Cell builtins.
 [cell]
   pure vsc __builtin_altivec_lvlx (signed long, const void *);
@@ -1295,15 +1264,10 @@
   const signed long long __builtin_vec_ext_v2di (vsll, signed int);
     VEC_EXT_V2DI nothing {extract}
 
-  const vsq __builtin_vec_init_v1ti (signed __int128);
-    VEC_INIT_V1TI nothing {init}
-
-  const vd __builtin_vec_init_v2df (double, double);
-    VEC_INIT_V2DF nothing {init}
-
-  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
-    VEC_INIT_V2DI nothing {init}
-
+;; VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI are used in
+;; resolve_vec_insert(), rs6000-c.cc
+;; TODO: Remove VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI once the uses
+;; in resolve_vec_insert are replaced by the equivalent gimple statements.
   const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
     VEC_SET_V1TI nothing {set}
 
-- 
2.45.0



* Re: [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws,, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins.
  2024-06-13 19:40 ` [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws,, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins Carl Love
@ 2024-06-19  3:03   ` Kewen.Lin
  2024-06-24 22:15     ` Carl Love
  0 siblings, 1 reply; 13+ messages in thread
From: Kewen.Lin @ 2024-06-19  3:03 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi Carl,

on 2024/6/14 03:40, Carl Love wrote:
> GCC maintainers:
> 
> Per the comments on patch 0004 from version 3, the removal of 
> The built-in __builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws was moved to this patch.  The rest of the patch is unchanged from version 3.  There were no comments on this patch for version 3.
> 
> Please let me know if this patch is acceptable.  Thanks.
> 
>                             Carl 
> 
> 
> -----------------------------------------------------
> 
> rs6000, Remove __builtin_vsx_xvcvspsxws,
>  __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins.

Nit: Maybe make it shorter like: Remove built-ins __builtin_vsx_xvcv{sp{s,u}xws,dpuxds_uns}

> 
> The built-in __builtin_vsx_xvcvspsxws is a duplicate of the vec_signed

Nit: Strictly speaking, not a duplicate of vec_signed but covered by it.

> built-in that is documented in the PVIPR.  The __builtin_vsx_xvcvspsxws
> built-in is not documented and there are no test cases for it.
> 
> The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
> vec_unsigned, remove.
> 
> The __builtin_vsx_xvcvspuxws is redundant as it is covered by
> vec_unsigned, remove.

As mentioned in the previous review, I'd expect patch 4/13 to focus only on
extending vec_{un,}signed{e,o} for vector float (aka. __builtin_vsx_xvcvspsxds
and __builtin_vsx_xvcvspuxds related), and this patch to focus on removing
built-ins that are already covered by the existing vec_{un,}signed{,e,o}, so
it can also drop the built-ins:

"The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
vec_signed{e,o}, remove.

The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
vec_unsigned{e,o}, remove."

// copied from 4/13.

BR,
Kewen

> 
> This patch removes the redundant built-in.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxws,
> 	__builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws):
> 	Remove built-in definitions.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 7c36976a089..8cf0b715898 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1697,9 +1697,6 @@
>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>  
> -  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
> -    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
> -
>    const vsi __builtin_vsx_xvcvdpuxws (vd);
>      XVCVDPUXWS vsx_xvcvdpuxws {}
>  
> @@ -1709,15 +1706,9 @@
>    const vsll __builtin_vsx_xvcvspsxds (vf);
>      XVCVSPSXDS vsx_xvcvspsxds {}
>  
> -  const vsi __builtin_vsx_xvcvspsxws (vf);
> -    XVCVSPSXWS vsx_fix_truncv4sfv4si2 {}
> -
>    const vsll __builtin_vsx_xvcvspuxds (vf);
>      XVCVSPUXDS vsx_xvcvspuxds {}
>  
> -  const vsi __builtin_vsx_xvcvspuxws (vf);
> -    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
> -
>    const vd __builtin_vsx_xvcvsxddp (vsll);
>      XVCVSXDDP vsx_floatv2div2df2 {}
>  



* Re: [PATCH 4/13 ver4] rs6000, extend the current vec_{un,}signed{e,o}, built-ins
  2024-06-13 19:40 ` [PATCH 4/13 ver4] rs6000, extend the current vec_{un,}signed{e,o}, built-ins Carl Love
@ 2024-06-19  3:03   ` Kewen.Lin
  2024-06-24 22:14     ` Carl Love
  0 siblings, 1 reply; 13+ messages in thread
From: Kewen.Lin @ 2024-06-19  3:03 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi Carl,

on 2024/6/14 03:40, Carl Love wrote:
> 
> GCC maintainers:
> 
> As noted the removal of __builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws was moved to patch 2 in the seris.  The patch has been updated per the comments from version 3.
> 
> Please let me know if this patch is acceptable for mainline.  
> 
>                              Carl 
> 
> ------------------------------------------------------------------
> 
> rs6000, extend the current vec_{un,}signed{e,o} built-ins
> 
> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
> convert a vector of floats to signed/unsigned long long ints.  Extend the

Nit: s/signed/a vector of signed/

> existing vec_{un,}signed{e,o} built-ins to handle the argument
> vector of floats to return the even/odd signed/unsigned integers.
> 

Likewise.

> The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
> vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
> built-ins.
> 
> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
> now for internal use only. They are not documented and they do not
> have testcases.
> 


> The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
> vec_signed{e,o}, remove.
> 
> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
> vec_unsigned{e,o}, remove.

As noted in the comments on 2/13 v4 and in the previous review, I'd prefer
that these two be moved to 2/13 as well (this patch should focus on extending).

> 
> Add testcases and update documentation.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def: __builtin_vsx_xvcvdpsxws,
> 	__builtin_vsx_xvcvdpuxws): Removed.
> 	(__builtin_vsx_xvcvspsxds, __builtin_vsx_xvcvspuxds): Renamed

Nit: s/Renamed/Rename to/

> 	__builtin_vsignede_v4sf, __builtin_vunsignede_v4sf respectively.
> 	(XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
> 	VEC_VUNSIGNEDE_V4SF respectively.

Likewise.

> 	(__builtin_vsignedo_v4sf, __builtin_vunsignedo_v4sf): New
> 	built-in definitions.
> 	* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
> 	vec_unsignede,vec_unsignedo):  Add new overloaded specifications.

Formatting nits: "..,.." -> ".., ..", "  " -> " "

> 	* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
> 	vunsignede_v4sf, vunsignedo_v4sf): New	define_expands.

Likewise.

> 	* doc/extend.texi (vec_signedo, vec_signede): Add documentation
> 	for new overloaded built-ins.

Missing vec_unsignedo and vec_unsignede; maybe also mention for which
types, like "converting vector float to vector {un,}signed long long".

> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/builtins-3-runnable.c
> 	(test_unsigned_int_result, test_ll_unsigned_int_result): Add
> 	new argument.
> 	(vec_signede, vec_signedo, vec_unsignede, vec_unsignedo): New
> 	tests for the overloaded built-ins.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         | 20 ++---
>  gcc/config/rs6000/rs6000-overload.def         |  8 ++
>  gcc/config/rs6000/vsx.md                      | 84 +++++++++++++++++++
>  gcc/doc/extend.texi                           | 10 +++
>  .../gcc.target/powerpc/builtins-3-runnable.c  | 49 +++++++++--
>  5 files changed, 154 insertions(+), 17 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 322d27b7a0d..29a9deb3410 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1688,26 +1688,26 @@
>    const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
>      XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
>  
> -  const vsi __builtin_vsx_xvcvdpsxws (vd);
> -    XVCVDPSXWS vsx_xvcvdpsxws {}
> -
>    const vsll __builtin_vsx_xvcvdpuxds (vd);
>      XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
>  
>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>  
> -  const vsi __builtin_vsx_xvcvdpuxws (vd);
> -    XVCVDPUXWS vsx_xvcvdpuxws {}
> -
>    const vd __builtin_vsx_xvcvspdp (vf);
>      XVCVSPDP vsx_xvcvspdp {}
>  
> -  const vsll __builtin_vsx_xvcvspsxds (vf);
> -    XVCVSPSXDS vsx_xvcvspsxds {}
> +  const vsll __builtin_vsignede_v4sf (vf);
> +    VEC_VSIGNEDE_V4SF vsignede_v4sf {}
> +
> +  const vsll __builtin_vsignedo_v4sf (vf);
> +    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
> +
> +  const vull __builtin_vunsignede_v4sf (vf);
> +    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
>  
> -  const vsll __builtin_vsx_xvcvspuxds (vf);
> -    XVCVSPUXDS vsx_xvcvspuxds {}
> +  const vull __builtin_vunsignedo_v4sf (vf);
> +    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
>  
>    const vd __builtin_vsx_xvcvsxddp (vsll);
>      XVCVSXDDP vsx_floatv2div2df2 {}
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index 84bd9ae6554..4d857bb1af3 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -3307,10 +3307,14 @@
>  [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
>    vsi __builtin_vec_vsignede (vd);
>      VEC_VSIGNEDE_V2DF
> +  vsll __builtin_vec_vsignede (vf);
> +    VEC_VSIGNEDE_V4SF
>  
>  [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
>    vsi __builtin_vec_vsignedo (vd);
>      VEC_VSIGNEDO_V2DF
> +  vsll __builtin_vec_vsignedo (vf);
> +    VEC_VSIGNEDO_V4SF
>  
>  [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
>    vsi __builtin_vec_signexti (vsc);
> @@ -4433,10 +4437,14 @@
>  [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
>    vui __builtin_vec_vunsignede (vd);
>      VEC_VUNSIGNEDE_V2DF
> +  vull __builtin_vec_vunsignede (vf);
> +    VEC_VUNSIGNEDE_V4SF
>  
>  [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
>    vui __builtin_vec_vunsignedo (vd);
>      VEC_VUNSIGNEDO_V2DF
> +  vull __builtin_vec_vunsignedo (vf);
> +    VEC_VUNSIGNEDO_V4SF
>  
>  [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
>    vui __builtin_vec_extract_exp (vf);
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index f135fa079bd..0187bdbf90e 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -2704,6 +2704,90 @@
>    DONE;
>  })
>  
> +;; Convert float vector even elements to signed long long vector
> +(define_expand "vsignede_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));
> +  else
> +    {
> +      /* Shift left one word to put even word in correct location */

Nit: s/location/location. /
(two spaces after period).

> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
> +    }
> +
> +  DONE;
> +})
> +
> +;; Convert float vector odd elements to signed long long vector
> +(define_expand "vsignedo_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      /* Shift left one word to put even word in correct location */

Likewise.

> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +      emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
> +    }
> +  else
> +    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
> +
> +  DONE;
> +})
> +
> +;; Convert even vector elements to unsigned long long vector

Nit: This comment misses "float" (it doesn't align with the one for vsignede_v4sf above).

> +(define_expand "vunsignede_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
> +  else
> +    {
> +      /* Shift left one word to put even word in correct location */

Likewise.

> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
> +    }
> +
> +  DONE;
> +})
> +
> +;; Convert odd vector elements to unsigned long long vector

Likewise.


> +(define_expand "vunsignedo_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      /* Shift left one word to put even word in correct location */

Likewise.

> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
> +    }
> +  else
> +    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
> +
> +  DONE;
> +})
> +
>  ;; Generate float2 double
>  ;; convert two double to float
>  (define_expand "float2_v2df"
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 799a36586dc..b1620274285 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -22625,6 +22625,16 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
>  @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>  
> +@smallexample
> +vector signed long long vec_signedo (vector float);
> +vector signed long long vec_signede (vector float);
> +vector unsigned signed long long vec_signedo (vector float);
> +vector unsigned signed long long vec_signede (vector float);

The last two lines should be vec_**un**signed{o,e} and the unexpected "signed" should be removed.

The others look good, thanks!

BR,
Kewen



* Re: [PATCH 7/13 ver4] rs6000, add overloaded vec_sel with int128 arguments
  2024-06-13 19:40 ` [PATCH 7/13 ver4] rs6000, add overloaded vec_sel with int128 arguments Carl Love
@ 2024-06-19  3:03   ` Kewen.Lin
  0 siblings, 0 replies; 13+ messages in thread
From: Kewen.Lin @ 2024-06-19  3:03 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi Carl,

on 2024/6/14 03:40, Carl Love wrote:
> 
> GCC maintainers:
> 
> The patch has been updated per the comments from version 3.  Please let me know if the patch is acceptable for mainline.
> 
>                      Carl 
> 
> -----------------------------------------------------------------
> 
> rs6000, add overloaded vec_sel with int128 arguments
> 
> Extend the vec_sel built-in to take three signed/unsigned/bool int128
> arguments and return a signed/unsigned/bool int128 result.
> 
> Extending the vec_sel built-in makes the existing built-ins
> __builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
> patch removes these built-ins.
> 
> The patch adds documentation and test cases for the new overloaded
> vec_sel built-ins.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
> 	__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
> 	* config/rs6000/rs6000-overload.def (vec_sel): Add new
> 	overloaded	definitions.

Nit: there is an unexpected tab between "overloaded" and "definitions"; it
should be a space.  It would also be better to mention which types of
overloaded function are added, like "for vector signed, unsigned and bool
int128 types."

> 	* doc/extend.texi: Add documentation for new vec_sel instances.

Likewise.

> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/builtins-10-runnable.c: New runnable test
> 	file.
> 	* gcc.target/powerpc/builtins-10.c: New compile only test file.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   6 -
>  gcc/config/rs6000/rs6000-overload.def         |  12 +
>  gcc/doc/extend.texi                           |  20 ++
>  .../gcc.target/powerpc/builtins-10-runnable.c | 220 ++++++++++++++++++
>  .../gcc.target/powerpc/builtins-10.c          |  63 +++++
>  5 files changed, 315 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-10.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index b90b3f34167..c969cd0f3f6 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1907,12 +1907,6 @@
>    const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
>      XXSEL_16QI_UNS vector_select_v16qi_uns {}
>  
> -  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
> -    XXSEL_1TI vector_select_v1ti {}
> -
> -  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
> -    XXSEL_1TI_UNS vector_select_v1ti_uns {}
> -
>    const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
>      XXSEL_2DF vector_select_v2df {}
>  
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index 4d857bb1af3..6cec1ad4f1a 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -3274,6 +3274,18 @@
>      VSEL_2DF  VSEL_2DF_B
>    vd __builtin_vec_sel (vd, vd, vull);
>      VSEL_2DF  VSEL_2DF_U
> +  vsq __builtin_vec_sel (vsq, vsq, vbq);
> +    VSEL_1TI  VSEL_1TI_B
> +  vsq __builtin_vec_sel (vsq, vsq, vuq);
> +    VSEL_1TI  VSEL_1TI_U
> +  vuq __builtin_vec_sel (vuq, vuq, vbq);
> +    VSEL_1TI_UNS  VSEL_1TI_UB
> +  vuq __builtin_vec_sel (vuq, vuq, vuq);
> +    VSEL_1TI_UNS  VSEL_1TI_UU
> +  vbq __builtin_vec_sel (vbq, vbq, vbq);
> +    VSEL_1TI_UNS  VSEL_1TI_BB
> +  vbq __builtin_vec_sel (vbq, vbq, vuq);
> +    VSEL_1TI_UNS  VSEL_1TI_BU

Nit: Put these new lines after the line "VSEL_2DI_UNS  VSEL_2DI_BU"
and before "vf __builtin_vec_sel (vf, vf, vbi);", so that all
integral element types are placed together.

>  ; The following variants are deprecated.
>    vsll __builtin_vec_sel (vsll, vsll, vsll);
>      VSEL_2DI_B  VSEL_2DI_S
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index b1620274285..d7d8d149a43 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -21420,6 +21420,26 @@ Additional built-in functions are available for the 64-bit PowerPC
>  family of processors, for efficient use of 128-bit floating point
>  (@code{__float128}) values.
>  
> +Vector select
> +
> +@smallexample
> +vector signed __int128 vec_sel (vector signed __int128,
> +               vector signed __int128, vector bool __int128);
> +vector signed __int128 vec_sel (vector signed __int128,
> +               vector signed __int128, vector unsigned __int128);
> +vector unsigned __int128 vec_sel (vector unsigned __int128,
> +               vector unsigned __int128, vector bool __int128);
> +vector unsigned __int128 vec_sel (vector unsigned __int128,
> +               vector unsigned __int128, vector unsigned __int128);
> +vector bool __int128 vec_sel (vector bool __int128,
> +               vector bool __int128, vector bool __int128);
> +vector bool __int128 vec_sel (vector bool __int128,
> +               vector bool __int128, vector unsigned __int128);
> +@end smallexample
> +
> +The instance is an extension of the exiting overloaded built-in @code{vec_sel}
> +that is documented in the PVIPR.
> +
>  @node Basic PowerPC Built-in Functions Available on ISA 2.06
>  @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.06
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c
> new file mode 100644
> index 00000000000..b7b4a95ea0e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-10-runnable.c
> @@ -0,0 +1,220 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target vmx_hw } */
> +/* { dg-options "-maltivec -O2 " } */
> +
> +#include <altivec.h>
> +
> +#define DEBUG 0
> +
> +#if DEBUG
> +#include <stdio.h>
> +void print_i128 (unsigned __int128 val)
> +{
> +  printf(" 0x%016llx%016llx",
> +         (unsigned long long)(val >> 64),
> +         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
> +}
> +#endif
> +
> +extern void abort (void);
> +
> +union convert_union {
> +  vector signed __int128    s128;
> +  vector unsigned __int128  u128;
> +  vector bool __int128  b128;
> +  char  val[16];
> +} convert;
> +
> +int check_u128_result(vector unsigned __int128 vresult_u128,
> +		      vector unsigned __int128 expected_vresult_u128)
> +{
> +  /* Use a for loop to check each byte manually so the test case will run
> +     with ISA 2.06.
> +
> +     Return 1 if they match, 0 otherwise.  */
> +
> +  int i;
> +
> +  union convert_union result;
> +  union convert_union expected;
> +
> +  result.u128 = vresult_u128;
> +  expected.u128 = expected_vresult_u128;
> +
> +  /* Check if each byte of the result and expected match. */
> +  for (i = 0; i < 16; i++)
> +    {
> +      if (result.val[i] != expected.val[i])
> +	return 0;
> +    }
> +  return 1;
> +}
> +
> +int check_s128_result(vector signed __int128 vresult_s128,
> +		      vector signed __int128 expected_vresult_s128)
> +{
> +  /* Convert the arguments to unsigned, then check equality.  */
> +  union convert_union result;
> +  union convert_union expected;
> +
> +  result.s128 = vresult_s128;
> +  expected.s128 = expected_vresult_s128;
> +
> +  return check_u128_result (result.u128, expected.u128);
> +}
> +
> +int check_b128_result(vector bool __int128 vresult_b128,
> +		      vector bool __int128 expected_vresult_b128)
> +{
> +  /* Convert the arguments to unsigned, then check equality.  */
> +  union convert_union result;
> +  union convert_union expected;
> +
> +  result.b128 = vresult_b128;
> +  expected.b128 = expected_vresult_b128;
> +
> +  return check_u128_result (result.u128, expected.u128);
> +}
> +
> +
> +int
> +main (int argc, char *argv [])
> +{
> +  int i;
> +  
> +  vector signed __int128 src_va_s128;
> +  vector signed __int128 src_vb_s128;
> +  vector signed __int128 src_vc_s128;
> +  vector signed __int128 vresult_s128;
> +  vector signed __int128 expected_vresult_s128;
> +
> +  vector unsigned __int128 src_va_u128;
> +  vector unsigned __int128 src_vb_u128;
> +  vector unsigned __int128 src_vc_u128;
> +  vector unsigned __int128 vresult_u128;
> +  vector unsigned __int128 expected_vresult_u128;
> +
> +  vector bool __int128 src_va_b128;
> +  vector bool __int128 src_vb_b128;
> +  vector bool __int128 src_vc_b128;
> +  vector bool __int128 vresult_b128;
> +  vector bool __int128 expected_vresult_b128;
> +
> +  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> +  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
> +  src_vc_b128 = (vector bool   __int128) {0x3333333333333333};
> +  src_vc_u128 = (vector unsigned __int128) {0xBBBBBBBBBBBBBBBB};
> +
> +  /* Signed arguments.  */
> +  expected_vresult_s128 = (vector signed __int128) {0x32147658ba9cfed0};
> +  vresult_s128 = vec_sel (src_va_s128, src_vb_s128, src_vc_b128);
> +
> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_s128, src_vb_s128, src_vc_b128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_s128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_s128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  expected_vresult_s128 = (vector signed __int128) {0xba9cfed832147650};
> +  vresult_s128 = vec_sel (src_va_s128, src_vb_s128, src_vc_u128);
> +
> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_s128, src_vb_s128, src_vc_u128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_s128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_s128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
> +  src_va_b128 = (vector bool __int128) {0xFFFFFFFF00000000};
> +  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
> +  src_vb_b128 = (vector bool __int128) {0xFFFF0000FFFF0000};
> +  src_vc_u128 = (vector unsigned __int128) {0x5555555555555555};
> +
> +  /* Unigned arguments.  */

Nit: s/Unigned/Unsigned/

> +  expected_vresult_u128 = (vector unsigned __int128) {0x2147a9cf2147badc};
> +  vresult_u128 = vec_sel (src_va_u128, src_vb_u128, src_vc_b128);
> +
> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_u128, src_vb_u128, src_vc_b128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_u128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_u128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  expected_vresult_u128 = (vector unsigned __int128) {0x307cfcf47439a9a};
> +  vresult_u128 = vec_sel (src_va_u128, src_vb_u128, src_vc_u128);
> +
> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_u128, src_vb_u128, src_vc_u128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_u128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_u128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  /* Boolean arguments.  */
> +  expected_vresult_b128 = (vector bool __int128) {0xffffcccc33330000};
> +  vresult_b128 = vec_sel (src_va_b128, src_vb_b128, src_vc_b128);
> +
> +  if (!check_b128_result (vresult_b128, expected_vresult_b128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_b128, src_vb_b128, src_vc_b128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_b128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_b128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  expected_vresult_b128 = (vector bool __int128) {0xffffaaaa55550000};
> +  vresult_b128 = vec_sel (src_va_b128, src_vb_b128, src_vc_u128);
> +
> +  if (!check_b128_result (vresult_b128, expected_vresult_b128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_b128, src_vb_b128, src_vc_u128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_b128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_b128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +    return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-10.c b/gcc/testsuite/gcc.target/powerpc/builtins-10.c
> new file mode 100644
> index 00000000000..eddc4c93b32
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-10.c
> @@ -0,0 +1,63 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vmx_hw } */

s/vmx_hw/powerpc_altivec/; this is a compile-only test, so it shouldn't use _hw.

> +/* { dg-options "-save-temps" } */

s/-save-temps/-O2 -maltivec/

Also move dg-options line before powerpc_altivec line (as powerpc_altivec
evaluation considers current_compiler_flags now).
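
Taken together, the two suggestions above would presumably leave the header of builtins-10.c looking like the sketch below.  Only the directive names and flags quoted in this review are used; the final form is up to the patch author.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -maltivec" } */
/* { dg-require-effective-target powerpc_altivec } */
```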

OK for trunk with all the above fixed.  Thanks!

BR,
Kewen

> +/* { dg-final { scan-assembler-times "xxsel" 6 } } */
> +
> +#include <altivec.h>
> +
> +/* Signed args */
> +vector signed __int128
> +test_vec_sel_ssb (vector signed __int128 src_va_s128,
> +		  vector signed __int128 src_vb_s128,
> +		  vector bool __int128 src_vc_b128)
> +{
> +  return vec_sel (src_va_s128, src_vb_s128, src_vc_b128);
> +}
> +
> +vector signed __int128
> +test_vec_sel_ssu (vector signed __int128 src_va_s128,
> +		  vector signed __int128 src_vb_s128,
> +		  vector unsigned __int128 src_vc_u128)
> +{
> +  return vec_sel (src_va_s128, src_vb_s128, src_vc_u128);
> +}
> +
> +/* Unsigned args */
> +vector unsigned __int128
> +test_vec_sel_uub (vector unsigned __int128 src_va_u128,
> +		  vector unsigned __int128 src_vb_u128,
> +		  vector bool __int128 src_vc_b128)
> +{
> +  return vec_sel (src_va_u128, src_vb_u128, src_vc_b128);
> +}
> +
> +vector unsigned __int128
> +test_vec_sel_uuu (vector unsigned __int128 src_va_u128,
> +		  vector unsigned __int128 src_vb_u128,
> +		  vector unsigned __int128 src_vc_u128)
> +{
> +  return vec_sel (src_va_u128, src_vb_u128, src_vc_u128);
> +}
> +
> +/* Boolean args */
> +vector bool __int128
> +test_vec_sel_bbb (vector bool __int128 src_va_b128,
> +		  vector bool __int128 src_vb_b128,
> +		  vector bool __int128 src_vc_b128)
> +{
> +  return vec_sel (src_va_b128, src_vb_b128, src_vc_b128);
> +}
> +
> +vector bool __int128
> +test_vec_sel_bbu (vector bool __int128 src_va_b128,
> +		  vector bool __int128 src_vb_b128,
> +		  vector unsigned __int128 src_vc_u128)
> +{
> +  return vec_sel (src_va_b128, src_vb_b128, src_vc_u128);
> +}
> +
> +/* Expected results:
> +   vec_sel              xxsel    */
> +
> +/* { dg-final { scan-assembler-times "xxsel" 6 } } */
> +
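
The expected values in the runnable test above are consistent with vec_sel acting as a bitwise select: each result bit comes from the second argument where the mask bit is 1 and from the first argument where it is 0.  A minimal scalar sketch, restricted to the low 64 bits that the test constants occupy, is given below.  The names model_sel and check_model_sel are hypothetical, and this models the semantics only, not the code GCC emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a bitwise select: take bits of B where
   MASK is 1 and bits of A where MASK is 0.  */
static uint64_t
model_sel (uint64_t a, uint64_t b, uint64_t mask)
{
  return (a & ~mask) | (b & mask);
}

/* Re-derive three of the expected results from builtins-10-runnable.c.  */
void
check_model_sel (void)
{
  /* Signed sources, bool mask.  */
  assert (model_sel (0x123456789ABCDEF0ULL, 0xFEDCBA9876543210ULL,
		     0x3333333333333333ULL) == 0x32147658BA9CFED0ULL);

  /* Signed sources, unsigned mask.  */
  assert (model_sel (0x123456789ABCDEF0ULL, 0xFEDCBA9876543210ULL,
		     0xBBBBBBBBBBBBBBBBULL) == 0xBA9CFED832147650ULL);

  /* Bool sources, bool mask.  */
  assert (model_sel (0xFFFFFFFF00000000ULL, 0xFFFF0000FFFF0000ULL,
		     0x3333333333333333ULL) == 0xFFFFCCCC33330000ULL);
}
```

Because the operation is purely bitwise, the signed, unsigned and bool overloads differ only in their types, which fits the compile-only test expecting six xxsel instructions for six overloads.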


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 11/13 ver4] rs6000, extend vec_xxpermdi built-in for __int128 args
  2024-06-13 19:40 ` [PATCH 11/13 ver4] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
@ 2024-06-19  3:03   ` Kewen.Lin
  0 siblings, 0 replies; 13+ messages in thread
From: Kewen.Lin @ 2024-06-19  3:03 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi Carl,

on 2024/6/14 03:40, Carl Love wrote:
> 
> GCC maintainers:
> 
> The patch has been updated per the comments from version 3.  Please let me know if the patch is acceptable for mainline.
> 
> Thanks.
> 
>                              Carl 
> 
> ---------------------------------------------------------
> 
> rs6000, extend vec_xxpermdi built-in for __int128 args
> 
> Add a new signed and unsigned overloaded instances for vec_xxpermdi
> 
>    __int128 vec_xxpermdi (__int128, __int128, const int);
>    __uint128 vec_xxpermdi (__uint128, __uint128, const int);

Nit: I think we need the "vector" keyword here to avoid confusion.

> 
> Update the documentation to include a reference to the new built-in
> instances.
> 
> Add test cases for the new overloaded instances.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
> 	overloaded built-in instances.

Better to mention something like:  "built-in instances for vector
signed and unsigned int128".

> 	* doc/extend.texi:  Add documentation for new overloaded built-in

Nit: One more space before "Add".

> 	instances.

... can be extended similarly.

> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.
> ---
>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>  gcc/doc/extend.texi                           |   4 +
>  .../powerpc/vec_perm-runnable-i128.c          | 229 ++++++++++++++++++
>  3 files changed, 237 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
> 
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index 6cec1ad4f1a..354f8fabe0f 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -4936,6 +4936,10 @@
>      XXPERMDI_2DI  XXPERMDI_VSLL
>    vull __builtin_vsx_xxpermdi (vull, vull, const int);
>      XXPERMDI_2DI  XXPERMDI_VULL
> +  vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
> +    XXPERMDI_1TI  XXPERMDI_1SQ
> +  vuq __builtin_vsx_xxpermdi (vuq, vuq, const int);
> +    XXPERMDI_1TI  XXPERMDI_1UQ

Nit: XXPERMDI_1SQ -> XXPERMDI_SQ
     XXPERMDI_1UQ -> XXPERMDI_UQ
(removing "1" to align with the above).

>    vf __builtin_vsx_xxpermdi (vf, vf, const int);
>      XXPERMDI_4SF  XXPERMDI_VF
>    vd __builtin_vsx_xxpermdi (vd, vd, const int);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index d7d8d149a43..9e45976436b 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -22610,6 +22610,10 @@ void vec_vsx_st (vector bool char, int, signed char *);
>  
>  vector double vec_xxpermdi (vector double, vector double, const int);
>  vector float vec_xxpermdi (vector float, vector float, const int);
> +vector __int128 vec_xxpermdi (vector signed __int128,
> +                              vector signed __int128, const int);

Nit: either s/vector __int128/vector signed __int128/
     or s/signed //g
to keep consistent.

> +vector __int128 vec_xxpermdi (vector unsigned __int128,
> +                              vector unsigned __int128, const int);

This line is missing "unsigned" in the return type.

OK for trunk with nits above tweaked, thanks!

BR,
Kewen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 13/13 ver4] rs6000, remove vector set and vector init built-ins
  2024-06-13 19:40 ` [PATCH 13/13 ver4] rs6000, remove vector set and vector init built-ins Carl Love
@ 2024-06-19  3:04   ` Kewen.Lin
  0 siblings, 0 replies; 13+ messages in thread
From: Kewen.Lin @ 2024-06-19  3:04 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, bergner, Segher Boessenkool

Hi Carl,

on 2024/6/14 03:40, Carl Love wrote:
> GCC maintainers:
> 
> The patch has been updated per the feedback from version 3.  Please let me know if the patch is acceptable for mainline.
> 
> Thanks.
> 
>                       Carl 
> 
> ----------------------------------------------------------------------------------
> 
> rs6000, remove vector set and vector init built-ins
> 
> The vector init built-ins:
> 
>   __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
>   __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
>   __builtin_vec_init_v2di, __builtin_vec_init_v2df,
>   __builtin_vec_init_v1ti
> 
> perform the same operation as initializing the vector in C code.  For
> example:
> 
>   result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
>   result_v4si = {1, 2, 3, 4};
> 
> These two constructs were tested and verified they generate identical
> assembly instructions with no optimization and -O3 optimization.
> 
> The vector set built-ins:
> 
>   __builtin_vec_set_v16qi, __builtin_vec_set_v8hi.
>   __builtin_vec_set_v4si, __builtin_vec_set_v4sf,
>   __builtin_vec_set_v1ti, __builtin_vec_set_v2di,
>   __builtin_vec_set_v2df
> 
> perform the same operation as setting a specific element in the vector in
> C code.  For example:
> 
>   src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
>   src_v4si[index] = int_val;
> 
> The built-in actually generates more instructions than the inline C code
> with no optimization but is identical with -O3 optimizations.
> 
> All of the above built-ins that are removed do not have test cases and
> are not documented.
> 
> Built-ins   __builtin_vec_set_v1ti __builtin_vec_set_v2di,
> __builtin_vec_set_v2df are not removed as they are used in function
> resolve_vec_insert() in file rs6000-c.cc.
> 
> The built-ins are removed as they don't provide any benefit over just
> using C code.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
> 	__builtin_vec_init_v4sf, __builtin_vec_init_v4si,
> 	__builtin_vec_init_v8hi, __builtin_vec_init_v1ti,
> 	__builtin_vec_init_v2df, __builtin_vec_init_v2di,
> 	__builtin_vec_set_v16qi, __builtin_vec_set_v4sf,
> 	__builtin_vec_set_v4si, __builtin_vec_set_v8hi): Remove
> 	built-in definitions.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 44 +++------------------------
>  1 file changed, 4 insertions(+), 40 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 02aa04e5698..053dc0115d2 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1118,37 +1118,6 @@
>    const signed short __builtin_vec_ext_v8hi (vss, signed int);
>      VEC_EXT_V8HI nothing {extract}
>  
> -  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
> -            signed char, signed char, signed char, signed char, signed char, \
> -            signed char, signed char, signed char, signed char, signed char, \
> -            signed char, signed char, signed char);
> -    VEC_INIT_V16QI nothing {init}

I just realized that this {init} attribute is customized for vec_init only, and
the vec_init bifs being removed are its only users, so we should remove the
attribute as well.  Sorry, I should have found this and pointed it out in the
previous review.  I think it means some removals are needed in:

    1) comments in rs6000-builtins.def
       ;   init     Process as a vec_init function

    2) related gen code for this attribute bit, like:

      fprintf (header_file, "#define bif_init_bit\t\t(0x00000001)\n");
      fprintf (header_file,
	   "#define bif_is_init(x)\t\t((x).bifattrs & bif_init_bit)\n");
      if (bifp->attrs.isinit)
	fprintf (init_file, " | bif_init_bit");

The others look good to me!

BR,
Kewen
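
For reference, the plain C replacements that the commit message describes can be sketched with the GNU C vector extension, which is target independent.  The helper names make_v4si, set_elem and check_vector_c_code are hypothetical; the sketch only illustrates the initializer and element-assignment syntax and makes no claim about the instructions generated on any target.

```c
#include <assert.h>

/* GNU C vector extension: four 32-bit ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Plain C initialization, standing in for __builtin_vec_init_v4si.  */
static v4si
make_v4si (int a, int b, int c, int d)
{
  v4si v = { a, b, c, d };
  return v;
}

/* Plain C element assignment, standing in for __builtin_vec_set_v4si.  */
static v4si
set_elem (v4si v, int val, int index)
{
  v[index] = val;
  return v;
}

/* Small self-check of both helpers.  */
void
check_vector_c_code (void)
{
  v4si v = make_v4si (1, 2, 3, 4);

  v = set_elem (v, 42, 2);
  assert (v[0] == 1 && v[1] == 2 && v[2] == 42 && v[3] == 4);
}
```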

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/13 ver4] rs6000, extend the current vec_{un,}signed{e,o}, built-ins
  2024-06-19  3:03   ` Kewen.Lin
@ 2024-06-24 22:14     ` Carl Love
  0 siblings, 0 replies; 13+ messages in thread
From: Carl Love @ 2024-06-24 22:14 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: gcc-patches, Segher Boessenkool, bergner



On 6/18/24 20:03, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2024/6/14 03:40, Carl Love wrote:
>>
>> GCC maintainers:
>>
>> As noted, the removal of __builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws was moved to patch 2 in the series.  The patch has been updated per the comments from version 3.
>>
>> Please let me know if this patch is acceptable for mainline.  
>>
>>                              Carl 
>>
>> ------------------------------------------------------------------
>>
>> rs6000, extend the current vec_{un,}signed{e,o} built-ins
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
>> convert a vector of floats to signed/unsigned long long ints.  Extend the
> 
> Nit: s/signed/a vector of signed/

Fixed.

> 
>> existing vec_{un,}signed{e,o} built-ins to handle the argument
>> vector of floats to return the even/odd signed/unsigned integers.
>>
> 
> Likewise.

Fixed.

> 
>> The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
>> vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
>> built-ins.
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
>> now for internal use only. They are not documented and they do not
>> have testcases.
>>
> 
> 
>> The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
>> vec_signed{e,o}, remove.
>>
>> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
>> vec_unsigned{e,o}, remove.
> 
> As the comments in 2/13 v4 and the previous review comments, I preferred
> these two are moved to 2/13 as well (this patch should focus on extending).
> 

Moved to patch 2.

>>
>> Add testcases and update documentation.
>>
>> gcc/ChangeLog:
>> 	* config/rs6000/rs6000-builtins.def: __builtin_vsx_xvcvdpsxws,
>> 	__builtin_vsx_xvcvdpuxws): Removed.
>> 	(__builtin_vsx_xvcvspsxds, __builtin_vsx_xvcvspuxds): Renamed
> 
> Nit: s/Renamed/Rename to/

OK, fixed.

> 
>> 	__builtin_vsignede_v4sf, __builtin_vunsignede_v4sf respectively.
>> 	(XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
>> 	VEC_VUNSIGNEDE_V4SF respectively.
> 
> Likewise.

OK, fixed. 

> 
>> 	(__builtin_vsignedo_v4sf, __builtin_vunsignedo_v4sf): New
>> 	built-in definitions.
>> 	* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
>> 	vec_unsignede,vec_unsignedo):  Add new overloaded specifications.
> 
> Formatting nits: "..,.." -> ".., ..", "  " -> " "

OK, I fixed the various spacing issues.
> 
>> 	* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
>> 	vunsignede_v4sf, vunsignedo_v4sf): New	define_expands.
> 
> Likewise.

Ditto.

> 
>> 	* doc/extend.texi (vec_signedo, vec_signede): Add documentation
>> 	for new overloaded built-ins.
> 
> Missing vec_unsignedo and vec_unsignede, may be also mention for which
> types, like "converting vector float to vector {un,}signed long long".
> 

OK, fixed.

>>
>> gcc/testsuite/ChangeLog:
>> 	* gcc.target/powerpc/builtins-3-runnable.c
>> 	(test_unsigned_int_result, test_ll_unsigned_int_result): Add
>> 	new argument.
>> 	(vec_signede, vec_signedo, vec_unsignede, vec_unsignedo): New
>> 	tests for the overloaded built-ins.
>> ---
>>  gcc/config/rs6000/rs6000-builtins.def         | 20 ++---
>>  gcc/config/rs6000/rs6000-overload.def         |  8 ++
>>  gcc/config/rs6000/vsx.md                      | 84 +++++++++++++++++++
>>  gcc/doc/extend.texi                           | 10 +++
>>  .../gcc.target/powerpc/builtins-3-runnable.c  | 49 +++++++++--
>>  5 files changed, 154 insertions(+), 17 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>> index 322d27b7a0d..29a9deb3410 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1688,26 +1688,26 @@
>>    const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
>>      XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
>>  
>> -  const vsi __builtin_vsx_xvcvdpsxws (vd);
>> -    XVCVDPSXWS vsx_xvcvdpsxws {}
>> -
>>    const vsll __builtin_vsx_xvcvdpuxds (vd);
>>      XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
>>  
>>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>>  
>> -  const vsi __builtin_vsx_xvcvdpuxws (vd);
>> -    XVCVDPUXWS vsx_xvcvdpuxws {}
>> -
>>    const vd __builtin_vsx_xvcvspdp (vf);
>>      XVCVSPDP vsx_xvcvspdp {}
>>  
>> -  const vsll __builtin_vsx_xvcvspsxds (vf);
>> -    XVCVSPSXDS vsx_xvcvspsxds {}
>> +  const vsll __builtin_vsignede_v4sf (vf);
>> +    VEC_VSIGNEDE_V4SF vsignede_v4sf {}
>> +
>> +  const vsll __builtin_vsignedo_v4sf (vf);
>> +    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
>> +
>> +  const vull __builtin_vunsignede_v4sf (vf);
>> +    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
>>  
>> -  const vsll __builtin_vsx_xvcvspuxds (vf);
>> -    XVCVSPUXDS vsx_xvcvspuxds {}
>> +  const vull __builtin_vunsignedo_v4sf (vf);
>> +    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
>>  
>>    const vd __builtin_vsx_xvcvsxddp (vsll);
>>      XVCVSXDDP vsx_floatv2div2df2 {}
>> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
>> index 84bd9ae6554..4d857bb1af3 100644
>> --- a/gcc/config/rs6000/rs6000-overload.def
>> +++ b/gcc/config/rs6000/rs6000-overload.def
>> @@ -3307,10 +3307,14 @@
>>  [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
>>    vsi __builtin_vec_vsignede (vd);
>>      VEC_VSIGNEDE_V2DF
>> +  vsll __builtin_vec_vsignede (vf);
>> +    VEC_VSIGNEDE_V4SF
>>  
>>  [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
>>    vsi __builtin_vec_vsignedo (vd);
>>      VEC_VSIGNEDO_V2DF
>> +  vsll __builtin_vec_vsignedo (vf);
>> +    VEC_VSIGNEDO_V4SF
>>  
>>  [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
>>    vsi __builtin_vec_signexti (vsc);
>> @@ -4433,10 +4437,14 @@
>>  [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
>>    vui __builtin_vec_vunsignede (vd);
>>      VEC_VUNSIGNEDE_V2DF
>> +  vull __builtin_vec_vunsignede (vf);
>> +    VEC_VUNSIGNEDE_V4SF
>>  
>>  [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
>>    vui __builtin_vec_vunsignedo (vd);
>>      VEC_VUNSIGNEDO_V2DF
>> +  vull __builtin_vec_vunsignedo (vf);
>> +    VEC_VUNSIGNEDO_V4SF
>>  
>>  [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
>>    vui __builtin_vec_extract_exp (vf);
>> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
>> index f135fa079bd..0187bdbf90e 100644
>> --- a/gcc/config/rs6000/vsx.md
>> +++ b/gcc/config/rs6000/vsx.md
>> @@ -2704,6 +2704,90 @@
>>    DONE;
>>  })
>>  
>> +;; Convert float vector even elements to signed long long vector
>> +(define_expand "vsignede_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));
>> +  else
>> +    {
>> +      /* Shift left one word to put even word in correct location */
> 
> Nit: s/location/location. /
> (two spaces after period).

Yes, fixed four instances of the issue.

> 
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
>> +    }
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert float vector odd elements to signed long long vector
>> +(define_expand "vsignedo_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    {
>> +      /* Shift left one word to put even word in correct location */
> 
> Likewise.

Fixed.

> 
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +      emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
>> +    }
>> +  else
>> +    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert even vector elements to unsigned long long vector
> 
> Nit: This comment miss "float" (it doesn't align with the one for vsignede_v4sf above).

OK, fixed. 

> 
>> +(define_expand "vunsignede_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
>> +  else
>> +    {
>> +      /* Shift left one word to put even word in correct location */
> 
> Likewise.

Fixed.
> 
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
>> +    }
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert odd vector elements to unsigned long long vector
> 
> Likewise.

Fixed.


> 
> 
>> +(define_expand "vunsignedo_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    {
>> +      /* Shift left one word to put even word in correct location */
> 
> Likewise.

Fixed. 

> 
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
>> +    }
>> +  else
>> +    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
>> +
>> +  DONE;
>> +})
>> +
>>  ;; Generate float2 double
>>  ;; convert two double to float
>>  (define_expand "float2_v2df"
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 799a36586dc..b1620274285 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -22625,6 +22625,16 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
>>  @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>>  
>> +@smallexample
>> +vector signed long long vec_signedo (vector float);
>> +vector signed long long vec_signede (vector float);
>> +vector unsigned signed long long vec_signedo (vector float);
>> +vector unsigned signed long long vec_signede (vector float);
> 
> The last two lines should be vec_**un**signed{o,e} and unexpected "signed" should be removed.

Fixed the last two lines so they read:

  vector unsigned long long vec_unsignedo (vector float);
  vector unsigned long long vec_unsignede (vector float);
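
As a rough illustration of what the new vector float overloads compute, here is a scalar sketch of the signed even/odd variants.  It assumes elements are numbered in logical vector order (element 0 first), which is what the endian-specific code in the expanders is there to preserve.  The type and function names (model_v4sf, model_v2di, model_vec_signede, model_vec_signedo) are hypothetical stand-ins, not the real built-ins, and the sketch only covers inputs that fit in a signed long long; it does not model how the hardware treats out-of-range values or NaNs.

```c
/* Hypothetical scalar stand-ins for the vector types; not GCC types.  */
typedef struct { float e[4]; } model_v4sf;
typedef struct { long long e[2]; } model_v2di;

/* Model of the even variant: convert elements 0 and 2.
   The C cast truncates toward zero for in-range values.  */
static model_v2di
model_vec_signede (model_v4sf a)
{
  model_v2di r;
  r.e[0] = (long long) a.e[0];
  r.e[1] = (long long) a.e[2];
  return r;
}

/* Model of the odd variant: convert elements 1 and 3.  */
static model_v2di
model_vec_signedo (model_v4sf a)
{
  model_v2di r;
  r.e[0] = (long long) a.e[1];
  r.e[1] = (long long) a.e[3];
  return r;
}
```

For the input {1.5, -2.5, 3.0, 4.75} the even model yields {1, 3} and the odd model yields {-2, 4}.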

                 Carl 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws,, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins.
  2024-06-19  3:03   ` Kewen.Lin
@ 2024-06-24 22:15     ` Carl Love
  0 siblings, 0 replies; 13+ messages in thread
From: Carl Love @ 2024-06-24 22:15 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: gcc-patches, Segher Boessenkool, bergner

Kewen:

On 6/18/24 20:03, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2024/6/14 03:40, Carl Love wrote:
>> GCC maintainers:
>>
>> Per the comments on patch 0004 from version 3, the removal of the
>> built-ins __builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws was moved to this patch.  The rest of the patch is unchanged from version 3.  There were no comments on this patch for version 3.
>>
>> Please let me know if this patch is acceptable.  Thanks.
>>
>>                             Carl 
>>
>>
>> -----------------------------------------------------
>>
>> rs6000, Remove __builtin_vsx_xvcvspsxws,
>>  __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins.
> 
> Nit: Maybe make it shorter like: Remove built-ins __builtin_vsx_xvcv{sp{sx,u}ws,dpuxds_uns}
> 
>>
>> The built-in __builtin_vsx_xvcvspsxws is a duplicate of the vec_signed
> 
> Nit: Strictly speaking, not a duplicate of vec_signed but covered by it.
> 
>> built-in that is documented in the PVIPR.  The __builtin_vsx_xvcvspsxws
>> built-in is not documented and there are no test cases for it.
>>
>> The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
>> vec_unsigned, remove.
>>
>> The __builtin_vsx_xvcvspuxws is redundant as it is covered by
>> vec_unsigned, remove.
> 
> As mentioned in the previous review, I'd expect patch 4/13 only focuses on
> extending vec_{un,}signed{e,o} for vector float (aka. __builtin_vsx_xvcvspsxds
> and __builtin_vsx_xvcvspuxds related), and this patch focuses on some built-in
> removals which have been covered by the existing vec_{un,}signed{,e,o}, so
> it can also drop the built-ins:
> 
> "The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
> vec_signed{e,o}, remove.
> 
> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
> vec_unsigned{e,o}, remove."
> 
> // copied from 4/13.

I am not sure why I didn't move these two along with the other two.  Sorry.

I have now moved them here from patch 4.

                              Carl 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-06-25  3:10 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
2024-06-13 19:23 [PATCH 0/13 ver4] rs6000, built-in cleanup patch series Carl Love
2024-06-13 19:40 ` [PATCH 2/13 ver4] rs6000, Remove __builtin_vsx_xvcvspsxws,, __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws built-ins Carl Love
2024-06-19  3:03   ` Kewen.Lin
2024-06-24 22:15     ` Carl Love
2024-06-13 19:40 ` [PATCH 4/13 ver4] rs6000, extend the current vec_{un,}signed{e,o}, built-ins Carl Love
2024-06-19  3:03   ` Kewen.Lin
2024-06-24 22:14     ` Carl Love
2024-06-13 19:40 ` [PATCH 7/13 ver4] rs6000, add overloaded vec_sel with int128 arguments Carl Love
2024-06-19  3:03   ` Kewen.Lin
2024-06-13 19:40 ` [PATCH 11/13 ver4] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
2024-06-19  3:03   ` Kewen.Lin
2024-06-13 19:40 ` [PATCH 13/13 ver4] rs6000, remove vector set and vector init built-ins Carl Love
2024-06-19  3:04   ` Kewen.Lin
