[PATCH 0/13 ver 3] rs6000, built-in cleanup patch series

public inbox for gcc-patches@gcc.gnu.org

* [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series
@ 2024-05-29 15:48 Carl Love
  2024-05-29 15:52 ` [PATCH 1/13 ver 3] rs6000, Remove __builtin_vsx_cmple* builtins Carl Love
                   ` (12 more replies)
  0 siblings, 13 replies; 30+ messages in thread
From: Carl Love @ 2024-05-29 15:48 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, Carl Love, bergner

GCC maintainers:

The following is an updated patch series to remove duplicate built-ins.

Some patches extend an existing overloaded built-in to cover additional input types.

A new patch, 0005-rs6000-Remove-redundant-float-double-type-conversion.patch, was added to remove built-ins that were inadvertently missed in the previous version.

Patch 12 in the previous series was dropped because the built-in __builtin_vsx_xvcmpeqsp is not a duplicate of the overloaded vec_cmpeq built-in; specifically, the return values are different.  The goal of this series is to remove built-ins that are functionally equivalent.  Patch 12 from the previous series will be reworked and submitted later.

Some of the patches in the previous series were approved, but everything is being reposted for completeness.  The following table maps the patches from the previous version to the current version of the series, with notes on each patch.

Version 2                       Version 3		Notes
patch 1				patch 1			Approved, no changes
patch 2				patch 2			Responded to comments, no changes to the patch
patch 3				patch 3			Updated changelog, no functional changes
patch 4				patch 4			Updated patch
				patch 5			New patch to remove built-ins missed in the
							previous series.
patch 5				patch 6			Updated patch
patch 6				patch 7			Updated patch
patch 7				patch 8			Updated patch
patch 8				patch 9			Approved, no changes to this patch
patch 9				patch 10		Approved, no changes to this patch
patch 10			patch 11		Updated, added test file.
patch 11			patch 12		Updated
patch 12			                        Patch from previous series removed
patch 13			patch 13		Comments said built-ins __builtin_vec_set_v1ti
							__builtin_vec_set_v2di, __builtin_vec_set_v2df
							can also get removed with equivalent gimple codes.
							This is somewhat more involved than a simple
							removal of redundant built-ins.  The built-ins 
							will be removed in a separate future patch.

The patch series has been tested on Power 10 LE and Power 9 BE with no regression failures.  The last patch was also tested on Power 8 BE.

Please let me know if the patches are acceptable for mainline.  Thanks.

                       Carl 


* Re: [PATCH 1/13 ver 3] rs6000, Remove __builtin_vsx_cmple* builtins
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
@ 2024-05-29 15:52 ` Carl Love
  2024-06-04  6:00   ` [PATCH 1/13 ver 3] rs6000, " Kewen.Lin
  2024-05-29 15:55 ` [PATCH 2/13 ver 3] rs6000, Remove __builtin_vsx_xvcvspsxws built-in Carl Love
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 15:52 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This patch was approved in the previous series.  There are no changes to this patch.  Reposting for completeness. 

                     Carl 
-------------------------------------------------------

rs6000, Remove __builtin_vsx_cmple* builtins

The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take
unsigned arguments and return an unsigned result.  The current definitions
take signed arguments and return signed results which is incorrect.

The signed and unsigned versions of __builtin_vsx_cmple* are not
documented in extend.texi.  Also there are no test cases for the
built-ins.

Users can use the existing vec_cmple, as defined by the PVIPR, instead of
__builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si, __builtin_vsx_cmple_u8hi,
__builtin_vsx_cmple_16qi, __builtin_vsx_cmple_2di,
__builtin_vsx_cmple_4si, __builtin_vsx_cmple_8hi,
__builtin_altivec_cmple_1ti and __builtin_altivec_cmple_u1ti.

Hence these built-ins are redundant and are removed by this patch.
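
As a rough illustration of why the signedness of the arguments matters for a
"less than or equal" compare, the sketch below uses the generic GCC/Clang
vector extension rather than <altivec.h>, so it can be compiled on any target.
The type names v4si/v4ui and the functions model_cmple_signed and
model_cmple_unsigned are hypothetical and are not part of the patch.  The
element-wise `<=` is only meant to mirror the LE_EXPR comparison that the
removed case statements folded to.

```c
#include <stdint.h>

/* Generic 128-bit vector types (hypothetical names, not AltiVec types).  */
typedef int32_t  v4si __attribute__ ((vector_size (16)));
typedef uint32_t v4ui __attribute__ ((vector_size (16)));

/* Element-wise a <= b on signed lanes.  Each result lane is -1 (all ones)
   when the comparison is true and 0 when it is false.  */
static v4si
model_cmple_signed (v4si a, v4si b)
{
  return a <= b;
}

/* Element-wise a <= b on unsigned lanes.  The result is still a vector of
   signed lanes holding -1 or 0.  */
static v4si
model_cmple_unsigned (v4ui a, v4ui b)
{
  return a <= b;
}
```

With the same bit pattern in lane 0 (0xFFFFFFFF compared against 1), the
signed compare is true because -1 <= 1, while the unsigned compare is false
because 4294967295 > 1.  A definition that declares signed arguments for an
unsigned compare therefore describes the wrong operation.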

gcc/ChangeLog:
	* config/rs6000/rs6000-builtin.cc (RS6000_BIF_CMPLE_16QI,
	RS6000_BIF_CMPLE_U16QI, RS6000_BIF_CMPLE_8HI,
	RS6000_BIF_CMPLE_U8HI, RS6000_BIF_CMPLE_4SI, RS6000_BIF_CMPLE_U4SI,
	RS6000_BIF_CMPLE_2DI, RS6000_BIF_CMPLE_U2DI, RS6000_BIF_CMPLE_1TI,
	RS6000_BIF_CMPLE_U1TI): Remove case statements.
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_cmple_16qi,
	__builtin_vsx_cmple_2di, __builtin_vsx_cmple_4si,
	__builtin_vsx_cmple_8hi, __builtin_vsx_cmple_u16qi,
	__builtin_vsx_cmple_u2di, __builtin_vsx_cmple_u4si,
	__builtin_vsx_cmple_u8hi): Remove built-in definitions.
---
 gcc/config/rs6000/rs6000-builtin.cc   | 13 ------------
 gcc/config/rs6000/rs6000-builtins.def | 30 ---------------------------
 2 files changed, 43 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 320affd79e3..ac9f16fe51a 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -2027,19 +2027,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
       fold_compare_helper (gsi, GT_EXPR, stmt);
       return true;
 
-    case RS6000_BIF_CMPLE_16QI:
-    case RS6000_BIF_CMPLE_U16QI:
-    case RS6000_BIF_CMPLE_8HI:
-    case RS6000_BIF_CMPLE_U8HI:
-    case RS6000_BIF_CMPLE_4SI:
-    case RS6000_BIF_CMPLE_U4SI:
-    case RS6000_BIF_CMPLE_2DI:
-    case RS6000_BIF_CMPLE_U2DI:
-    case RS6000_BIF_CMPLE_1TI:
-    case RS6000_BIF_CMPLE_U1TI:
-      fold_compare_helper (gsi, LE_EXPR, stmt);
-      return true;
-
     /* flavors of vec_splat_[us]{8,16,32}.  */
     case RS6000_BIF_VSPLTISB:
     case RS6000_BIF_VSPLTISH:
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 3bc7fed6956..7c36976a089 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1337,30 +1337,6 @@
   const vss __builtin_vsx_cmpge_u8hi (vus, vus);
     CMPGE_U8HI vector_nltuv8hi {}
 
-  const vsc __builtin_vsx_cmple_16qi (vsc, vsc);
-    CMPLE_16QI vector_ngtv16qi {}
-
-  const vsll __builtin_vsx_cmple_2di (vsll, vsll);
-    CMPLE_2DI vector_ngtv2di {}
-
-  const vsi __builtin_vsx_cmple_4si (vsi, vsi);
-    CMPLE_4SI vector_ngtv4si {}
-
-  const vss __builtin_vsx_cmple_8hi (vss, vss);
-    CMPLE_8HI vector_ngtv8hi {}
-
-  const vsc __builtin_vsx_cmple_u16qi (vsc, vsc);
-    CMPLE_U16QI vector_ngtuv16qi {}
-
-  const vsll __builtin_vsx_cmple_u2di (vsll, vsll);
-    CMPLE_U2DI vector_ngtuv2di {}
-
-  const vsi __builtin_vsx_cmple_u4si (vsi, vsi);
-    CMPLE_U4SI vector_ngtuv4si {}
-
-  const vss __builtin_vsx_cmple_u8hi (vss, vss);
-    CMPLE_U8HI vector_ngtuv8hi {}
-
   const vd __builtin_vsx_concat_2df (double, double);
     CONCAT_2DF vsx_concat_v2df {}
 
@@ -3117,12 +3093,6 @@
   const vbq __builtin_altivec_cmpge_u1ti (vuq, vuq);
     CMPGE_U1TI vector_nltuv1ti {}
 
-  const vbq __builtin_altivec_cmple_1ti (vsq, vsq);
-    CMPLE_1TI vector_ngtv1ti {}
-
-  const vbq __builtin_altivec_cmple_u1ti (vuq, vuq);
-    CMPLE_U1TI vector_ngtuv1ti {}
-
   const unsigned long long __builtin_altivec_cntmbb (vuc, const int<1>);
     VCNTMBB vec_cntmb_v16qi {}
 
-- 
2.45.0



* Re: [PATCH 2/13 ver 3] rs6000, Remove __builtin_vsx_xvcvspsxws built-in
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
  2024-05-29 15:52 ` [PATCH 1/13 ver 3] rs6000, Remove __builtin_vsx_cmple* builtins Carl Love
@ 2024-05-29 15:55 ` Carl Love
  2024-05-29 15:56 ` [PATCH 3/13 ver 3] rs6000, fix error in unsigned vector float to unsigned int built-in definition Carl Love
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Carl Love @ 2024-05-29 15:55 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

I responded to comments about the patch from the previous patch series.  No functional changes were made to this patch.

                            Carl 
-------------------------------------------------- 

rs6000, Remove __builtin_vsx_xvcvspsxws built-in.

The built-in __builtin_vsx_xvcvspsxws is a duplicate of the vec_signed
built-in that is documented in the PVIPR.  The __builtin_vsx_xvcvspsxws
built-in is not documented and there are no test cases for it.

This patch removes the redundant built-in.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxws):
	Remove built-in definition.
---
 gcc/config/rs6000/rs6000-builtins.def | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 7c36976a089..c6d2ea1bc39 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1709,9 +1709,6 @@
   const vsll __builtin_vsx_xvcvspsxds (vf);
     XVCVSPSXDS vsx_xvcvspsxds {}
 
-  const vsi __builtin_vsx_xvcvspsxws (vf);
-    XVCVSPSXWS vsx_fix_truncv4sfv4si2 {}
-
   const vsll __builtin_vsx_xvcvspuxds (vf);
     XVCVSPUXDS vsx_xvcvspuxds {}
 
-- 
2.45.0



* Re: [PATCH 3/13 ver 3] rs6000, fix error in unsigned vector float to unsigned int built-in definition
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
  2024-05-29 15:52 ` [PATCH 1/13 ver 3] rs6000, Remove __builtin_vsx_cmple* builtins Carl Love
  2024-05-29 15:55 ` [PATCH 2/13 ver 3] rs6000, Remove __builtin_vsx_xvcvspsxws built-in Carl Love
@ 2024-05-29 15:56 ` Carl Love
  2024-06-04  5:58   ` Kewen.Lin
  2024-05-29 15:58 ` [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins Carl Love
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 15:56 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This patch was updated per the feedback on version 2 of the series.

                             Carl 
-------------------------------------------------------------------

rs6000, fix error in unsigned vector float to unsigned int built-in definitions

The built-in __builtin_vsx_vunsigned_v2df is supposed to take a vector of
doubles and return a vector of unsigned long long ints.  Similarly,
__builtin_vsx_vunsigned_v4sf takes a vector of floats and is supposed to
return a vector of unsigned ints.  The definitions use the signed
versions of the instructions rather than the unsigned versions, and the
result types are signed where they should be unsigned.  The built-ins are
used by the overloaded vec_unsigned built-in, which has an unsigned result.

Similarly the built-ins __builtin_vsx_vunsignede_v2df and
__builtin_vsx_vunsignedo_v2df are supposed to return an unsigned result.
If the floating point argument is negative, the unsigned result is zero.
The built-ins are used in the overloaded built-in vec_unsignede and
vec_unsignedo respectively.

Add test cases with negative floating point arguments for each of the
above built-ins.
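
The expected results in the new tests can be pictured with a scalar model of
one lane of the unsigned conversion.  This is only a sketch: the function
name model_float_to_u32 is hypothetical, and the handling of NaN and of
values too large for 32 bits is an assumption about the instruction's
saturating behavior rather than something stated in this patch.  Only the
"negative input gives zero" case comes from the description above.  Note
that a plain C cast of a negative float to an unsigned type is undefined
behavior, which is why the model checks the range explicitly.

```c
#include <stdint.h>

/* Scalar model of converting one float lane to an unsigned 32-bit lane.  */
static uint32_t
model_float_to_u32 (float x)
{
  if (x != x)               /* NaN: assumed to give 0.  */
    return 0;
  if (x <= 0.0f)            /* Negative (or zero) input gives 0.  */
    return 0;
  if (x >= 4294967296.0f)   /* Too large: assumed to saturate.  */
    return UINT32_MAX;
  return (uint32_t) x;      /* In range: truncate toward zero.  */
}
```

Under this model, the inputs -14.930 and -834.49 used in the new test both
map to 0, while an in-range value such as 124.930 truncates to 124.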

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_vunsigned_v2df,
	__builtin_vsx_vunsigned_v4sf, __builtin_vsx_vunsignede_v2df,
	__builtin_vsx_vunsignedo_v2df): Change the result type to unsigned.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/builtins-3-runnable.c: Add tests for
	vec_unsignede and vec_unsignedo with negative arguments.
---
 gcc/config/rs6000/rs6000-builtins.def         | 12 ++++----
 .../gcc.target/powerpc/builtins-3-runnable.c  | 30 +++++++++++++++++--
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index c6d2ea1bc39..bf9a0ae22fc 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1580,16 +1580,16 @@
   const vsi __builtin_vsx_vsignedo_v2df (vd);
     VEC_VSIGNEDO_V2DF vsignedo_v2df {}
 
-  const vsll __builtin_vsx_vunsigned_v2df (vd);
-    VEC_VUNSIGNED_V2DF vsx_xvcvdpsxds {}
+  const vull __builtin_vsx_vunsigned_v2df (vd);
+    VEC_VUNSIGNED_V2DF vsx_xvcvdpuxds {}
 
-  const vsi __builtin_vsx_vunsigned_v4sf (vf);
-    VEC_VUNSIGNED_V4SF vsx_xvcvspsxws {}
+  const vui __builtin_vsx_vunsigned_v4sf (vf);
+    VEC_VUNSIGNED_V4SF vsx_xvcvspuxws {}
 
-  const vsi __builtin_vsx_vunsignede_v2df (vd);
+  const vui __builtin_vsx_vunsignede_v2df (vd);
     VEC_VUNSIGNEDE_V2DF vunsignede_v2df {}
 
-  const vsi __builtin_vsx_vunsignedo_v2df (vd);
+  const vui __builtin_vsx_vunsignedo_v2df (vd);
     VEC_VUNSIGNEDO_V2DF vunsignedo_v2df {}
 
   const vf __builtin_vsx_xscvdpsp (double);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
index 0231a1fd086..5dcdfbee791 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
@@ -313,6 +313,14 @@ int main()
 	test_unsigned_int_result (ALL, vec_uns_int_result,
 				  vec_uns_int_expected);
 
+	/* Convert single precision float to  unsigned int.  Negative
+	   arguments.  */
+	vec_flt0 = (vector float){-14.930, -834.49, -3.3, -5.4};
+	vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
+	vec_uns_int_result = vec_unsigned (vec_flt0);
+	test_unsigned_int_result (ALL, vec_uns_int_result,
+				  vec_uns_int_expected);
+
 	/* Convert double precision float to long long unsigned int */
 	vec_dble0 = (vector double){124.930, 8134.49};
 	vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
@@ -320,10 +328,18 @@ int main()
 	test_ll_unsigned_int_result (vec_ll_uns_int_result,
 				     vec_ll_uns_int_expected);
 
+	/* Convert double precision float to long long unsigned int. Negative
+	   arguments.  */
+	vec_dble0 = (vector double){-24.93, -134.9};
+	vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
+	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
+	test_ll_unsigned_int_result (vec_ll_uns_int_result,
+				     vec_ll_uns_int_expected);
+
 	/* Convert double precision vector float to vector unsigned int,
-	   even words */
-	vec_dble0 = (vector double){3124.930, 8234.49};
-	vec_uns_int_expected = (vector unsigned int){3124, 0, 8234, 0};
+	   even words.  Negative arguments */
+	vec_dble0 = (vector double){-124.930, -234.49};
+	vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
 	vec_uns_int_result = vec_unsignede (vec_dble0);
 	test_unsigned_int_result (EVEN, vec_uns_int_result,
 				  vec_uns_int_expected);
@@ -335,5 +351,13 @@ int main()
 	vec_uns_int_result = vec_unsignedo (vec_dble0);
 	test_unsigned_int_result (ODD, vec_uns_int_result,
 				  vec_uns_int_expected);
+
+	/* Convert double precision vector float to vector unsigned int,
+	   odd words.  Negative arguments.  */
+	vec_dble0 = (vector double){-924.930, -1234.49};
+	vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
+	vec_uns_int_result = vec_unsignedo (vec_dble0);
+	test_unsigned_int_result (ODD, vec_uns_int_result,
+				  vec_uns_int_expected);
 }
 
-- 
2.45.0



* Re: [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (2 preceding siblings ...)
  2024-05-29 15:56 ` [PATCH 3/13 ver 3] rs6000, fix error in unsigned vector float to unsigned int built-in definition Carl Love
@ 2024-05-29 15:58 ` Carl Love
  2024-06-04  7:19   ` Kewen.Lin
  2024-05-29 16:00 ` [PATCH 5/13 ver 3] rs6000, Remove redundant float/double type conversions Carl Love
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 15:58 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

Updated the patch per the feedback comments from the previous version.

                                 Carl 
-------------------------------------------------------

rs6000, extend the current vec_{un,}signed{e,o} built-ins

The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
convert a vector of floats to signed/unsigned long long ints.  Extend the
existing vec_{un,}signed{e,o} built-ins to handle the argument
vector of floats to return the even/odd signed/unsigned integers.

The define_expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf and
vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
built-ins.

The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
now for internal use only. They are not documented and they do not
have testcases.

The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
vec_signed{e,o}, remove.

The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
vec_unsigned{e,o}, remove.

The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
vec_unsigned, remove.

The built-in __builtin_vsx_xvcvspuxws is redundant as it is covered by
vec_unsigned, remove.

Add testcases and update documentation.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds_low,
	__builtin_vsx_xvcvspuxds_low): New built-in definitions.
	(__builtin_vsx_xvcvspuxds): Fix return type.
	(XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
	VEC_VUNSIGNEDE_V4SF respectively.
	(vsx_xvcvspsxds, vsx_xvcvspuxds): Renamed vsignede_v4sf,
	vunsignede_v4sf respectively.
	(__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws,
	__builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws): Removed.
	* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
	vec_unsignede, vec_unsignedo): Add new overloaded specifications.
	* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
	vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
	* doc/extend.texi (vec_signedo, vec_signede): Add documentation.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/builtins-3-runnable.c: New tests for the added
	overloaded built-ins.
---
 gcc/config/rs6000/rs6000-builtins.def         | 25 ++----
 gcc/config/rs6000/rs6000-overload.def         |  8 ++
 gcc/config/rs6000/vsx.md                      | 88 +++++++++++++++++++
 gcc/doc/extend.texi                           | 10 +++
 .../gcc.target/powerpc/builtins-3-runnable.c  | 51 +++++++++--
 5 files changed, 157 insertions(+), 25 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index bf9a0ae22fc..cea2649b86c 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1688,32 +1688,23 @@
   const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
     XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
 
-  const vsi __builtin_vsx_xvcvdpsxws (vd);
-    XVCVDPSXWS vsx_xvcvdpsxws {}
-
-  const vsll __builtin_vsx_xvcvdpuxds (vd);
-    XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
-
   const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
     XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
 
-  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
-    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
-
-  const vsi __builtin_vsx_xvcvdpuxws (vd);
-    XVCVDPUXWS vsx_xvcvdpuxws {}
-
   const vd __builtin_vsx_xvcvspdp (vf);
     XVCVSPDP vsx_xvcvspdp {}
 
   const vsll __builtin_vsx_xvcvspsxds (vf);
-    XVCVSPSXDS vsx_xvcvspsxds {}
+    VEC_VSIGNEDE_V4SF vsignede_v4sf {}
+
+  const vsll __builtin_vsx_xvcvspsxds_low (vf);
+    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
 
-  const vsll __builtin_vsx_xvcvspuxds (vf);
-    XVCVSPUXDS vsx_xvcvspuxds {}
+  const vull __builtin_vsx_xvcvspuxds (vf);
+    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
 
-  const vsi __builtin_vsx_xvcvspuxws (vf);
-    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
+  const vull __builtin_vsx_xvcvspuxds_low (vf);
+    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
 
   const vd __builtin_vsx_xvcvsxddp (vsll);
     XVCVSXDDP vsx_floatv2div2df2 {}
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 84bd9ae6554..4d857bb1af3 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3307,10 +3307,14 @@
 [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
   vsi __builtin_vec_vsignede (vd);
     VEC_VSIGNEDE_V2DF
+  vsll __builtin_vec_vsignede (vf);
+    VEC_VSIGNEDE_V4SF
 
 [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
   vsi __builtin_vec_vsignedo (vd);
     VEC_VSIGNEDO_V2DF
+  vsll __builtin_vec_vsignedo (vf);
+    VEC_VSIGNEDO_V4SF
 
 [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
   vsi __builtin_vec_signexti (vsc);
@@ -4433,10 +4437,14 @@
 [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
   vui __builtin_vec_vunsignede (vd);
     VEC_VUNSIGNEDE_V2DF
+  vull __builtin_vec_vunsignede (vf);
+    VEC_VUNSIGNEDE_V4SF
 
 [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
   vui __builtin_vec_vunsignedo (vd);
     VEC_VUNSIGNEDO_V2DF
+  vull __builtin_vec_vunsignedo (vf);
+    VEC_VUNSIGNEDO_V4SF
 
 [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
   vui __builtin_vec_extract_exp (vf);
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f135fa079bd..a8f3d459232 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2704,6 +2704,94 @@ (define_expand "vsx_xvcvsp<su>xds"
   DONE;
 })
 
+;; Convert low vector elements of 32-bit floating point numbers to vector of
+;; 64-bit signed
+(define_expand "vsignede_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+       /* Shift left one word to put even word in correct location */
+       rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+       rtx rtx_val = GEN_INT (4);
+       emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+       emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
+    }
+  else
+    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
+
+  DONE;
+})
+
+;; Convert high vector elements of 32-bit floating point numbers to vector of
+;; 64-bit signed
+(define_expand "vsignedo_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));
+  else
+    {
+      /* Shift left one word to put even word in correct location */
+      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+      rtx rtx_val = GEN_INT (4);
+      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
+    }
+
+  DONE;
+})
+
+;; Convert low vector elements of 32-bit floating point numbers to vector of
+;; 64-bit unsigned integers.
+(define_expand "vunsignede_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      /* Shift left one word to put even word in correct location */
+      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+      rtx rtx_val = GEN_INT (4);
+      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
+    }
+  else
+    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
+
+  DONE;
+})
+
+;; Convert high vector elements of 32-bit floating point numbers to vector of
+;; 64-bit unsigned integers.
+(define_expand "vunsignedo_v4sf"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
+  else
+    {
+      /* Shift left one word to put even word in correct location */
+      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
+      rtx rtx_val = GEN_INT (4);
+      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+					  rtx_val));
+      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
+    }
+
+  DONE;
+})
+
 ;; Generate float2 double
 ;; convert two double to float
 (define_expand "float2_v2df"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 267fccd1512..b88e61641a2 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -22577,6 +22577,16 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
 @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
 @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
 
+@smallexample
+vector signed long long vec_signedo (vector float);
+vector signed long long vec_signede (vector float);
+vector unsigned long long vec_unsignedo (vector float);
+vector unsigned long long vec_unsignede (vector float);
+@end smallexample
+
+The overloaded built-ins @code{vec_signedo} and @code{vec_signede} are
+additional extensions to the built-ins as documented in the PVIPR.
+
 @node PowerPC AltiVec Built-in Functions Available on ISA 2.07
 @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 2.07
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
index 5dcdfbee791..557befc9a4a 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
@@ -3,7 +3,7 @@
 /* { dg-options "-maltivec -mvsx" } */
 
 #include <altivec.h> // vector
-
+#define DEBUG 1
 #ifdef DEBUG
 #include <stdio.h>
 #endif
@@ -81,14 +81,15 @@ void test_unsigned_int_result(int check, vector unsigned int vec_result,
 }
 
 void test_ll_int_result(vector long long int vec_result,
-			vector long long int vec_expected)
+			vector long long int vec_expected,
+			char *string)
 {
 	int i;
 
 	for (i = 0; i < 2; i++)
 		if (vec_result[i] != vec_expected[i]) {
 #ifdef DEBUG
-			printf("Test_ll_int_result: ");
+			printf("Test_ll_int_result %s: ", string);
 			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
 			       i, vec_result[i], i, vec_expected[i]);
 #else
@@ -98,14 +99,15 @@ void test_ll_int_result(vector long long int vec_result,
 }
 
 void test_ll_unsigned_int_result(vector long long unsigned int vec_result,
-				 vector long long unsigned int vec_expected)
+				 vector long long unsigned int vec_expected,
+				 char *string)
 {
 	int i;
 
 	for (i = 0; i < 2; i++)
 		if (vec_result[i] != vec_expected[i]) {
 #ifdef DEBUG
-			printf("Test_ll_unsigned_int_result: ");
+			printf("Test_ll_unsigned_int_result %s: ", string);
 			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
 			       i, vec_result[i], i, vec_expected[i]);
 #else
@@ -292,7 +294,8 @@ int main()
 	vec_dble0 = (vector double){-124.930, 81234.49};
 	vec_ll_int_expected = (vector long long signed int){-124, 81234};
 	vec_ll_int_result = vec_signed (vec_dble0);
-	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected);
+	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
+			    "vec_signed");
 
 	/* Convert double precision vector float to vector int, even words */
 	vec_dble0 = (vector double){-124.930, 81234.49};
@@ -321,12 +324,44 @@ int main()
 	test_unsigned_int_result (ALL, vec_uns_int_result,
 				  vec_uns_int_expected);
 
+	/* Convert single precision vector float, even args, to vector
+	   signed long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_int_expected = (vector signed long long int){834, -5};
+	vec_ll_int_result = vec_signede (vec_flt0);
+	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
+			    "vec_signede");
+
+	/* Convert single precision vector float, odd args, to vector
+	   signed long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_int_expected = (vector signed long long int){14, -3};
+	vec_ll_int_result = vec_signedo (vec_flt0);
+	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
+			    "vec_signedo");
+
+	/* Convert single precision vector float, even args, to vector
+	   unsigned long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_uns_int_expected = (vector unsigned long long int){834, 0};
+	vec_ll_uns_int_result = vec_unsignede (vec_flt0);
+	test_ll_unsigned_int_result (vec_ll_uns_int_result,
+				     vec_ll_uns_int_expected, "vec_unsignede");
+
+	/* Convert single precision vector float, odd args, to vector
+	   unsigned long long int.  */
+	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
+	vec_ll_uns_int_expected = (vector unsigned long long int){14, 0};
+	vec_ll_uns_int_result = vec_unsignedo (vec_flt0);
+	test_ll_unsigned_int_result (vec_ll_uns_int_result,
+				     vec_ll_uns_int_expected, "vec_unsignedo");
+
 	/* Convert double precision float to long long unsigned int */
 	vec_dble0 = (vector double){124.930, 8134.49};
 	vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
 	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
 	test_ll_unsigned_int_result (vec_ll_uns_int_result,
-				     vec_ll_uns_int_expected);
+				     vec_ll_uns_int_expected, "vec_unsigned");
 
 	/* Convert double precision float to long long unsigned int. Negative
 	   arguments.  */
@@ -334,7 +369,7 @@ int main()
 	vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
 	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
 	test_ll_unsigned_int_result (vec_ll_uns_int_result,
-				     vec_ll_uns_int_expected);
+				     vec_ll_uns_int_expected, "vec_unsigned");
 
 	/* Convert double precision vector float to vector unsigned int,
 	   even words.  Negative arguments */
-- 
2.45.0



* Re: [PATCH 5/13 ver 3] rs6000, Remove redundant float/double type conversions
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (3 preceding siblings ...)
  2024-05-29 15:58 ` [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins Carl Love
@ 2024-05-29 16:00 ` Carl Love
  2024-06-04  6:20   ` Kewen.Lin
  2024-05-29 16:01 ` [PATCH 6/13 ver 3] rs6000, remove duplicated built-ins of vecmergl and, vec_mergeh Carl Love
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:00 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This is a new patch to remove the built-ins that were inadvertently missed in the previous series.

                              Carl 
--------------------------------------------------------------

rs6000, Remove redundant float/double type conversions

The following built-ins are redundant as they are covered by another
overloaded built-in.

  __builtin_vsx_xvcvspdp covered by vec_double{e,o}
  __builtin_vsx_xvcvdpsp covered by vec_float{e,o}
  __builtin_vsx_xvcvsxwdp covered by vec_double{e,o}
  __builtin_vsx_xvcvuxddp_uns covered by vec_double

Remove the redundant built-ins. They are not documented nor do they have
test cases.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspdp,
	__builtin_vsx_xvcvdpsp, __builtin_vsx_xvcvsxwdp,
	__builtin_vsx_xvcvuxddp_uns): Remove.
---
 gcc/config/rs6000/rs6000-builtins.def | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index cea2649b86c..6049f3a4599 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1679,9 +1679,6 @@
   const signed int __builtin_vsx_xvcmpgtsp_p (signed int, vf, vf);
     XVCMPGTSP_P vector_gt_v4sf_p {pred}
 
-  const vf __builtin_vsx_xvcvdpsp (vd);
-    XVCVDPSP vsx_xvcvdpsp {}
-
   const vsll __builtin_vsx_xvcvdpsxds (vd);
     XVCVDPSXDS vsx_fix_truncv2dfv2di2 {}
 
@@ -1691,9 +1688,6 @@
   const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
     XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
 
-  const vd __builtin_vsx_xvcvspdp (vf);
-    XVCVSPDP vsx_xvcvspdp {}
-
   const vsll __builtin_vsx_xvcvspsxds (vf);
     VEC_VSIGNEDE_V4SF vsignede_v4sf {}
 
@@ -1715,9 +1709,6 @@
   const vf __builtin_vsx_xvcvsxdsp (vsll);
     XVCVSXDSP vsx_xvcvsxdsp {}
 
-  const vd __builtin_vsx_xvcvsxwdp (vsi);
-    XVCVSXWDP vsx_xvcvsxwdp {}
-
   const vf __builtin_vsx_xvcvsxwsp (vsi);
     XVCVSXWSP vsx_floatv4siv4sf2 {}
 
@@ -1727,9 +1718,6 @@
   const vd __builtin_vsx_xvcvuxddp_scale (vsll, const int<5>);
     XVCVUXDDP_SCALE vsx_xvcvuxddp_scale {}
 
-  const vd __builtin_vsx_xvcvuxddp_uns (vull);
-    XVCVUXDDP_UNS vsx_floatunsv2div2df2 {}
-
   const vf __builtin_vsx_xvcvuxdsp (vull);
     XVCVUXDSP vsx_xvcvuxdsp {}
 
-- 
2.45.0



* Re: [PATCH 6/13 ver 3] rs6000, remove duplicated built-ins of vecmergl and, vec_mergeh
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (4 preceding siblings ...)
  2024-05-29 16:00 ` [PATCH 5/13 ver 3] rs6000, Remove redundant float/double type conversions Carl Love
@ 2024-05-29 16:01 ` Carl Love
  2024-05-29 16:03 ` [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments Carl Love
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:01 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This was patch 5 in the previous series.  It was previously approved.  No changes in this version.  Being posted for completeness.

                             Carl 
----------------------------------------------------

rs6000, remove duplicated built-ins of vec_mergel and
 vec_mergeh

The following undocumented built-ins are the same as existing documented
overloaded built-ins.

  const vf __builtin_vsx_xxmrghw (vf, vf);
same as  vf __builtin_vec_mergeh (vf, vf);      (overloaded vec_mergeh)

  const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
same as vsi __builtin_vec_mergeh (vsi, vsi);   (overloaded vec_mergeh)

  const vf __builtin_vsx_xxmrglw (vf, vf);
same as vf __builtin_vec_mergel (vf, vf);      (overloaded vec_mergel)

  const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
same as vsi __builtin_vec_mergel (vsi, vsi);   (overloaded vec_mergel)

This patch removes the duplicate built-in definitions so only the
documented built-ins will be available for use.  The case statements in
rs6000_gimple_fold_builtin are removed as they are no longer needed.  The
patch removes the now unused define_expands for vsx_xxmrghw_<mode> and
vsx_xxmrglw_<mode>.
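
As an illustration of why the removed expanders and the overloaded
built-ins are equivalent, here is a small scalar sketch of the word
interleave.  The index lists (0 4 1 5) and (2 6 3 7) are taken from the
parallel selectors in the define_expands removed below; the v4_model type
and the function names are hypothetical, and the model ignores the
endian-specific instruction selection done in the real expanders.

```c
#include <stddef.h>

/* Hypothetical scalar stand-in for a four-word vector.  */
typedef struct { int w[4]; } v4_model;

/* Pick four words out of the eight-word concatenation {a, b}.  */
static v4_model
select_from_concat (v4_model a, v4_model b, const int idx[4])
{
  v4_model r;
  for (size_t i = 0; i < 4; i++)
    r.w[i] = idx[i] < 4 ? a.w[idx[i]] : b.w[idx[i] - 4];
  return r;
}

/* Model of vec_mergeh on words: selector (0 4 1 5).  */
static v4_model
model_mergeh (v4_model a, v4_model b)
{
  static const int idx[4] = { 0, 4, 1, 5 };
  return select_from_concat (a, b, idx);
}

/* Model of vec_mergel on words: selector (2 6 3 7).  */
static v4_model
model_mergel (v4_model a, v4_model b)
{
  static const int idx[4] = { 2, 6, 3, 7 };
  return select_from_concat (a, b, idx);
}
```

With a = {10, 11, 12, 13} and b = {20, 21, 22, 23}, the model gives
{10, 20, 11, 21} for the high merge and {12, 22, 13, 23} for the low
merge.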

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxmrghw,
	__builtin_vsx_xxmrghw_4si, __builtin_vsx_xxmrglw,
	__builtin_vsx_xxmrglw_4si): Remove built-in definitions.
	* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin):
	Remove case entries RS6000_BIF_XXMRGLW_4SI,
	RS6000_BIF_XXMRGLW_4SF, RS6000_BIF_XXMRGHW_4SI,
	RS6000_BIF_XXMRGHW_4SF.
	* config/rs6000/vsx.md (vsx_xxmrghw_<mode>, vsx_xxmrglw_<mode>):
	Remove unused define_expands.
---
 gcc/config/rs6000/rs6000-builtin.cc   |  4 ---
 gcc/config/rs6000/rs6000-builtins.def | 12 --------
 gcc/config/rs6000/vsx.md              | 41 ---------------------------
 3 files changed, 57 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index ac9f16fe51a..f83d65b06ef 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -2097,20 +2097,16 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
     /* vec_mergel (integrals).  */
     case RS6000_BIF_VMRGLH:
     case RS6000_BIF_VMRGLW:
-    case RS6000_BIF_XXMRGLW_4SI:
     case RS6000_BIF_VMRGLB:
     case RS6000_BIF_VEC_MERGEL_V2DI:
-    case RS6000_BIF_XXMRGLW_4SF:
     case RS6000_BIF_VEC_MERGEL_V2DF:
       fold_mergehl_helper (gsi, stmt, 1);
       return true;
     /* vec_mergeh (integrals).  */
     case RS6000_BIF_VMRGHH:
     case RS6000_BIF_VMRGHW:
-    case RS6000_BIF_XXMRGHW_4SI:
     case RS6000_BIF_VMRGHB:
     case RS6000_BIF_VEC_MERGEH_V2DI:
-    case RS6000_BIF_XXMRGHW_4SF:
     case RS6000_BIF_VEC_MERGEH_V2DF:
       fold_mergehl_helper (gsi, stmt, 0);
       return true;
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 6049f3a4599..13e36df008d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1877,18 +1877,6 @@
   const signed int __builtin_vsx_xvtsqrtsp_fg (vf);
     XVTSQRTSP_FG vsx_tsqrtv4sf2_fg {}
 
-  const vf __builtin_vsx_xxmrghw (vf, vf);
-    XXMRGHW_4SF vsx_xxmrghw_v4sf {}
-
-  const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
-    XXMRGHW_4SI vsx_xxmrghw_v4si {}
-
-  const vf __builtin_vsx_xxmrglw (vf, vf);
-    XXMRGLW_4SF vsx_xxmrglw_v4sf {}
-
-  const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
-    XXMRGLW_4SI vsx_xxmrglw_v4si {}
-
   const vsc __builtin_vsx_xxpermdi_16qi (vsc, vsc, const int<2>);
     XXPERMDI_16QI vsx_xxpermdi_v16qi {}
 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index a8f3d459232..4402b8b01d5 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4875,47 +4875,6 @@ (define_insn "vsx_xxspltd_<mode>"
 }
   [(set_attr "type" "vecperm")])
 
-;; V4SF/V4SI interleave
-(define_expand "vsx_xxmrghw_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
-        (vec_select:VSX_W
-	  (vec_concat:<VS_double>
-	    (match_operand:VSX_W 1 "vsx_register_operand" "wa")
-	    (match_operand:VSX_W 2 "vsx_register_operand" "wa"))
-	  (parallel [(const_int 0) (const_int 4)
-		     (const_int 1) (const_int 5)])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-{
-  rtx (*fun) (rtx, rtx, rtx);
-  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrghw_direct_<mode>
-			 : gen_altivec_vmrglw_direct_<mode>;
-  if (!BYTES_BIG_ENDIAN)
-    std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
-  DONE;
-}
-  [(set_attr "type" "vecperm")])
-
-(define_expand "vsx_xxmrglw_<mode>"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
-	(vec_select:VSX_W
-	  (vec_concat:<VS_double>
-	    (match_operand:VSX_W 1 "vsx_register_operand" "wa")
-	    (match_operand:VSX_W 2 "vsx_register_operand" "wa"))
-	  (parallel [(const_int 2) (const_int 6)
-		     (const_int 3) (const_int 7)])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-{
-  rtx (*fun) (rtx, rtx, rtx);
-  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrglw_direct_<mode>
-			 : gen_altivec_vmrghw_direct_<mode>;
-  if (!BYTES_BIG_ENDIAN)
-    std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
-  DONE;
-}
-  [(set_attr "type" "vecperm")])
-
 ;; Shift left double by word immediate
 (define_insn "vsx_xxsldwi_<mode>"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=wa")
-- 
2.45.0



* Re: [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (5 preceding siblings ...)
  2024-05-29 16:01 ` [PATCH 6/13 ver 3] rs6000, remove duplicated built-ins of vecmergl and, vec_mergeh Carl Love
@ 2024-05-29 16:03 ` Carl Love
  2024-06-04  5:58   ` Kewen.Lin
  2024-05-29 16:05 ` [PATCH 8/13 ver 3] rs6000, remove the vec_xxsel built-ins, they are, duplicates Carl Love
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:03 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This was patch 6 in the previous series.  Updated the documentation file per the comments.  No functional changes to the patch.

                          Carl 
------------------------------------------------------------

rs6000, add overloaded vec_sel with int128 arguments

Extend the vec_sel built-in to take three signed/unsigned int128 arguments
and return a signed/unsigned int128 result.

Extending the vec_sel built-in makes the existing built-ins
__builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
patch removes these built-ins.

The patch adds documentation and test cases for the new overloaded vec_sel
built-ins.
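
For reference, the bitwise select that vec_sel performs can be sketched
in portable C as follows.  This is a scalar model rather than the actual
built-in: struct u128_model and model_vec_sel are hypothetical names, and
the 128-bit value is simply split into two 64-bit halves.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector element.  */
struct u128_model
{
  uint64_t hi;
  uint64_t lo;
};

/* Bitwise select: each result bit comes from B where the mask bit is 1
   and from A where the mask bit is 0.  */
static struct u128_model
model_vec_sel (struct u128_model a, struct u128_model b,
               struct u128_model mask)
{
  struct u128_model r;
  r.hi = (a.hi & ~mask.hi) | (b.hi & mask.hi);
  r.lo = (a.lo & ~mask.lo) | (b.lo & mask.lo);
  return r;
}
```

Feeding the model the inputs used in the new test file (for example
a = 0x123456789ABCDEF0, b = 0xFEDCBA9876543210, mask = 0x3333333333333333)
reproduces the expected value 0x32147658ba9cfed0 listed there.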

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
	__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
	* config/rs6000/rs6000-overload.def (vec_sel): Add new overloaded
	definitions.
	* doc/extend.texi: Add documentation for new vec_sel instances.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/vec-sel-runnable-i128.c: New test file.
---
 gcc/config/rs6000/rs6000-builtins.def         |   6 -
 gcc/config/rs6000/rs6000-overload.def         |   4 +
 gcc/doc/extend.texi                           |  12 ++
 .../powerpc/vec-sel-runnable-i128.c           | 129 ++++++++++++++++++
 4 files changed, 145 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 13e36df008d..ea0da77f13e 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1904,12 +1904,6 @@
   const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
     XXSEL_16QI_UNS vector_select_v16qi_uns {}
 
-  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
-    XXSEL_1TI vector_select_v1ti {}
-
-  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
-    XXSEL_1TI_UNS vector_select_v1ti_uns {}
-
   const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
     XXSEL_2DF vector_select_v2df {}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 4d857bb1af3..a210c5ad10d 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3274,6 +3274,10 @@
     VSEL_2DF  VSEL_2DF_B
   vd __builtin_vec_sel (vd, vd, vull);
     VSEL_2DF  VSEL_2DF_U
+  vsq __builtin_vec_sel (vsq, vsq, vsq);
+    VSEL_1TI  VSEL_1TI_S
+  vuq __builtin_vec_sel (vuq, vuq, vuq);
+    VSEL_1TI_UNS  VSEL_1TI_U
 ; The following variants are deprecated.
   vsll __builtin_vec_sel (vsll, vsll, vsll);
     VSEL_2DI_B  VSEL_2DI_S
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b88e61641a2..0756230b19e 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21372,6 +21372,18 @@ Additional built-in functions are available for the 64-bit PowerPC
 family of processors, for efficient use of 128-bit floating point
 (@code{__float128}) values.
 
+Vector select
+
+@smallexample
+vector signed __int128 vec_sel (vector signed __int128,
+               vector signed __int128, vector signed __int128);
+vector unsigned __int128 vec_sel (vector unsigned __int128,
+               vector unsigned __int128, vector unsigned __int128);
+@end smallexample
+
+The instance is an extension of the existing overloaded built-in @code{vec_sel}
+that is documented in the PVIPR.
+
 @node Basic PowerPC Built-in Functions Available on ISA 2.06
 @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.06
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
new file mode 100644
index 00000000000..d82225cc847
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
@@ -0,0 +1,129 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-save-temps" } */
+/* { dg-final { scan-assembler-times "xxsel" 2 } } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+void print_i128 (unsigned __int128 val)
+{
+  printf(" 0x%016llx%016llx",
+         (unsigned long long)(val >> 64),
+         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
+}
+#endif
+
+extern void abort (void);
+
+union convert_union {
+  vector signed __int128    s128;
+  vector unsigned __int128  u128;
+  char  val[16];
+} convert;
+
+int check_u128_result(vector unsigned __int128 vresult_u128,
+		      vector unsigned __int128 expected_vresult_u128)
+{
+  /* Use a for loop to check each byte manually so the test case will run
+     with ISA 2.06.
+
+     Return 1 if they match, 0 otherwise.  */
+
+  int i;
+
+  union convert_union result;
+  union convert_union expected;
+
+  result.u128 = vresult_u128;
+  expected.u128 = expected_vresult_u128;
+
+  /* Check if each byte of the result and expected match. */
+  for (i = 0; i < 16; i++)
+    {
+      if (result.val[i] != expected.val[i])
+	return 0;
+    }
+  return 1;
+}
+
+int check_s128_result(vector signed __int128 vresult_s128,
+		      vector signed __int128 expected_vresult_s128)
+{
+  /* Convert the arguments to unsigned, then check equality.  */
+  union convert_union result;
+  union convert_union expected;
+
+  result.s128 = vresult_s128;
+  expected.s128 = expected_vresult_s128;
+
+  return check_u128_result (result.u128, expected.u128);
+}
+
+
+int
+main (int argc, char *argv [])
+{
+  int i;
+  
+  vector signed __int128 src_va_s128;
+  vector signed __int128 src_vb_s128;
+  vector signed __int128 src_vc_s128;
+  vector signed __int128 vresult_s128;
+  vector signed __int128 expected_vresult_s128;
+
+  vector unsigned __int128 src_va_u128;
+  vector unsigned __int128 src_vb_u128;
+  vector unsigned __int128 src_vc_u128;
+  vector unsigned __int128 vresult_u128;
+  vector unsigned __int128 expected_vresult_u128;
+
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vc_s128 = (vector signed __int128) {0x3333333333333333};
+  expected_vresult_s128 = (vector signed __int128) {0x32147658ba9cfed0};
+
+  /* Signed arguments.  */
+  vresult_s128 = vec_sel (src_va_s128, src_vb_s128, src_vc_s128);
+
+  if (!check_s128_result (vresult_s128, expected_vresult_s128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_s128, src_vb_s128, src_vc_s128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_s128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_s128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
+  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
+  src_vc_u128 = (vector unsigned __int128) {0x5555555555555555};
+  expected_vresult_u128 = (vector unsigned __int128) {0x0307CFCF47439A9A};
+
+  /* Unsigned arguments.  */
+  vresult_u128 = vec_sel (src_va_u128, src_vb_u128, src_vc_u128);
+
+  if (!check_u128_result (vresult_u128, expected_vresult_u128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_sel (src_va_u128, src_vb_u128, src_vc_u128) result does not match expected output.\n");
+      printf ("  Result:          ");
+      print_i128 ((unsigned __int128) vresult_u128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_u128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+    return 0;
+}
-- 
2.45.0



* Re: [PATCH 8/13 ver 3] rs6000, remove the vec_xxsel built-ins, they are, duplicates
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (6 preceding siblings ...)
  2024-05-29 16:03 ` [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments Carl Love
@ 2024-05-29 16:05 ` Carl Love
  2024-06-04  5:58   ` Kewen.Lin
  2024-05-29 16:06 ` [PATCH 9/13 ver 3] rs6000, remove __builtin_vsx_vperm_* built-ins Carl Love
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:05 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This was patch 7 in the previous series.  Patch was updated to address the feedback comments.

                            Carl 
------------------------------------------------------------------------

rs6000, remove the vec_xxsel built-ins, they are duplicates

The following undocumented built-ins are covered by the existing overloaded
vec_sel built-in definitions.

  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
same as vsc __builtin_vec_sel (vsc, vsc, vuc);  (overloaded vec_sel)

  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
same as vuc __builtin_vec_sel (vuc, vuc, vuc);  (overloaded vec_sel)

  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
same as  vd __builtin_vec_sel (vd, vd, vull);   (overloaded vec_sel)

  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
same as vsll __builtin_vec_sel (vsll, vsll, vsll);  (overloaded vec_sel)

  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
same as vull __builtin_vec_sel (vull, vull, vsll);  (overloaded vec_sel)

  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
same as vf __builtin_vec_sel (vf, vf, vsi)          (overloaded vec_sel)

  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
same as vsi __builtin_vec_sel (vsi, vsi, vbi);      (overloaded vec_sel)

  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
same as vui __builtin_vec_sel (vui, vui, vui);      (overloaded vec_sel)

  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
same as vss __builtin_vec_sel (vss, vss, vbs);      (overloaded vec_sel)

  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
same as vus __builtin_vec_sel (vus, vus, vus);      (overloaded vec_sel)

This patch removes the duplicate built-in definitions so users will only
use the documented vec_sel built-in.  The __builtin_vsx_xxsel_[4si, 8hi,
16qi, 4sf, 2df] tests are changed to call the overloaded vec_sel
built-in.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_16qi,
	__builtin_vsx_xxsel_16qi_uns, __builtin_vsx_xxsel_2df,
	__builtin_vsx_xxsel_2di, __builtin_vsx_xxsel_2di_uns,
	__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_4si,
	__builtin_vsx_xxsel_4si_uns, __builtin_vsx_xxsel_8hi,
	__builtin_vsx_xxsel_8hi_uns): Remove built-in definitions.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_xxsel_4si,
	__builtin_vsx_xxsel_8hi, __builtin_vsx_xxsel_16qi,
	__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_2df,
	__builtin_vsx_xxsel): Change built-in call to overloaded built-in
	call vec_sel.
---
 gcc/config/rs6000/rs6000-builtins.def         | 30 ----------------
 .../gcc.target/powerpc/vsx-builtin-3.c        | 36 ++++++++++---------
 2 files changed, 19 insertions(+), 47 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index ea0da77f13e..a78c52183bc 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1898,36 +1898,6 @@
   const vss __builtin_vsx_xxpermdi_8hi (vss, vss, const int<2>);
     XXPERMDI_8HI vsx_xxpermdi_v8hi {}
 
-  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
-    XXSEL_16QI vector_select_v16qi {}
-
-  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
-    XXSEL_16QI_UNS vector_select_v16qi_uns {}
-
-  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
-    XXSEL_2DF vector_select_v2df {}
-
-  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
-    XXSEL_2DI vector_select_v2di {}
-
-  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
-    XXSEL_2DI_UNS vector_select_v2di_uns {}
-
-  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
-    XXSEL_4SF vector_select_v4sf {}
-
-  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
-    XXSEL_4SI vector_select_v4si {}
-
-  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
-    XXSEL_4SI_UNS vector_select_v4si_uns {}
-
-  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
-    XXSEL_8HI vector_select_v8hi {}
-
-  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
-    XXSEL_8HI_UNS vector_select_v8hi_uns {}
-
   const vsc __builtin_vsx_xxsldwi_16qi (vsc, vsc, const int<2>);
     XXSLDWI_16QI vsx_xxsldwi_v16qi {}
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
index ff875c55304..e20d3f03c86 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
@@ -37,6 +37,8 @@
 /* { dg-final { scan-assembler "xvcvsxdsp" } } */
 /* { dg-final { scan-assembler "xvcvuxdsp" } } */
 
+#include <altivec.h>
+
 extern __vector int si[][4];
 extern __vector short ss[][4];
 extern __vector signed char sc[][4];
@@ -61,23 +63,23 @@ int do_sel(void)
 {
   int i = 0;
 
-  si[i][0] = __builtin_vsx_xxsel_4si (si[i][1], si[i][2], si[i][3]); i++;
-  ss[i][0] = __builtin_vsx_xxsel_8hi (ss[i][1], ss[i][2], ss[i][3]); i++;
-  sc[i][0] = __builtin_vsx_xxsel_16qi (sc[i][1], sc[i][2], sc[i][3]); i++;
-  f[i][0] = __builtin_vsx_xxsel_4sf (f[i][1], f[i][2], f[i][3]); i++;
-  d[i][0] = __builtin_vsx_xxsel_2df (d[i][1], d[i][2], d[i][3]); i++;
-
-  si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], bi[i][3]); i++;
-  ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], bs[i][3]); i++;
-  sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], bc[i][3]); i++;
-  f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], bi[i][3]); i++;
-  d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], bl[i][3]); i++;
-
-  si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], ui[i][3]); i++;
-  ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], us[i][3]); i++;
-  sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], uc[i][3]); i++;
-  f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], ui[i][3]); i++;
-  d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], ul[i][3]); i++;
+  si[i][0] = vec_sel (si[i][1], si[i][2], ui[i][3]); i++;
+  ss[i][0] = vec_sel (ss[i][1], ss[i][2], us[i][3]); i++;
+  sc[i][0] = vec_sel (sc[i][1], sc[i][2], uc[i][3]); i++;
+  f[i][0] = vec_sel (f[i][1], f[i][2], f[i][3]); i++;
+  d[i][0] = vec_sel (d[i][1], d[i][2], d[i][3]); i++;
+
+  si[i][0] = vec_sel (si[i][1], si[i][2], bi[i][3]); i++;
+  ss[i][0] = vec_sel (ss[i][1], ss[i][2], bs[i][3]); i++;
+  sc[i][0] = vec_sel (sc[i][1], sc[i][2], bc[i][3]); i++;
+  f[i][0] = vec_sel (f[i][1], f[i][2], bi[i][3]); i++;
+  d[i][0] = vec_sel (d[i][1], d[i][2], bl[i][3]); i++;
+
+  si[i][0] = vec_sel (si[i][1], si[i][2], ui[i][3]); i++;
+  ss[i][0] = vec_sel (ss[i][1], ss[i][2], us[i][3]); i++;
+  sc[i][0] = vec_sel (sc[i][1], sc[i][2], uc[i][3]); i++;
+  f[i][0] = vec_sel (f[i][1], f[i][2], ui[i][3]); i++;
+  d[i][0] = vec_sel (d[i][1], d[i][2], ul[i][3]); i++;
 
   return i;
 }
-- 
2.45.0



* Re: [PATCH 9/13 ver 3] rs6000, remove __builtin_vsx_vperm_* built-ins
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (7 preceding siblings ...)
  2024-05-29 16:05 ` [PATCH 8/13 ver 3] rs6000, remove the vec_xxsel built-ins, they are, duplicates Carl Love
@ 2024-05-29 16:06 ` Carl Love
  2024-06-04  5:58   ` Kewen.Lin
  2024-05-29 16:08 ` [PATCH 10/13 ver 3] rs6000, remove __builtin_vsx_xvnegdp and, __builtin_vsx_xvnegsp built-ins Carl Love
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:06 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This was patch 8 in the previous series.  Updated patch per the feedback comments.

                            Carl 
--------------------------------------------------------------------

rs6000, remove __builtin_vsx_vperm_* built-ins

The undocumented built-ins:
  __builtin_vsx_vperm_16qi_uns,
  __builtin_vsx_vperm_1ti,
  __builtin_vsx_vperm_1ti_uns,
  __builtin_vsx_vperm_2df,
  __builtin_vsx_vperm_2di,
  __builtin_vsx_vperm_2di_uns,
  __builtin_vsx_vperm_4sf,
  __builtin_vsx_vperm_4si,
  __builtin_vsx_vperm_4si_uns

are duplicates of the __builtin_altivec_* builtins that are used by
the overloaded vec_perm built-in that is documented in the PVIPR.
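
As a reminder of what the overloaded vec_perm computes, the sketch below
models the byte permute in portable C.  It is an assumption-labelled
scalar model, not the GCC expander: v16_model and model_perm are
hypothetical names, and each control byte is assumed to index the 32-byte
concatenation of the two inputs using its low five bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a 16-byte vector.  */
typedef struct { uint8_t b[16]; } v16_model;

/* For each result byte, use the low five bits of the control byte as an
   index into the 32-byte concatenation {a, b}.  */
static v16_model
model_perm (v16_model a, v16_model b, v16_model ctl)
{
  v16_model r;
  for (size_t i = 0; i < 16; i++)
    {
      unsigned int idx = ctl.b[i] & 0x1F;
      r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
  return r;
}
```

The element type of the data operands does not matter to the model, which
is consistent with the per-type __builtin_vsx_vperm_* variants all
mapping onto the same altivec_vperm patterns.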

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_vperm_16qi_uns,
	__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
	__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
	__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
	__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns): Remove
	built-in definitions and comments.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_vperm_16qi_uns,
	__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
	__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
	__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
	__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns,
	__builtin_vsx_vperm): Change call to built-in to the overloaded
	built-in vec_perm.
---
 gcc/config/rs6000/rs6000-builtins.def         | 33 -------------------
 .../gcc.target/powerpc/vsx-builtin-3.c        | 22 ++++++-------
 2 files changed, 11 insertions(+), 44 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index a78c52183bc..f02a8c4de45 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1529,39 +1529,6 @@
   const vf __builtin_vsx_uns_floato_v2di (vsll);
     UNS_FLOATO_V2DI unsfloatov2di {}
 
-; These are duplicates of __builtin_altivec_* counterparts, and are being
-; kept for backwards compatibility.  The reason for their existence is
-; unclear.  TODO: Consider deprecation/removal at some point.
-  const vsc __builtin_vsx_vperm_16qi (vsc, vsc, vuc);
-    VPERM_16QI_X altivec_vperm_v16qi {}
-
-  const vuc __builtin_vsx_vperm_16qi_uns (vuc, vuc, vuc);
-    VPERM_16QI_UNS_X altivec_vperm_v16qi_uns {}
-
-  const vsq __builtin_vsx_vperm_1ti (vsq, vsq, vsc);
-    VPERM_1TI_X altivec_vperm_v1ti {}
-
-  const vsq __builtin_vsx_vperm_1ti_uns (vsq, vsq, vsc);
-    VPERM_1TI_UNS_X altivec_vperm_v1ti_uns {}
-
-  const vd __builtin_vsx_vperm_2df (vd, vd, vuc);
-    VPERM_2DF_X altivec_vperm_v2df {}
-
-  const vsll __builtin_vsx_vperm_2di (vsll, vsll, vuc);
-    VPERM_2DI_X altivec_vperm_v2di {}
-
-  const vull __builtin_vsx_vperm_2di_uns (vull, vull, vuc);
-    VPERM_2DI_UNS_X altivec_vperm_v2di_uns {}
-
-  const vf __builtin_vsx_vperm_4sf (vf, vf, vuc);
-    VPERM_4SF_X altivec_vperm_v4sf {}
-
-  const vsi __builtin_vsx_vperm_4si (vsi, vsi, vuc);
-    VPERM_4SI_X altivec_vperm_v4si {}
-
-  const vui __builtin_vsx_vperm_4si_uns (vui, vui, vuc);
-    VPERM_4SI_UNS_X altivec_vperm_v4si_uns {}
-
   const vss __builtin_vsx_vperm_8hi (vss, vss, vuc);
     VPERM_8HI_X altivec_vperm_v8hi {}
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
index e20d3f03c86..f06d871b6b1 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
@@ -88,17 +88,17 @@ int do_perm(void)
 {
   int i = 0;
 
-  si[i][0] = __builtin_vsx_vperm_4si (si[i][1], si[i][2], uc[i][3]); i++;
-  ss[i][0] = __builtin_vsx_vperm_8hi (ss[i][1], ss[i][2], uc[i][3]); i++;
-  sc[i][0] = __builtin_vsx_vperm_16qi (sc[i][1], sc[i][2], uc[i][3]); i++;
-  f[i][0] = __builtin_vsx_vperm_4sf (f[i][1], f[i][2], uc[i][3]); i++;
-  d[i][0] = __builtin_vsx_vperm_2df (d[i][1], d[i][2], uc[i][3]); i++;
-
-  si[i][0] = __builtin_vsx_vperm (si[i][1], si[i][2], uc[i][3]); i++;
-  ss[i][0] = __builtin_vsx_vperm (ss[i][1], ss[i][2], uc[i][3]); i++;
-  sc[i][0] = __builtin_vsx_vperm (sc[i][1], sc[i][2], uc[i][3]); i++;
-  f[i][0] = __builtin_vsx_vperm (f[i][1], f[i][2], uc[i][3]); i++;
-  d[i][0] = __builtin_vsx_vperm (d[i][1], d[i][2], uc[i][3]); i++;
+  si[i][0] = vec_perm (si[i][1], si[i][2], uc[i][3]); i++;
+  ss[i][0] = vec_perm (ss[i][1], ss[i][2], uc[i][3]); i++;
+  sc[i][0] = vec_perm (sc[i][1], sc[i][2], uc[i][3]); i++;
+  f[i][0] = vec_perm (f[i][1], f[i][2], uc[i][3]); i++;
+  d[i][0] = vec_perm (d[i][1], d[i][2], uc[i][3]); i++;
+
+  si[i][0] = vec_perm (si[i][1], si[i][2], uc[i][3]); i++;
+  ss[i][0] = vec_perm (ss[i][1], ss[i][2], uc[i][3]); i++;
+  sc[i][0] = vec_perm (sc[i][1], sc[i][2], uc[i][3]); i++;
+  f[i][0] = vec_perm (f[i][1], f[i][2], uc[i][3]); i++;
+  d[i][0] = vec_perm (d[i][1], d[i][2], uc[i][3]); i++;
 
   return i;
 }
-- 
2.45.0



* Re: [PATCH 10/13 ver 3] rs6000, remove __builtin_vsx_xvnegdp and, __builtin_vsx_xvnegsp built-ins
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (8 preceding siblings ...)
  2024-05-29 16:06 ` [PATCH 9/13 ver 3] rs6000, remove __builtin_vsx_vperm_* built-ins Carl Love
@ 2024-05-29 16:08 ` Carl Love
  2024-05-29 16:10 ` [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:08 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

 This was patch 9 in the previous series.  It was previously approved.  Reposting for completeness.

                                 Carl
-----------------------------------------------------

rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins

The undocumented __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp are
redundant.  The overloaded vec_neg built-in provides the same
functionality.  The two built-ins are not documented nor are there any
test cases for them.

Remove the definitions so users will use the overloaded vec_neg built-in
which is documented in the PVIPR.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvnegdp,
	__builtin_vsx_xvnegsp): Remove built-in definitions.
---
 gcc/config/rs6000/rs6000-builtins.def | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f02a8c4de45..64690b9b9b5 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1736,12 +1736,6 @@
   const vf __builtin_vsx_xvnabssp (vf);
     XVNABSSP vsx_nabsv4sf2 {}
 
-  const vd __builtin_vsx_xvnegdp (vd);
-    XVNEGDP negv2df2 {}
-
-  const vf __builtin_vsx_xvnegsp (vf);
-    XVNEGSP negv4sf2 {}
-
   const vd __builtin_vsx_xvnmadddp (vd, vd, vd);
     XVNMADDDP nfmav2df4 {}
 
-- 
2.45.0



* Re: [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (9 preceding siblings ...)
  2024-05-29 16:08 ` [PATCH 10/13 ver 3] rs6000, remove __builtin_vsx_xvnegdp and, __builtin_vsx_xvnegsp built-ins Carl Love
@ 2024-05-29 16:10 ` Carl Love
  2024-06-04  5:58   ` Kewen.Lin
  2024-05-29 16:11 ` [PATCH 12/13 ver 3] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in Carl Love
  2024-05-29 16:16 ` [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins Carl Love
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:10 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

 This was patch 10 from the previous series.  The patch was updated to address feedback comments.

                            Carl 
---------------------------------------------------

rs6000, extend vec_xxpermdi built-in for __int128 args

Add new signed and unsigned overloaded instances for vec_xxpermdi:

   __int128 vec_xxpermdi (__int128, __int128, const int);
   __uint128 vec_xxpermdi (__uint128, __uint128, const int);

Update the documentation to include a reference to the new built-in
instances.

Add test cases for the new overloaded instances.
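
As a rough illustration of the doubleword selection the new instances
perform, the following portable C sketch models the selector with
big-endian doubleword numbering.  The type model_v128 and the function
model_xxpermdi are hypothetical names used only for this sketch; it is
not the GCC implementation and does not use the real vector types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model type, not a GCC type: dw[0] is the most significant
   doubleword of the 128-bit value (big-endian doubleword numbering).  */
typedef struct { uint64_t dw[2]; } model_v128;

/* Model of the doubleword permute: the high bit of dm selects the
   doubleword taken from a, the low bit selects the one taken from b.  */
static model_v128
model_xxpermdi (model_v128 a, model_v128 b, int dm)
{
  model_v128 r;

  assert (dm >= 0 && dm <= 3);
  r.dw[0] = a.dw[(dm >> 1) & 1];
  r.dw[1] = b.dw[dm & 1];
  return r;
}
```

With the signed source values used in the test file, selectors 0x1 and
0x2 reproduce the BE expected results.  The LE expected results differ
because the element numbering is reversed on LE; the sketch does not
model that.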

gcc/ChangeLog:
	* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
	overloaded built-in instances.
	* doc/extend.texi:  Add documentation for new overloaded built-in
	instances.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.
---
 gcc/config/rs6000/rs6000-overload.def         |   4 +
 gcc/doc/extend.texi                           |   2 +
 .../powerpc/vec_perm-runnable-i128.c          | 229 ++++++++++++++++++
 3 files changed, 235 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c

diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index a210c5ad10d..45000f161e4 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4932,6 +4932,10 @@
     XXPERMDI_4SF  XXPERMDI_VF
   vd __builtin_vsx_xxpermdi (vd, vd, const int);
     XXPERMDI_2DF  XXPERMDI_VD
+  vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
+    XXPERMDI_1TI  XXPERMDI_1TI
+  vuq __builtin_vsx_xxpermdi (vuq, vuq, const int);
+    XXPERMDI_1TI  XXPERMDI_1TUI
 
 [VEC_XXSLDWI, vec_xxsldwi, __builtin_vsx_xxsldwi]
   vsc __builtin_vsx_xxsldwi (vsc, vsc, const int);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 0756230b19e..edfef1bdab7 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -22555,6 +22555,8 @@ void vec_vsx_st (vector bool char, int, signed char *);
 vector double vec_xxpermdi (vector double, vector double, const int);
 vector float vec_xxpermdi (vector float, vector float, const int);
 vector long long vec_xxpermdi (vector long long, vector long long, const int);
+vector __int128 vec_xxpermdi (vector __int128, vector __int128, const int);
+vector __uint128 vec_xxpermdi (vector __uint128, vector __uint128, const int);
 vector unsigned long long vec_xxpermdi (vector unsigned long long,
                                         vector unsigned long long, const int);
 vector int vec_xxpermdi (vector int, vector int, const int);
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
new file mode 100644
index 00000000000..2d5dce09404
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
@@ -0,0 +1,229 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-save-temps" } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+void print_i128 (unsigned __int128 val)
+{
+  printf(" 0x%016llx%016llx",
+         (unsigned long long)(val >> 64),
+         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
+}
+#endif
+
+extern void abort (void);
+
+union convert_union {
+  vector signed __int128    s128;
+  vector unsigned __int128  u128;
+  char  val[16];
+} convert;
+
+int check_u128_result(vector unsigned __int128 vresult_u128,
+		      vector unsigned __int128 expected_vresult_u128)
+{
+  /* Use a for loop to check each byte manually so the test case will
+     run with ISA 2.06.
+
+     Return 1 if they match, 0 otherwise.  */
+
+  int i;
+
+  union convert_union result;
+  union convert_union expected;
+
+  result.u128 = vresult_u128;
+  expected.u128 = expected_vresult_u128;
+
+  /* Check if each byte of the result and expected match. */
+  for (i = 0; i < 16; i++)
+    {
+      if (result.val[i] != expected.val[i])
+	return 0;
+    }
+  return 1;
+}
+
+int check_s128_result(vector signed __int128 vresult_s128,
+		      vector signed __int128 expected_vresult_s128)
+{
+  /* Convert the arguments to unsigned, then check equality.  */
+  union convert_union result;
+  union convert_union expected;
+
+  result.s128 = vresult_s128;
+  expected.s128 = expected_vresult_s128;
+
+  return check_u128_result (result.u128, expected.u128);
+}
+
+
+int
+main (int argc, char *argv [])
+{
+  int i;
+  
+  vector signed __int128 src_va_s128;
+  vector signed __int128 src_vb_s128;
+  vector signed __int128 vresult_s128;
+  vector signed __int128 expected_vresult_s128;
+
+  vector unsigned __int128 src_va_u128;
+  vector unsigned __int128 src_vb_u128;
+  vector unsigned __int128 src_vc_u128;
+  vector unsigned __int128 vresult_u128;
+  vector unsigned __int128 expected_vresult_u128;
+
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = src_va_s128 << 64; 
+  src_va_s128 |= (vector signed __int128) {0x22446688AACCEE00};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vb_s128 = src_vb_s128 << 64;
+  src_vb_s128 |= (vector signed __int128) {0x3333333333333333};
+
+  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
+  src_va_u128 = src_va_u128 << 64;
+  src_va_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
+  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
+  src_vb_u128 = src_vb_u128 << 64;
+  src_vb_u128 |= (vector unsigned __int128) {0x5555555555555555};
+
+
+  /* Signed 128-bit arguments.  */
+  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x1);
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0x3333333333333333};
+#else
+  /* LE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0x22446688AACCEE00};
+#endif
+
+  if (!check_s128_result (vresult_s128, expected_vresult_s128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x1) result does not match expected output.\n");
+      printf ("  src_va_s128:     ");
+      print_i128 ((unsigned __int128) src_va_s128);
+      printf ("\n  src_vb_s128:     ");
+      print_i128 ((unsigned __int128) src_vb_s128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_s128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_s128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x2);
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0x22446688AACCEE00};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0xFEDCBA9876543210};
+#else
+  /* LE expected results  */
+  expected_vresult_s128 = (vector signed __int128) {0x3333333333333333};
+  expected_vresult_s128 = expected_vresult_s128 << 64;
+  expected_vresult_s128 |= (vector signed __int128) {0x123456789ABCDEF0};
+#endif
+
+  if (!check_s128_result (vresult_s128, expected_vresult_s128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x2) result does not match expected output.\n");
+      printf ("  src_va_s128:     ");
+      print_i128 ((unsigned __int128) src_va_s128);
+      printf ("\n  src_vb_s128:     ");
+      print_i128 ((unsigned __int128) src_vb_s128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_s128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_s128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  /* Unsigned arguments.  */
+  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x1);
+
+  #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0x5555555555555555};
+#else
+  /* LE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
+#endif
+
+  if (!check_u128_result (vresult_u128, expected_vresult_u128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x1) result does not match expected output.\n");
+      printf ("  src_va_u128:     ");
+      print_i128 ((unsigned __int128) src_va_u128);
+      printf ("\n  src_vb_u128:     ");
+      print_i128 ((unsigned __int128) src_vb_u128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_u128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_u128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+  /* Unsigned arguments.  */
+  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x2);
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  /* BE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0x1133557799BBDD00};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0xA987654FEDCB3210};
+#else
+  /* LE expected results */
+  expected_vresult_u128 = (vector unsigned __int128) {0x5555555555555555};
+  expected_vresult_u128 = expected_vresult_u128 << 64;
+  expected_vresult_u128 |= (vector unsigned __int128) {0x13579ACE02468BDF};
+#endif
+  
+  if (!check_u128_result (vresult_u128, expected_vresult_u128))
+#if DEBUG
+    {
+      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x2) result does not match expected output.\n");
+      printf ("  src_va_u128:     ");
+      print_i128 ((unsigned __int128) src_va_u128);
+      printf ("\n  src_vb_u128:     ");
+      print_i128 ((unsigned __int128) src_vb_u128);
+      printf ("\n  Result:          ");
+      print_i128 ((unsigned __int128) vresult_u128);
+      printf ("\n  Expected result: ");
+      print_i128 ((unsigned __int128) expected_vresult_u128);
+      printf ("\n");
+    }
+#else
+    abort ();
+#endif
+
+    return 0;
+}
-- 
2.45.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 12/13 ver 3] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (10 preceding siblings ...)
  2024-05-29 16:10 ` [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
@ 2024-05-29 16:11 ` Carl Love
  2024-06-04  5:59   ` Kewen.Lin
  2024-05-29 16:16 ` [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins Carl Love
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:11 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This was patch 11 from the previous series.  Patch was updated to address feedback comments.

                       Carl 
----------------------------------------------------------

rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in

The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the overloaded
__builtin_altivec_vcmpeqfp_p built-in.  The built-in is undocumented and
there are no test cases for it.  The patch removes built-in
__builtin_vsx_xvcmpeqsp_p.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcmpeqsp_p):
	Remove built-in definition.
---
 gcc/config/rs6000/rs6000-builtins.def | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 64690b9b9b5..48ebc018a8d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1619,9 +1619,6 @@
   const vf __builtin_vsx_xvcmpeqsp (vf, vf);
     XVCMPEQSP vector_eqv4sf {}
 
-  const signed int __builtin_vsx_xvcmpeqsp_p (signed int, vf, vf);
-    XVCMPEQSP_P vector_eq_v4sf_p {pred}
-
   const vd __builtin_vsx_xvcmpgedp (vd, vd);
     XVCMPGEDP vector_gev2df {}
 
-- 
2.45.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins.
  2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
                   ` (11 preceding siblings ...)
  2024-05-29 16:11 ` [PATCH 12/13 ver 3] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in Carl Love
@ 2024-05-29 16:16 ` Carl Love
  2024-06-04  5:59   ` Kewen.Lin
  12 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-05-29 16:16 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, Kewen.Lin, bergner

This was patch 13 from the previous series.  Note that patch 12 from the previous series was dropped.  This patch is the same as the previous version.  The additional work requested in the feedback comments, replacing __builtin_vec_set_v1ti, __builtin_vec_set_v2di and __builtin_vec_set_v2df with equivalent gimple code so they can also be removed, is being deferred to a future patch.  The goal of this series is simply to remove duplicated built-ins, extending overloaded built-ins as needed.  Adding the gimple code needed to remove the additional built-ins is beyond the scope of this patch series.

                             Carl 
-------------------------------------------------------

rs6000, remove vector set and vector init built-ins.

The vector init built-ins:

  __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
  __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
  __builtin_vec_init_v2di, __builtin_vec_init_v2df,
  __builtin_vec_init_v1ti

perform the same operation as initializing the vector in C code.  For
example:

  result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
  result_v4si = {1, 2, 3, 4};

These two constructs were tested and verified to generate identical
assembly instructions, both with no optimization and with -O3.

The vector set built-ins:

  __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
  __builtin_vec_set_v4si, __builtin_vec_set_v4sf

perform the same operation as setting a specific element in the vector in
C code.  For example:

  src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
  src_v4si[index] = int_val;

Without optimization the built-in actually generates more instructions
than the inline C code; with -O3 the generated code is identical.

None of the removed built-ins have test cases or documentation.

The built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
__builtin_vec_set_v2df are not removed as they are used in function
resolve_vec_insert() in file rs6000-c.cc.

The built-ins are removed as they don't provide any benefit over just
using C code.
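
A minimal sketch of the plain C replacements, written with GCC's generic
vector extension so it also compiles on non-PowerPC targets.  The typedef
v4si_t and the helper names init_v4si and set_v4si are hypothetical; on
PowerPC the corresponding type is vector signed int.

```c
#include <assert.h>

/* Hypothetical typedef using GCC's generic vector extension; on PowerPC
   the corresponding type is "vector signed int".  */
typedef int v4si_t __attribute__ ((vector_size (16)));

/* Plain C equivalent of __builtin_vec_init_v4si (a, b, c, d).  */
static v4si_t
init_v4si (int a, int b, int c, int d)
{
  v4si_t v = {a, b, c, d};
  return v;
}

/* Plain C equivalent of __builtin_vec_set_v4si (v, val, index).  */
static v4si_t
set_v4si (v4si_t v, int val, int index)
{
  assert (index >= 0 && index < 4);
  v[index] = val;
  return v;
}
```

Code that used the removed built-ins can use the brace initializer and
the subscript assignment directly; the wrappers above exist only to make
the correspondence explicit.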

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
	__builtin_vec_init_v8hi, __builtin_vec_init_v4si,
	__builtin_vec_init_v4sf, __builtin_vec_init_v2di,
	__builtin_vec_init_v2df, __builtin_vec_init_v1ti,
	__builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
	__builtin_vec_set_v4si, __builtin_vec_set_v4sf): Remove built-in
	definitions.
---
 gcc/config/rs6000/rs6000-builtins.def | 42 ++-------------------------
 1 file changed, 2 insertions(+), 40 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 48ebc018a8d..8349d45169f 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1118,37 +1118,6 @@
   const signed short __builtin_vec_ext_v8hi (vss, signed int);
     VEC_EXT_V8HI nothing {extract}
 
-  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
-            signed char, signed char, signed char, signed char, signed char, \
-            signed char, signed char, signed char, signed char, signed char, \
-            signed char, signed char, signed char);
-    VEC_INIT_V16QI nothing {init}
-
-  const vf __builtin_vec_init_v4sf (float, float, float, float);
-    VEC_INIT_V4SF nothing {init}
-
-  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, \
-                                     signed int);
-    VEC_INIT_V4SI nothing {init}
-
-  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short,\
-             signed short, signed short, signed short, signed short, \
-             signed short);
-    VEC_INIT_V8HI nothing {init}
-
-  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
-    VEC_SET_V16QI nothing {set}
-
-  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
-    VEC_SET_V4SF nothing {set}
-
-  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
-    VEC_SET_V4SI nothing {set}
-
-  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
-    VEC_SET_V8HI nothing {set}
-
-
 ; Cell builtins.
 [cell]
   pure vsc __builtin_altivec_lvlx (signed long, const void *);
@@ -1295,15 +1264,8 @@
   const signed long long __builtin_vec_ext_v2di (vsll, signed int);
     VEC_EXT_V2DI nothing {extract}
 
-  const vsq __builtin_vec_init_v1ti (signed __int128);
-    VEC_INIT_V1TI nothing {init}
-
-  const vd __builtin_vec_init_v2df (double, double);
-    VEC_INIT_V2DF nothing {init}
-
-  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
-    VEC_INIT_V2DI nothing {init}
-
+;; VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI are used in
+;; resolve_vec_insert(), rs6000-c.cc
   const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
     VEC_SET_V1TI nothing {set}
 
-- 
2.45.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/13 ver 3] rs6000, fix error in unsigned vector float to unsigned int built-in definition
  2024-05-29 15:56 ` [PATCH 3/13 ver 3] rs6000, fix error in unsigned vector float to unsigned int built-in definition Carl Love
@ 2024-06-04  5:58   ` Kewen.Lin
  0 siblings, 0 replies; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  5:58 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/29 23:56, Carl Love wrote:
> This patch was updated per the feedback comment from the previous version in series 2.
> 
>                              Carl 
> -------------------------------------------------------------------
> 
> rs6000, fix error in unsigned vector float to unsigned int built-in definitions
> 
> The built-in __builtin_vsx_vunsigned_v2df is supposed to take a vector of
> doubles and return a vector of unsigned long long ints.  Similarly
> __builtin_vsx_vunsigned_v4sf takes a vector of floats an is supposed to
> return a vector of unsinged ints.  The definitions are using the signed
> version of the instructions not the unsigned version of the instruction.
> The results should also be unsigned.  The builtins are used by the
> overloaded vec_unsigned builtin which has an unsigned result.
> 
> Similarly the built-ins __builtin_vsx_vunsignede_v2df and
> __builtin_vsx_vunsignedo_v2df are supposed to return an unsigned result.
> If the floating point argument is negative, the unsigned result is zero.
> The built-ins are used in the overloaded built-in vec_unsignede and
> vec_unsignedo respectively.
> 
> Add a test cases for a negative floating point arguments for each of the
> above built-ins.
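
For illustration only, the behaviour described above (negative inputs
convert to zero) can be modelled in portable C as a saturating,
truncating conversion.  The function name model_f32_to_u32_sat is
hypothetical, and the handling of NaN and of values too large for the
result type is an assumption of this sketch rather than something stated
in the patch.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of the float to unsigned int
   conversion.  A plain C cast of a negative float to an unsigned type is
   undefined, so the out-of-range cases are handled explicitly.  */
static uint32_t
model_f32_to_u32_sat (float x)
{
  if (x != x)                 /* NaN: assumed to convert to 0.  */
    return 0;
  if (x <= 0.0f)              /* Negative arguments convert to 0.  */
    return 0;
  if (x >= 4294967296.0f)     /* Assumed to saturate at the maximum.  */
    return UINT32_MAX;
  return (uint32_t) x;        /* In range: truncate toward zero.  */
}
```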

OK for trunk, thanks!

BR,
Kewen

> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_vunsigned_v2df,
> 	__builtin_vsx_vunsigned_v4sf, __builtin_vsx_vunsignede_v2df,
> 	__builtin_vsx_vunsignedo_v2df): Change the result type to unsigned.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/builtins-3-runnable.c: Add tests for
> 	vec_unsignede and vec_unsignedo with negative arguments.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         | 12 ++++----
>  .../gcc.target/powerpc/builtins-3-runnable.c  | 30 +++++++++++++++++--
>  2 files changed, 33 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index c6d2ea1bc39..bf9a0ae22fc 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1580,16 +1580,16 @@
>    const vsi __builtin_vsx_vsignedo_v2df (vd);
>      VEC_VSIGNEDO_V2DF vsignedo_v2df {}
>  
> -  const vsll __builtin_vsx_vunsigned_v2df (vd);
> -    VEC_VUNSIGNED_V2DF vsx_xvcvdpsxds {}
> +  const vull __builtin_vsx_vunsigned_v2df (vd);
> +    VEC_VUNSIGNED_V2DF vsx_xvcvdpuxds {}
>  
> -  const vsi __builtin_vsx_vunsigned_v4sf (vf);
> -    VEC_VUNSIGNED_V4SF vsx_xvcvspsxws {}
> +  const vui __builtin_vsx_vunsigned_v4sf (vf);
> +    VEC_VUNSIGNED_V4SF vsx_xvcvspuxws {}
>  
> -  const vsi __builtin_vsx_vunsignede_v2df (vd);
> +  const vui __builtin_vsx_vunsignede_v2df (vd);
>      VEC_VUNSIGNEDE_V2DF vunsignede_v2df {}
>  
> -  const vsi __builtin_vsx_vunsignedo_v2df (vd);
> +  const vui __builtin_vsx_vunsignedo_v2df (vd);
>      VEC_VUNSIGNEDO_V2DF vunsignedo_v2df {}
>  
>    const vf __builtin_vsx_xscvdpsp (double);
> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> index 0231a1fd086..5dcdfbee791 100644
> --- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> @@ -313,6 +313,14 @@ int main()
>  	test_unsigned_int_result (ALL, vec_uns_int_result,
>  				  vec_uns_int_expected);
>  
> +	/* Convert single precision float to  unsigned int.  Negative
> +	   arguments.  */
> +	vec_flt0 = (vector float){-14.930, -834.49, -3.3, -5.4};
> +	vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
> +	vec_uns_int_result = vec_unsigned (vec_flt0);
> +	test_unsigned_int_result (ALL, vec_uns_int_result,
> +				  vec_uns_int_expected);
> +
>  	/* Convert double precision float to long long unsigned int */
>  	vec_dble0 = (vector double){124.930, 8134.49};
>  	vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
> @@ -320,10 +328,18 @@ int main()
>  	test_ll_unsigned_int_result (vec_ll_uns_int_result,
>  				     vec_ll_uns_int_expected);
>  
> +	/* Convert double precision float to long long unsigned int. Negative
> +	   arguments.  */
> +	vec_dble0 = (vector double){-24.93, -134.9};
> +	vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
> +	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
> +	test_ll_unsigned_int_result (vec_ll_uns_int_result,
> +				     vec_ll_uns_int_expected);
> +
>  	/* Convert double precision vector float to vector unsigned int,
> -	   even words */
> -	vec_dble0 = (vector double){3124.930, 8234.49};
> -	vec_uns_int_expected = (vector unsigned int){3124, 0, 8234, 0};
> +	   even words.  Negative arguments */
> +	vec_dble0 = (vector double){-124.930, -234.49};
> +	vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
>  	vec_uns_int_result = vec_unsignede (vec_dble0);
>  	test_unsigned_int_result (EVEN, vec_uns_int_result,
>  				  vec_uns_int_expected);
> @@ -335,5 +351,13 @@ int main()
>  	vec_uns_int_result = vec_unsignedo (vec_dble0);
>  	test_unsigned_int_result (ODD, vec_uns_int_result,
>  				  vec_uns_int_expected);
> +
> +	/* Convert double precision vector float to vector unsigned int,
> +	   odd words.  Negative arguments.  */
> +	vec_dble0 = (vector double){-924.930, -1234.49};
> +	vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
> +	vec_uns_int_result = vec_unsignedo (vec_dble0);
> +	test_unsigned_int_result (ODD, vec_uns_int_result,
> +				  vec_uns_int_expected);
>  }
>  

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments
  2024-05-29 16:03 ` [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments Carl Love
@ 2024-06-04  5:58   ` Kewen.Lin
  2024-06-13 15:35     ` Carl Love
  0 siblings, 1 reply; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  5:58 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/30 00:03, Carl Love wrote:
> This was patch 6 in the previous series.  Updated the documentation file per the comments.  No functional changes to the patch.
> 
>                           Carl 
> ------------------------------------------------------------
> 
> rs6000, add overloaded vec_sel with int128 arguments
> 
> Extend the vec_sel built-in to take three signed/unsigned int128 arguments
> and return a signed/unsigned int128 result.
> 
> Extending the vec_sel built-in makes the existing buit-ins
> __builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
> patch removes these built-ins.
> 
> The patch adds documentation and test cases for the new overloaded vec_sel
> built-ins.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
> 	__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
> 	* config/rs6000/rs6000-overload.def (vec_sel): Add new overloaded
> 	definitions.
> 	* doc/extend.texi: Add documentation for new vec_sel instances.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vec-sel-runnable-i128.c: New test file.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   6 -
>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>  gcc/doc/extend.texi                           |  12 ++
>  .../powerpc/vec-sel-runnable-i128.c           | 129 ++++++++++++++++++
>  4 files changed, 145 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 13e36df008d..ea0da77f13e 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1904,12 +1904,6 @@
>    const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
>      XXSEL_16QI_UNS vector_select_v16qi_uns {}
>  
> -  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
> -    XXSEL_1TI vector_select_v1ti {}
> -
> -  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
> -    XXSEL_1TI_UNS vector_select_v1ti_uns {}
> -
>    const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
>      XXSEL_2DF vector_select_v2df {}
>  
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index 4d857bb1af3..a210c5ad10d 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -3274,6 +3274,10 @@
>      VSEL_2DF  VSEL_2DF_B
>    vd __builtin_vec_sel (vd, vd, vull);
>      VSEL_2DF  VSEL_2DF_U
> +  vsq __builtin_vec_sel (vsq, vsq, vsq);
> +    VSEL_1TI  VSEL_1TI_S
> +  vuq __builtin_vec_sel (vuq, vuq, vuq);
> +    VSEL_1TI_UNS  VSEL_1TI_U

I just noticed that for integral types, such as signed/unsigned int, we have six instances:

  vsi __builtin_vec_sel (vsi, vsi, vbi);
    VSEL_4SI  VSEL_4SI_B
  vsi __builtin_vec_sel (vsi, vsi, vui);
    VSEL_4SI  VSEL_4SI_U
  vui __builtin_vec_sel (vui, vui, vbi);
    VSEL_4SI_UNS  VSEL_4SI_UB
  vui __builtin_vec_sel (vui, vui, vui);
    VSEL_4SI_UNS  VSEL_4SI_UU
  vbi __builtin_vec_sel (vbi, vbi, vbi);
    VSEL_4SI_UNS  VSEL_4SI_BB
  vbi __builtin_vec_sel (vbi, vbi, vui);

That is, the control vector can only be of unsigned or bool type, and the return type
can also be bool.  This aligns with what the PVIPR defines, so here we should have:

vsq __builtin_vec_sel (vsq, vsq, vbq);
vsq __builtin_vec_sel (vsq, vsq, vuq);
vuq __builtin_vec_sel (vuq, vuq, vbq);
vuq __builtin_vec_sel (vuq, vuq, vuq);
vbq __builtin_vec_sel (vbq, vbq, vbq);
vbq __builtin_vec_sel (vbq, vbq, vuq);

Sorry that I didn't find this in the previous review.
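
For reference, the selection itself is bitwise, so the type of the
control vector does not change which bits are picked.  A small portable
sketch on 64-bit values (model_sel64 is a hypothetical helper, not a GCC
function):

```c
#include <stdint.h>

/* Hypothetical helper: each result bit comes from b where the mask bit
   is 1 and from a where the mask bit is 0.  */
static uint64_t
model_sel64 (uint64_t a, uint64_t b, uint64_t mask)
{
  return (a & ~mask) | (b & mask);
}
```

Applied to the low 64 bits of the source values in the test file, this
gives the same values as the expected results there.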


>  ; The following variants are deprecated.
>    vsll __builtin_vec_sel (vsll, vsll, vsll);
>      VSEL_2DI_B  VSEL_2DI_S
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index b88e61641a2..0756230b19e 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -21372,6 +21372,18 @@ Additional built-in functions are available for the 64-bit PowerPC
>  family of processors, for efficient use of 128-bit floating point
>  (@code{__float128}) values.
>  
> +Vector select
> +
> +@smallexample
> +vector signed __int128 vec_sel (vector signed __int128,
> +               vector signed __int128, vector signed __int128);
> +vector unsigned __int128 vec_sel (vector unsigned __int128,
> +               vector unsigned __int128, vector unsigned __int128);
> +@end smallexample

As above, the documentation here has to cover vector bool __int128 and note that
the control vector is of type either vector unsigned __int128 or vector bool __int128.

> +
> +The instance is an extension of the exiting overloaded built-in @code{vec_sel}
> +that is documented in the PVIPR.
> +
>  @node Basic PowerPC Built-in Functions Available on ISA 2.06
>  @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.06
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
> new file mode 100644
> index 00000000000..d82225cc847
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
> @@ -0,0 +1,129 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target vmx_hw } */
> +/* { dg-options "-save-temps" } */
> +/* { dg-final { scan-assembler-times "xxsel" 2 } } */

Nit: Can we rename this case to builtins-10.c and separate it into
one dg-do compile and the other dg-do "run" (like some builtins-*.c
which have *-x.c and *-x-runnable.c)?

It needs some more adjustments as the overloaded instances change.

BR,
Kewen

> +
> +#include <altivec.h>
> +
> +#define DEBUG 0
> +
> +#if DEBUG
> +#include <stdio.h>
> +void print_i128 (unsigned __int128 val)
> +{
> +  printf(" 0x%016llx%016llx",
> +         (unsigned long long)(val >> 64),
> +         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
> +}
> +#endif
> +
> +extern void abort (void);
> +
> +union convert_union {
> +  vector signed __int128    s128;
> +  vector unsigned __int128  u128;
> +  char  val[16];
> +} convert;
> +
> +int check_u128_result(vector unsigned __int128 vresult_u128,
> +		      vector unsigned __int128 expected_vresult_u128)
> +{
> +  /* Use a for loop to check each byte manually so the test case will run
> +     with ISA 2.06.
> +
> +     Return 1 if they match, 0 otherwise.  */
> +
> +  int i;
> +
> +  union convert_union result;
> +  union convert_union expected;
> +
> +  result.u128 = vresult_u128;
> +  expected.u128 = expected_vresult_u128;
> +
> +  /* Check if each byte of the result and expected match. */
> +  for (i = 0; i < 16; i++)
> +    {
> +      if (result.val[i] != expected.val[i])
> +	return 0;
> +    }
> +  return 1;
> +}
> +
> +int check_s128_result(vector signed __int128 vresult_s128,
> +		      vector signed __int128 expected_vresult_s128)
> +{
> +  /* Convert the arguments to unsigned, then check equality.  */
> +  union convert_union result;
> +  union convert_union expected;
> +
> +  result.s128 = vresult_s128;
> +  expected.s128 = expected_vresult_s128;
> +
> +  return check_u128_result (result.u128, expected.u128);
> +}
> +
> +
> +int
> +main (int argc, char *argv [])
> +{
> +  int i;
> +  
> +  vector signed __int128 src_va_s128;
> +  vector signed __int128 src_vb_s128;
> +  vector signed __int128 src_vc_s128;
> +  vector signed __int128 vresult_s128;
> +  vector signed __int128 expected_vresult_s128;
> +
> +  vector unsigned __int128 src_va_u128;
> +  vector unsigned __int128 src_vb_u128;
> +  vector unsigned __int128 src_vc_u128;
> +  vector unsigned __int128 vresult_u128;
> +  vector unsigned __int128 expected_vresult_u128;
> +
> +  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> +  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
> +  src_vc_s128 = (vector signed __int128) {0x3333333333333333};
> +  expected_vresult_s128 = (vector signed __int128) {0x32147658ba9cfed0};
> +
> +  /* Signed arguments.  */
> +  vresult_s128 = vec_sel (src_va_s128, src_vb_s128, src_vc_s128);
> +
> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_s128, src_vb_s128, src_vc_s128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_s128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_s128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
> +  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
> +  src_vc_u128 = (vector unsigned __int128) {0x5555555555555555};
> +  expected_vresult_u128 = (vector unsigned __int128) {0x0307CFCF47439A9A};
> +
> +  /* Unsigned arguments.  */
> +  vresult_u128 = vec_sel (src_va_u128, src_vb_u128, src_vc_u128);
> +
> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_sel (src_va_u128, src_vb_u128, src_vc_u128) result does not match expected output.\n");
> +      printf ("  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_u128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_u128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +    return 0;
> +}
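
For anyone checking the expected constants in the test above by hand: vec_sel is a bitwise select, so each result bit comes from the second operand where the mask bit is 1 and from the first operand where it is 0.  The following is only a minimal sketch under that assumption.  It uses a plain 64-bit integer in place of the vector type (the test only initializes the low 64 bits of each __int128), and the names bit_select and check_vec_sel_model are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vec_sel: for every bit, take b where the
   mask bit is set and a where it is clear.  */
uint64_t
bit_select (uint64_t a, uint64_t b, uint64_t mask)
{
  return (a & ~mask) | (b & mask);
}

/* Recompute the expected values used by the signed and unsigned cases
   of the test and abort (via assert) if the model disagrees.  */
void
check_vec_sel_model (void)
{
  uint64_t s = bit_select (0x123456789ABCDEF0ULL, 0xFEDCBA9876543210ULL,
			   0x3333333333333333ULL);
  uint64_t u = bit_select (0x13579ACE02468BDFULL, 0xA987654FEDCB3210ULL,
			   0x5555555555555555ULL);

  assert (s == 0x32147658BA9CFED0ULL);
  assert (u == 0x0307CFCF47439A9AULL);
}
```

With the 0x33... mask, the low two bits of every nibble come from src_vb and the high two bits from src_va, which is where the 0x3214... pattern comes from.
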


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/13 ver 3] rs6000, remove the vec_xxsel built-ins, they are, duplicates
  2024-05-29 16:05 ` [PATCH 8/13 ver 3] rs6000, remove the vec_xxsel built-ins, they are, duplicates Carl Love
@ 2024-06-04  5:58   ` Kewen.Lin
  0 siblings, 0 replies; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  5:58 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/30 00:05, Carl Love wrote:
> This was patch 7 in the previous series.  Patch was updated to address the feedback comments.
> 
>                             Carl 
> ------------------------------------------------------------------------
> 
> rs6000, remove the vec_xxsel built-ins, they are duplicates
> 
> The following undocumented built-ins are covered by the existing overloaded
> vec_sel built-in definitions.
> 
>   const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
> same as vsc __builtin_vec_sel (vsc, vsc, vuc);  (overloaded vec_sel)
> 
>   const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
> same as vuc __builtin_vec_sel (vuc, vuc, vuc);  (overloaded vec_sel)
> 
>   const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
> same as  vd __builtin_vec_sel (vd, vd, vull);   (overloaded vec_sel)
> 
>   const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
> same as vsll __builtin_vec_sel (vsll, vsll, vsll);  (overloaded vec_sel)
> 
>   const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
> same as vull __builtin_vec_sel (vull, vull, vsll);  (overloaded vec_sel)
> 
>   const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
> same as vf __builtin_vec_sel (vf, vf, vsi)          (overloaded vec_sel)
> 
>   const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
> same as vsi __builtin_vec_sel (vsi, vsi, vbi);      (overloaded vec_sel)
> 
>   const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
> same as vui __builtin_vec_sel (vui, vui, vui);      (overloaded vec_sel)
> 
>   const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
> same as vss __builtin_vec_sel (vss, vss, vbs);      (overloaded vec_sel)
> 
>   const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
> same as vus __builtin_vec_sel (vus, vus, vus);      (overloaded vec_sel)
> 
> This patch removes the duplicate built-in definitions so users will only
> use the documented vec_sel built-in.  The __builtin_vsx_xxsel_[4si, 8hi,
> 16qi, 4sf, 2df] tests are also removed.

OK for trunk, thanks!

BR,
Kewen

> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_16qi,
> 	__builtin_vsx_xxsel_16qi_uns, __builtin_vsx_xxsel_2df,
> 	__builtin_vsx_xxsel_2di, __builtin_vsx_xxsel_2di_uns,
> 	__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_4si,
> 	__builtin_vsx_xxsel_4si_uns, __builtin_vsx_xxsel_8hi,
> 	__builtin_vsx_xxsel_8hi_uns): Remove built-in definitions.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_xxsel_4si,
> 	__builtin_vsx_xxsel_8hi, __builtin_vsx_xxsel_16qi,
> 	__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_2df,
> 	__builtin_vsx_xxsel): Change built-in call to overloaded built-in
> 	call vec_sel.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         | 30 ----------------
>  .../gcc.target/powerpc/vsx-builtin-3.c        | 36 ++++++++++---------
>  2 files changed, 19 insertions(+), 47 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index ea0da77f13e..a78c52183bc 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1898,36 +1898,6 @@
>    const vss __builtin_vsx_xxpermdi_8hi (vss, vss, const int<2>);
>      XXPERMDI_8HI vsx_xxpermdi_v8hi {}
>  
> -  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
> -    XXSEL_16QI vector_select_v16qi {}
> -
> -  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
> -    XXSEL_16QI_UNS vector_select_v16qi_uns {}
> -
> -  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
> -    XXSEL_2DF vector_select_v2df {}
> -
> -  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
> -    XXSEL_2DI vector_select_v2di {}
> -
> -  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
> -    XXSEL_2DI_UNS vector_select_v2di_uns {}
> -
> -  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
> -    XXSEL_4SF vector_select_v4sf {}
> -
> -  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
> -    XXSEL_4SI vector_select_v4si {}
> -
> -  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
> -    XXSEL_4SI_UNS vector_select_v4si_uns {}
> -
> -  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
> -    XXSEL_8HI vector_select_v8hi {}
> -
> -  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
> -    XXSEL_8HI_UNS vector_select_v8hi_uns {}
> -
>    const vsc __builtin_vsx_xxsldwi_16qi (vsc, vsc, const int<2>);
>      XXSLDWI_16QI vsx_xxsldwi_v16qi {}
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
> index ff875c55304..e20d3f03c86 100644
> --- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
> @@ -37,6 +37,8 @@
>  /* { dg-final { scan-assembler "xvcvsxdsp" } } */
>  /* { dg-final { scan-assembler "xvcvuxdsp" } } */
>  
> +#include <altivec.h>
> +
>  extern __vector int si[][4];
>  extern __vector short ss[][4];
>  extern __vector signed char sc[][4];
> @@ -61,23 +63,23 @@ int do_sel(void)
>  {
>    int i = 0;
>  
> -  si[i][0] = __builtin_vsx_xxsel_4si (si[i][1], si[i][2], si[i][3]); i++;
> -  ss[i][0] = __builtin_vsx_xxsel_8hi (ss[i][1], ss[i][2], ss[i][3]); i++;
> -  sc[i][0] = __builtin_vsx_xxsel_16qi (sc[i][1], sc[i][2], sc[i][3]); i++;
> -  f[i][0] = __builtin_vsx_xxsel_4sf (f[i][1], f[i][2], f[i][3]); i++;
> -  d[i][0] = __builtin_vsx_xxsel_2df (d[i][1], d[i][2], d[i][3]); i++;
> -
> -  si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], bi[i][3]); i++;
> -  ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], bs[i][3]); i++;
> -  sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], bc[i][3]); i++;
> -  f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], bi[i][3]); i++;
> -  d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], bl[i][3]); i++;
> -
> -  si[i][0] = __builtin_vsx_xxsel (si[i][1], si[i][2], ui[i][3]); i++;
> -  ss[i][0] = __builtin_vsx_xxsel (ss[i][1], ss[i][2], us[i][3]); i++;
> -  sc[i][0] = __builtin_vsx_xxsel (sc[i][1], sc[i][2], uc[i][3]); i++;
> -  f[i][0] = __builtin_vsx_xxsel (f[i][1], f[i][2], ui[i][3]); i++;
> -  d[i][0] = __builtin_vsx_xxsel (d[i][1], d[i][2], ul[i][3]); i++;
> +  si[i][0] = vec_sel (si[i][1], si[i][2], ui[i][3]); i++;
> +  ss[i][0] = vec_sel (ss[i][1], ss[i][2], us[i][3]); i++;
> +  sc[i][0] = vec_sel (sc[i][1], sc[i][2], uc[i][3]); i++;
> +  f[i][0] = vec_sel (f[i][1], f[i][2], f[i][3]); i++;
> +  d[i][0] = vec_sel (d[i][1], d[i][2], d[i][3]); i++;
> +
> +  si[i][0] = vec_sel (si[i][1], si[i][2], bi[i][3]); i++;
> +  ss[i][0] = vec_sel (ss[i][1], ss[i][2], bs[i][3]); i++;
> +  sc[i][0] = vec_sel (sc[i][1], sc[i][2], bc[i][3]); i++;
> +  f[i][0] = vec_sel (f[i][1], f[i][2], bi[i][3]); i++;
> +  d[i][0] = vec_sel (d[i][1], d[i][2], bl[i][3]); i++;
> +
> +  si[i][0] = vec_sel (si[i][1], si[i][2], ui[i][3]); i++;
> +  ss[i][0] = vec_sel (ss[i][1], ss[i][2], us[i][3]); i++;
> +  sc[i][0] = vec_sel (sc[i][1], sc[i][2], uc[i][3]); i++;
> +  f[i][0] = vec_sel (f[i][1], f[i][2], ui[i][3]); i++;
> +  d[i][0] = vec_sel (d[i][1], d[i][2], ul[i][3]); i++;
>  
>    return i;
>  }




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 9/13 ver 3] rs6000, remove __builtin_vsx_vperm_* built-ins
  2024-05-29 16:06 ` [PATCH 9/13 ver 3] rs6000, remove __builtin_vsx_vperm_* built-ins Carl Love
@ 2024-06-04  5:58   ` Kewen.Lin
  0 siblings, 0 replies; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  5:58 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/30 00:06, Carl Love wrote:
> This was patch 8 in the previous series.  Updated patch per the feedback comments.
> 
>                             Carl 
> --------------------------------------------------------------------
> 
> rs6000, remove __builtin_vsx_vperm_* built-ins
> 
> The undocumented built-ins:
>   __builtin_vsx_vperm_16qi_uns,
>   __builtin_vsx_vperm_1ti,
>   __builtin_vsx_vperm_1ti_uns,
>   __builtin_vsx_vperm_2df,
>   __builtin_vsx_vperm_2di,
>   __builtin_vsx_vperm_2di_uns,
>   __builtin_vsx_vperm_4sf,
>   __builtin_vsx_vperm_4si,
>   __builtin_vsx_vperm_4si_uns
> 
> are duplicates of the __builtin_altivec_* builtins that are used by
> the overloaded vec_perm built-in that is documented in the PVIPR.

OK for trunk, thanks!

BR,
Kewen

> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_vperm_16qi_uns,
> 	__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
> 	__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
> 	__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
> 	__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns): Remove
> 	built-in definitions and comments.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_vperm_16qi_uns,
> 	__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
> 	__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
> 	__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
> 	__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns,
> 	__builtin_vsx_vperm): Change call to built-in to the  overloaded
> 	built-in vec_perm.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         | 33 -------------------
>  .../gcc.target/powerpc/vsx-builtin-3.c        | 22 ++++++-------
>  2 files changed, 11 insertions(+), 44 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index a78c52183bc..f02a8c4de45 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1529,39 +1529,6 @@
>    const vf __builtin_vsx_uns_floato_v2di (vsll);
>      UNS_FLOATO_V2DI unsfloatov2di {}
>  
> -; These are duplicates of __builtin_altivec_* counterparts, and are being
> -; kept for backwards compatibility.  The reason for their existence is
> -; unclear.  TODO: Consider deprecation/removal at some point.
> -  const vsc __builtin_vsx_vperm_16qi (vsc, vsc, vuc);
> -    VPERM_16QI_X altivec_vperm_v16qi {}
> -
> -  const vuc __builtin_vsx_vperm_16qi_uns (vuc, vuc, vuc);
> -    VPERM_16QI_UNS_X altivec_vperm_v16qi_uns {}
> -
> -  const vsq __builtin_vsx_vperm_1ti (vsq, vsq, vsc);
> -    VPERM_1TI_X altivec_vperm_v1ti {}
> -
> -  const vsq __builtin_vsx_vperm_1ti_uns (vsq, vsq, vsc);
> -    VPERM_1TI_UNS_X altivec_vperm_v1ti_uns {}
> -
> -  const vd __builtin_vsx_vperm_2df (vd, vd, vuc);
> -    VPERM_2DF_X altivec_vperm_v2df {}
> -
> -  const vsll __builtin_vsx_vperm_2di (vsll, vsll, vuc);
> -    VPERM_2DI_X altivec_vperm_v2di {}
> -
> -  const vull __builtin_vsx_vperm_2di_uns (vull, vull, vuc);
> -    VPERM_2DI_UNS_X altivec_vperm_v2di_uns {}
> -
> -  const vf __builtin_vsx_vperm_4sf (vf, vf, vuc);
> -    VPERM_4SF_X altivec_vperm_v4sf {}
> -
> -  const vsi __builtin_vsx_vperm_4si (vsi, vsi, vuc);
> -    VPERM_4SI_X altivec_vperm_v4si {}
> -
> -  const vui __builtin_vsx_vperm_4si_uns (vui, vui, vuc);
> -    VPERM_4SI_UNS_X altivec_vperm_v4si_uns {}
> -
>    const vss __builtin_vsx_vperm_8hi (vss, vss, vuc);
>      VPERM_8HI_X altivec_vperm_v8hi {}
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
> index e20d3f03c86..f06d871b6b1 100644
> --- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
> @@ -88,17 +88,17 @@ int do_perm(void)
>  {
>    int i = 0;
>  
> -  si[i][0] = __builtin_vsx_vperm_4si (si[i][1], si[i][2], uc[i][3]); i++;
> -  ss[i][0] = __builtin_vsx_vperm_8hi (ss[i][1], ss[i][2], uc[i][3]); i++;
> -  sc[i][0] = __builtin_vsx_vperm_16qi (sc[i][1], sc[i][2], uc[i][3]); i++;
> -  f[i][0] = __builtin_vsx_vperm_4sf (f[i][1], f[i][2], uc[i][3]); i++;
> -  d[i][0] = __builtin_vsx_vperm_2df (d[i][1], d[i][2], uc[i][3]); i++;
> -
> -  si[i][0] = __builtin_vsx_vperm (si[i][1], si[i][2], uc[i][3]); i++;
> -  ss[i][0] = __builtin_vsx_vperm (ss[i][1], ss[i][2], uc[i][3]); i++;
> -  sc[i][0] = __builtin_vsx_vperm (sc[i][1], sc[i][2], uc[i][3]); i++;
> -  f[i][0] = __builtin_vsx_vperm (f[i][1], f[i][2], uc[i][3]); i++;
> -  d[i][0] = __builtin_vsx_vperm (d[i][1], d[i][2], uc[i][3]); i++;
> +  si[i][0] = vec_perm (si[i][1], si[i][2], uc[i][3]); i++;
> +  ss[i][0] = vec_perm (ss[i][1], ss[i][2], uc[i][3]); i++;
> +  sc[i][0] = vec_perm (sc[i][1], sc[i][2], uc[i][3]); i++;
> +  f[i][0] = vec_perm (f[i][1], f[i][2], uc[i][3]); i++;
> +  d[i][0] = vec_perm (d[i][1], d[i][2], uc[i][3]); i++;
> +
> +  si[i][0] = vec_perm (si[i][1], si[i][2], uc[i][3]); i++;
> +  ss[i][0] = vec_perm (ss[i][1], ss[i][2], uc[i][3]); i++;
> +  sc[i][0] = vec_perm (sc[i][1], sc[i][2], uc[i][3]); i++;
> +  f[i][0] = vec_perm (f[i][1], f[i][2], uc[i][3]); i++;
> +  d[i][0] = vec_perm (d[i][1], d[i][2], uc[i][3]); i++;
>  
>    return i;
>  }


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args
  2024-05-29 16:10 ` [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
@ 2024-06-04  5:58   ` Kewen.Lin
  2024-06-13 15:35     ` Carl Love
  0 siblings, 1 reply; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  5:58 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/30 00:10, Carl Love wrote:
>  This was patch 10 from the previous series.  The patch was updated to address feedback comments.
> 
>                             Carl 
> ---------------------------------------------------
> 
> rs6000, extend vec_xxpermdi built-in for __int128 args
> 
> Add new signed and unsigned overloaded instances for vec_xxpermdi:
> 
>    __int128 vec_xxpermdi (__int128, __int128, const int);
>    __uint128 vec_xxpermdi (__uint128, __uint128, const int);
> 
> Update the documentation to include a reference to the new built-in
> instances.
> 
> Add test cases for the new overloaded instances.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
> 	overloaded built-in instances.
> 	* doc/extend.texi:  Add documentation for new overloaded built-in
> 	instances.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.
> ---
>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>  gcc/doc/extend.texi                           |   2 +
>  .../powerpc/vec_perm-runnable-i128.c          | 229 ++++++++++++++++++
>  3 files changed, 235 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
> 
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index a210c5ad10d..45000f161e4 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -4932,6 +4932,10 @@
>      XXPERMDI_4SF  XXPERMDI_VF
>    vd __builtin_vsx_xxpermdi (vd, vd, const int);
>      XXPERMDI_2DF  XXPERMDI_VD
> +  vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
> +    XXPERMDI_1TI  XXPERMDI_1TI
> +  vuq __builtin_vsx_xxpermdi (vuq, vuq, const int);
> +    XXPERMDI_1TI  XXPERMDI_1TUI

Nits:
  - Move them before "vf __builtin_vsx_xxpermdi (vf, vf, const int);" so
    they are close to the instances for other integral types.
  - Following the existing naming convention, _{SQ,UQ} are better.

    vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
       XXPERMDI_1TI  XXPERMDI_1SQ
    vuq __builtin_vsx_xxpermdi (vuq, vuq, const int);
       XXPERMDI_1TI  XXPERMDI_1UQ

>  
>  [VEC_XXSLDWI, vec_xxsldwi, __builtin_vsx_xxsldwi]
>    vsc __builtin_vsx_xxsldwi (vsc, vsc, const int);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 0756230b19e..edfef1bdab7 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -22555,6 +22555,8 @@ void vec_vsx_st (vector bool char, int, signed char *);
>  vector double vec_xxpermdi (vector double, vector double, const int);
>  vector float vec_xxpermdi (vector float, vector float, const int);
>  vector long long vec_xxpermdi (vector long long, vector long long, const int);

> +vector __int128 vec_xxpermdi (vector __int128, vector __int128, const int);
> +vector __int128 vec_xxpermdi (vector __uint128, vector __uint128, const int);

Nit: These two lines separate the long long and unsigned long long lines; can
you move them one line upward?  Also, using the explicit "signed" and
"unsigned" would be better than "__{u,}int128".

>  vector unsigned long long vec_xxpermdi (vector unsigned long long,
>                                          vector unsigned long long, const int);
>  vector int vec_xxpermdi (vector int, vector int, const int);
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
> new file mode 100644
> index 00000000000..2d5dce09404
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
> @@ -0,0 +1,229 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target vmx_hw } */
> +/* { dg-options "-save-temps" } */

Nit: The dg-options line isn't needed as the test doesn't check the assembly.

BR,
Kewen

> +
> +#include <altivec.h>
> +
> +#define DEBUG 0
> +
> +#if DEBUG
> +#include <stdio.h>
> +void print_i128 (unsigned __int128 val)
> +{
> +  printf(" 0x%016llx%016llx",
> +         (unsigned long long)(val >> 64),
> +         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
> +}
> +#endif
> +
> +extern void abort (void);
> +
> +union convert_union {
> +  vector signed __int128    s128;
> +  vector unsigned __int128  u128;
> +  char  val[16];
> +} convert;
> +
> +int check_u128_result(vector unsigned __int128 vresult_u128,
> +		      vector unsigned __int128 expected_vresult_u128)
> +{
> +  /* Use a for loop to check each byte manually so the test case will
> +     run with ISA 2.06.
> +
> +     Return 1 if they match, 0 otherwise.  */
> +
> +  int i;
> +
> +  union convert_union result;
> +  union convert_union expected;
> +
> +  result.u128 = vresult_u128;
> +  expected.u128 = expected_vresult_u128;
> +
> +  /* Check if each byte of the result and expected match. */
> +  for (i = 0; i < 16; i++)
> +    {
> +      if (result.val[i] != expected.val[i])
> +	return 0;
> +    }
> +  return 1;
> +}
> +
> +int check_s128_result(vector signed __int128 vresult_s128,
> +		      vector signed __int128 expected_vresult_s128)
> +{
> +  /* Convert the arguments to unsigned, then check equality.  */
> +  union convert_union result;
> +  union convert_union expected;
> +
> +  result.s128 = vresult_s128;
> +  expected.s128 = expected_vresult_s128;
> +
> +  return check_u128_result (result.u128, expected.u128);
> +}
> +
> +
> +int
> +main (int argc, char *argv [])
> +{
> +  int i;
> +  
> +  vector signed __int128 src_va_s128;
> +  vector signed __int128 src_vb_s128;
> +  vector signed __int128 vresult_s128;
> +  vector signed __int128 expected_vresult_s128;
> +
> +  vector unsigned __int128 src_va_u128;
> +  vector unsigned __int128 src_vb_u128;
> +  vector unsigned __int128 src_vc_u128;
> +  vector unsigned __int128 vresult_u128;
> +  vector unsigned __int128 expected_vresult_u128;
> +
> +  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> +  src_va_s128 = src_va_s128 << 64; 
> +  src_va_s128 |= (vector signed __int128) {0x22446688AACCEE00};
> +  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
> +  src_vb_s128 = src_vb_s128 << 64;
> +  src_vb_s128 |= (vector signed __int128) {0x3333333333333333};
> +
> +  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
> +  src_va_u128 = src_va_u128 << 64;
> +  src_va_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
> +  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
> +  src_vb_u128 = src_vb_u128 << 64;
> +  src_vb_u128 |= (vector unsigned __int128) {0x5555555555555555};
> +
> +
> +  /* Signed 128-bit arguments.  */
> +  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x1);
> +
> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> +  /* BE expected results  */
> +  expected_vresult_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> +  expected_vresult_s128 = expected_vresult_s128 << 64;
> +  expected_vresult_s128 |= (vector signed __int128) {0x3333333333333333};
> +#else
> +  /* LE expected results  */
> +  expected_vresult_s128 = (vector signed __int128) {0xFEDCBA9876543210};
> +  expected_vresult_s128 = expected_vresult_s128 << 64;
> +  expected_vresult_s128 |= (vector signed __int128) {0x22446688AACCEE00};
> +#endif
> +
> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x1) result does not match expected output.\n");
> +      printf ("  src_va_s128:     ");
> +      print_i128 ((unsigned __int128) src_va_s128);
> +      printf ("\n  src_vb_s128:     ");
> +      print_i128 ((unsigned __int128) src_vb_s128);
> +      printf ("\n  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_s128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_s128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x2);
> +
> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> +  /* BE expected results  */
> +  expected_vresult_s128 = (vector signed __int128) {0x22446688AACCEE00};
> +  expected_vresult_s128 = expected_vresult_s128 << 64;
> +  expected_vresult_s128 |= (vector signed __int128) {0xFEDCBA9876543210};
> +#else
> +  /* LE expected results  */
> +  expected_vresult_s128 = (vector signed __int128) {0x3333333333333333};
> +  expected_vresult_s128 = expected_vresult_s128 << 64;
> +  expected_vresult_s128 |= (vector signed __int128) {0x123456789ABCDEF0};
> +#endif
> +
> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x2) result does not match expected output.\n");
> +      printf ("  src_va_s128:     ");
> +      print_i128 ((unsigned __int128) src_va_s128);
> +      printf ("\n  src_vb_s128:     ");
> +      print_i128 ((unsigned __int128) src_vb_s128);
> +      printf ("\n  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_s128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_s128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  /* Unsigned arguments.  */
> +  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x1);
> +
> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> +  /* BE expected results */
> +  expected_vresult_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
> +  expected_vresult_u128 = expected_vresult_u128 << 64;
> +  expected_vresult_u128 |= (vector unsigned __int128) {0x5555555555555555};
> +#else
> +  /* LE expected results */
> +  expected_vresult_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
> +  expected_vresult_u128 = expected_vresult_u128 << 64;
> +  expected_vresult_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
> +#endif
> +
> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x1) result does not match expected output.\n");
> +      printf ("  src_va_u128:     ");
> +      print_i128 ((unsigned __int128) src_va_u128);
> +      printf ("\n  src_vb_u128:     ");
> +      print_i128 ((unsigned __int128) src_vb_u128);
> +      printf ("\n  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_u128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_u128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +  /* Unsigned arguments.  */
> +  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x2);
> +
> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> +  /* BE expected results */
> +  expected_vresult_u128 = (vector unsigned __int128) {0x1133557799BBDD00};
> +  expected_vresult_u128 = expected_vresult_u128 << 64;
> +  expected_vresult_u128 |= (vector unsigned __int128) {0xA987654FEDCB3210};
> +#else
> +  /* LE expected results */
> +  expected_vresult_u128 = (vector unsigned __int128) {0x5555555555555555};
> +  expected_vresult_u128 = expected_vresult_u128 << 64;
> +  expected_vresult_u128 |= (vector unsigned __int128) {0x13579ACE02468BDF};
> +#endif
> +  
> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
> +#if DEBUG
> +    {
> +      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x2) result does not match expected output.\n");
> +      printf ("  src_va_u128:     ");
> +      print_i128 ((unsigned __int128) src_va_u128);
> +      printf ("\n  src_vb_u128:     ");
> +      print_i128 ((unsigned __int128) src_vb_u128);
> +      printf ("\n  Result:          ");
> +      print_i128 ((unsigned __int128) vresult_u128);
> +      printf ("\n  Expected result: ");
> +      print_i128 ((unsigned __int128) expected_vresult_u128);
> +      printf ("\n");
> +    }
> +#else
> +    abort ();
> +#endif
> +
> +    return 0;
> +}
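
To make the BE/LE expected values in the test above easier to follow, here is a small portable model of the doubleword selection.  It is only a sketch inferred from the expected results in the test, not the GCC implementation: it assumes result element 0 is picked from the first operand by the upper mode bit, result element 1 is picked from the second operand by the lower mode bit, and element 0 is the high doubleword on big endian and the low doubleword on little endian.  The struct u128_model and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector: hi and lo are the most and
   least significant 64 bits of the __int128 value.  */
struct u128_model
{
  uint64_t hi;
  uint64_t lo;
};

/* Doubleword element IDX (0 or 1) in the target's element order:
   element 0 is the high half on big endian and the low half on little
   endian.  */
static uint64_t
get_elem (struct u128_model v, int idx, int big_endian)
{
  int want_hi = big_endian ? (idx == 0) : (idx == 1);
  return want_hi ? v.hi : v.lo;
}

/* Model of vec_xxpermdi (a, b, mode): result element 0 is element
   (mode >> 1) & 1 of a, result element 1 is element mode & 1 of b.  */
struct u128_model
xxpermdi_model (struct u128_model a, struct u128_model b, int mode,
		int big_endian)
{
  uint64_t e0 = get_elem (a, (mode >> 1) & 1, big_endian);
  uint64_t e1 = get_elem (b, mode & 1, big_endian);
  struct u128_model r;

  r.hi = big_endian ? e0 : e1;
  r.lo = big_endian ? e1 : e0;
  return r;
}

/* Check the model against the signed expected values in the test.  */
void
check_xxpermdi_model (void)
{
  struct u128_model a = { 0x123456789ABCDEF0ULL, 0x22446688AACCEE00ULL };
  struct u128_model b = { 0xFEDCBA9876543210ULL, 0x3333333333333333ULL };
  struct u128_model r;

  r = xxpermdi_model (a, b, 0x1, 1);	/* BE, mode 1.  */
  assert (r.hi == a.hi && r.lo == b.lo);
  r = xxpermdi_model (a, b, 0x1, 0);	/* LE, mode 1.  */
  assert (r.hi == b.hi && r.lo == a.lo);
  r = xxpermdi_model (a, b, 0x2, 1);	/* BE, mode 2.  */
  assert (r.hi == a.lo && r.lo == b.hi);
  r = xxpermdi_model (a, b, 0x2, 0);	/* LE, mode 2.  */
  assert (r.hi == b.lo && r.lo == a.hi);
}
```

The same model also reproduces the unsigned expected values when fed the src_va_u128 and src_vb_u128 constants.
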


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 12/13 ver 3] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in
  2024-05-29 16:11 ` [PATCH 12/13 ver 3] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in Carl Love
@ 2024-06-04  5:59   ` Kewen.Lin
  0 siblings, 0 replies; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  5:59 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi Carl,

on 2024/5/30 00:11, Carl Love wrote:
> This was patch 11 from the previous series.  Patch was updated to address feedback comments.
> 
>                        Carl 
> ----------------------------------------------------------
> 
> rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in
> 
> The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the overloaded
> __builtin_altivec_vcmpeqfp_p built-in.  The built-in is undocumented and
> there are no test cases for it.  The patch removes built-in
> __builtin_vsx_xvcmpeqsp_p.

OK for trunk, thanks!

BR,
Kewen

> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcmpeqsp_p):
> 	Remove built-in definition.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 64690b9b9b5..48ebc018a8d 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1619,9 +1619,6 @@
>    const vf __builtin_vsx_xvcmpeqsp (vf, vf);
>      XVCMPEQSP vector_eqv4sf {}
>  
> -  const signed int __builtin_vsx_xvcmpeqsp_p (signed int, vf, vf);
> -    XVCMPEQSP_P vector_eq_v4sf_p {pred}
> -
>    const vd __builtin_vsx_xvcmpgedp (vd, vd);
>      XVCMPGEDP vector_gev2df {}
>  




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins.
  2024-05-29 16:16 ` [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins Carl Love
@ 2024-06-04  5:59   ` Kewen.Lin
  2024-06-13 15:35     ` Carl Love
  0 siblings, 1 reply; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  5:59 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/30 00:16, Carl Love wrote:
> This was patch 13 from the previous series.  Note that patch 12 of the previous series was dropped.  This patch is the same as the previous version.  The additional work requested in the feedback comments, replacing __builtin_vec_set_v1ti, __builtin_vec_set_v2di and __builtin_vec_set_v2df with equivalent gimple code, is deferred to a future patch.  The goal of this series is simply to remove duplicated built-ins, extending overloaded built-ins as needed.  Adding the gimple code needed to remove the additional built-ins is beyond the scope of this patch series.
> 
>                              Carl 
> -------------------------------------------------------
> 
> rs6000, remove vector set and vector init built-ins.
> 
> The vector init built-ins:
> 
>   __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
>   __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
>   __builtin_vec_init_v2di, __builtin_vec_init_v2df,
>   __builtin_vec_set_v1ti

Typo here, s/__builtin_vec_set_v1ti/__builtin_vec_init_v1ti/

> 
> perform the same operation as initializing the vector in C code.  For
> example:
> 
>   result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
>   result_v4si = {1, 2, 3, 4};
> 
> These two constructs were tested and verified to generate identical
> assembly instructions, both with no optimization and with -O3.
> 
> The vector set built-ins:
> 
>   __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
>   __builtin_vec_set_v4si, __builtin_vec_set_v4sf

Please also list the reserved ones (...v1ti/v2di/v2df) here, as they perform
the same operation too.  Temporarily keeping them for the uses in
resolve_vec_insert() doesn't change that.

> 
> perform the same operation as setting a specific element in the vector in
> C code.  For example:
> 
>   src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
>   src_v4si[index] = int_val;
> 
> With no optimization the built-in actually generates more instructions
> than the inline C code; with -O3 the generated code is identical.
> 
> All of the above built-ins that are removed do not have test cases and
> are not documented.
> 
> Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
> __builtin_vec_set_v2df are not removed as they are used in function
> resolve_vec_insert() in file rs6000-c.cc.
> 
> The built-ins are removed as they don't provide any benefit over just
> using C code.
> 
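
As a rough illustration of the plain C replacements described above, the sketch below uses the GCC generic vector extension so it is not tied to rs6000.  The typedef v4si_t and the helper names are made up for this example, and it does not claim anything about the code generated on Power.

```c
#include <assert.h>

/* GCC generic vector extension: four 32-bit ints in a 16-byte vector.
   The typedef name v4si_t is made up for this example.  */
typedef int v4si_t __attribute__ ((vector_size (16)));

/* Initialize a vector with a brace initializer, as in the commit
   message, instead of calling __builtin_vec_init_v4si.  */
v4si_t
init_v4si (int a, int b, int c, int d)
{
  v4si_t v = { a, b, c, d };
  return v;
}

/* Set one element by subscripting, instead of calling
   __builtin_vec_set_v4si.  */
v4si_t
set_v4si (v4si_t v, int val, int index)
{
  v[index] = val;
  return v;
}

/* Build {1, 2, 3, 4}, overwrite element 2, and check every element.  */
void
check_init_and_set (void)
{
  v4si_t v = init_v4si (1, 2, 3, 4);

  v = set_v4si (v, 42, 2);
  assert (v[0] == 1 && v[1] == 2 && v[2] == 42 && v[3] == 4);
}
```
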
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
> 	__builtin_vec_init_v8hi, __builtin_vec_init_v4si,
> 	__builtin_vec_init_v4sf, __builtin_vec_init_v2di,
> 	__builtin_vec_init_v2df, __builtin_vec_set_v1ti,

Typo, s/__builtin_vec_set_v1ti/__builtin_vec_init_v1ti/

> 	__builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
> 	__builtin_vec_set_v4si, __builtin_vec_set_v4sf,
> 	__builtin_vec_set_v2di, __builtin_vec_set_v2df,
> 	__builtin_vec_set_v1ti): Remove built-in definitions.

The last three are not actually removed.

> ---
>  gcc/config/rs6000/rs6000-builtins.def | 42 ++-------------------------
>  1 file changed, 2 insertions(+), 40 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 48ebc018a8d..8349d45169f 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1118,37 +1118,6 @@
>    const signed short __builtin_vec_ext_v8hi (vss, signed int);
>      VEC_EXT_V8HI nothing {extract}
>  
> -  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
> -            signed char, signed char, signed char, signed char, signed char, \
> -            signed char, signed char, signed char, signed char, signed char, \
> -            signed char, signed char, signed char);
> -    VEC_INIT_V16QI nothing {init}
> -
> -  const vf __builtin_vec_init_v4sf (float, float, float, float);
> -    VEC_INIT_V4SF nothing {init}
> -
> -  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, \
> -                                     signed int);
> -    VEC_INIT_V4SI nothing {init}
> -
> -  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short,\
> -             signed short, signed short, signed short, signed short, \
> -             signed short);
> -    VEC_INIT_V8HI nothing {init}
> -
> -  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
> -    VEC_SET_V16QI nothing {set}
> -
> -  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
> -    VEC_SET_V4SF nothing {set}
> -
> -  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
> -    VEC_SET_V4SI nothing {set}
> -
> -  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
> -    VEC_SET_V8HI nothing {set}
> -
> -
>  ; Cell builtins.
>  [cell]
>    pure vsc __builtin_altivec_lvlx (signed long, const void *);
> @@ -1295,15 +1264,8 @@
>    const signed long long __builtin_vec_ext_v2di (vsll, signed int);
>      VEC_EXT_V2DI nothing {extract}
>  
> -  const vsq __builtin_vec_init_v1ti (signed __int128);
> -    VEC_INIT_V1TI nothing {init}
> -
> -  const vd __builtin_vec_init_v2df (double, double);
> -    VEC_INIT_V2DF nothing {init}
> -
> -  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
> -    VEC_INIT_V2DI nothing {init}
> -
> +;; VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI are used in
> +;; resolve_vec_insert(), rs6000-c.cc

It would be good to add a TODO here, something like:

;; TODO: Remove VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI once the uses
;; in resolve_vec_insert are replaced by the equivalent gimple statements.

>    const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
>      VEC_SET_V1TI nothing {set}
>  

BR,
Kewen


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/13 ver 3] rs6000, Remove __builtin_vsx_cmple* builtins
  2024-05-29 15:52 ` [PATCH 1/13 ver 3] s6000, Remove __builtin_vsx_cmple* builtins Carl Love
@ 2024-06-04  6:00   ` Kewen.Lin
  2024-06-05 22:25     ` Carl Love
  0 siblings, 1 reply; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  6:00 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi Carl,

on 2024/5/29 23:52, Carl Love wrote:
> This patch was approved in the previous series.  There are no changes to this patch.  Reposting for completeness. 

I guess you can just push the approved ones, as there is no dependency
between any two of them?  It can help to reduce the size of this series.

BR,
Kewen

> 
>                      Carl 
> -------------------------------------------------------
> 
> rs6000, Remove __builtin_vsx_cmple* builtins
> 
> The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
> __builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take
> unsigned arguments and return an unsigned result.  The current definitions
> take signed arguments and return signed results which is incorrect.
> 
> The signed and unsigned versions of __builtin_vsx_cmple* are not
> documented in extend.texi.  Also there are no test cases for the
> built-ins.
> 
> Users can use the existing vec_cmple, as defined by the PVIPR, instead of
> __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
> __builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi,
> __builtin_vsx_cmple_16qi, __builtin_vsx_cmple_2di,
> __builtin_vsx_cmple_4si and __builtin_vsx_cmple_8hi,
> __builtin_altivec_cmple_1ti, __builtin_altivec_cmple_u1ti.
> 
> Hence these built-ins are redundant and are removed by this patch.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtin.cc (RS6000_BIF_CMPLE_16QI,
> 	RS6000_BIF_CMPLE_U16QI, RS6000_BIF_CMPLE_8HI,
> 	RS6000_BIF_CMPLE_U8HI, RS6000_BIF_CMPLE_4SI, RS6000_BIF_CMPLE_U4SI,
> 	RS6000_BIF_CMPLE_2DI, RS6000_BIF_CMPLE_U2DI, RS6000_BIF_CMPLE_1TI,
> 	RS6000_BIF_CMPLE_U1TI): Remove case statements.
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_cmple_16qi,
> 	__builtin_vsx_cmple_2di, __builtin_vsx_cmple_4si,
> 	__builtin_vsx_cmple_8hi, __builtin_vsx_cmple_u16qi,
> 	__builtin_vsx_cmple_u2di, __builtin_vsx_cmple_u4si,
> 	__builtin_vsx_cmple_u8hi): Remove built-in definitions.
> ---
>  gcc/config/rs6000/rs6000-builtin.cc   | 13 ------------
>  gcc/config/rs6000/rs6000-builtins.def | 30 ---------------------------
>  2 files changed, 43 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
> index 320affd79e3..ac9f16fe51a 100644
> --- a/gcc/config/rs6000/rs6000-builtin.cc
> +++ b/gcc/config/rs6000/rs6000-builtin.cc
> @@ -2027,19 +2027,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>        fold_compare_helper (gsi, GT_EXPR, stmt);
>        return true;
>  
> -    case RS6000_BIF_CMPLE_16QI:
> -    case RS6000_BIF_CMPLE_U16QI:
> -    case RS6000_BIF_CMPLE_8HI:
> -    case RS6000_BIF_CMPLE_U8HI:
> -    case RS6000_BIF_CMPLE_4SI:
> -    case RS6000_BIF_CMPLE_U4SI:
> -    case RS6000_BIF_CMPLE_2DI:
> -    case RS6000_BIF_CMPLE_U2DI:
> -    case RS6000_BIF_CMPLE_1TI:
> -    case RS6000_BIF_CMPLE_U1TI:
> -      fold_compare_helper (gsi, LE_EXPR, stmt);
> -      return true;
> -
>      /* flavors of vec_splat_[us]{8,16,32}.  */
>      case RS6000_BIF_VSPLTISB:
>      case RS6000_BIF_VSPLTISH:
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 3bc7fed6956..7c36976a089 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1337,30 +1337,6 @@
>    const vss __builtin_vsx_cmpge_u8hi (vus, vus);
>      CMPGE_U8HI vector_nltuv8hi {}
>  
> -  const vsc __builtin_vsx_cmple_16qi (vsc, vsc);
> -    CMPLE_16QI vector_ngtv16qi {}
> -
> -  const vsll __builtin_vsx_cmple_2di (vsll, vsll);
> -    CMPLE_2DI vector_ngtv2di {}
> -
> -  const vsi __builtin_vsx_cmple_4si (vsi, vsi);
> -    CMPLE_4SI vector_ngtv4si {}
> -
> -  const vss __builtin_vsx_cmple_8hi (vss, vss);
> -    CMPLE_8HI vector_ngtv8hi {}
> -
> -  const vsc __builtin_vsx_cmple_u16qi (vsc, vsc);
> -    CMPLE_U16QI vector_ngtuv16qi {}
> -
> -  const vsll __builtin_vsx_cmple_u2di (vsll, vsll);
> -    CMPLE_U2DI vector_ngtuv2di {}
> -
> -  const vsi __builtin_vsx_cmple_u4si (vsi, vsi);
> -    CMPLE_U4SI vector_ngtuv4si {}
> -
> -  const vss __builtin_vsx_cmple_u8hi (vss, vss);
> -    CMPLE_U8HI vector_ngtuv8hi {}
> -
>    const vd __builtin_vsx_concat_2df (double, double);
>      CONCAT_2DF vsx_concat_v2df {}
>  
> @@ -3117,12 +3093,6 @@
>    const vbq __builtin_altivec_cmpge_u1ti (vuq, vuq);
>      CMPGE_U1TI vector_nltuv1ti {}
>  
> -  const vbq __builtin_altivec_cmple_1ti (vsq, vsq);
> -    CMPLE_1TI vector_ngtv1ti {}
> -
> -  const vbq __builtin_altivec_cmple_u1ti (vuq, vuq);
> -    CMPLE_U1TI vector_ngtuv1ti {}
> -
>    const unsigned long long __builtin_altivec_cntmbb (vuc, const int<1>);
>      VCNTMBB vec_cntmb_v16qi {}
>  


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/13 ver 3] rs6000, Remove redundant float/double type conversions
  2024-05-29 16:00 ` [PATCH 5/13 ver 3] rs6000, Remove redundant float/double type conversions Carl Love
@ 2024-06-04  6:20   ` Kewen.Lin
  0 siblings, 0 replies; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  6:20 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/30 00:00, Carl Love wrote:
> This is a new patch to remove the built-ins that were inadvertently missing in the previous series.
> 
>                               Carl 
> --------------------------------------------------------------
> 
> rs6000, Remove redundant float/double type conversions

Nit: s! float/double type conversions! vector float/double conversion builtins!

OK for trunk with this subject tweaked.

BR,
Kewen

> 
> The following built-ins are redundant as they are covered by another
> overloaded built-in.
> 
>   __builtin_vsx_xvcvspdp covered by vec_double{e,o}
>   __builtin_vsx_xvcvdpsp covered by vec_float{e,o}
>   __builtin_vsx_xvcvsxwdp covered by vec_double{e,o}
>   __builtin_vsx_xvcvuxddp_uns covered by vec_double
> 
> Remove the redundant built-ins. They are not documented nor do they have
> test cases.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspdp,
> 	__builtin_vsx_xvcvdpsp, __builtin_vsx_xvcvsxwdp,
> 	__builtin_vsx_xvcvuxddp_uns): Remove.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 12 ------------
>  1 file changed, 12 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index cea2649b86c..6049f3a4599 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1679,9 +1679,6 @@
>    const signed int __builtin_vsx_xvcmpgtsp_p (signed int, vf, vf);
>      XVCMPGTSP_P vector_gt_v4sf_p {pred}
>  
> -  const vf __builtin_vsx_xvcvdpsp (vd);
> -    XVCVDPSP vsx_xvcvdpsp {}
> -
>    const vsll __builtin_vsx_xvcvdpsxds (vd);
>      XVCVDPSXDS vsx_fix_truncv2dfv2di2 {}
>  
> @@ -1691,9 +1688,6 @@
>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>  
> -  const vd __builtin_vsx_xvcvspdp (vf);
> -    XVCVSPDP vsx_xvcvspdp {}
> -
>    const vsll __builtin_vsx_xvcvspsxds (vf);
>      VEC_VSIGNEDE_V4SF vsignede_v4sf {}
>  
> @@ -1715,9 +1709,6 @@
>    const vf __builtin_vsx_xvcvsxdsp (vsll);
>      XVCVSXDSP vsx_xvcvsxdsp {}
>  
> -  const vd __builtin_vsx_xvcvsxwdp (vsi);
> -    XVCVSXWDP vsx_xvcvsxwdp {}
> -
>    const vf __builtin_vsx_xvcvsxwsp (vsi);
>      XVCVSXWSP vsx_floatv4siv4sf2 {}
>  
> @@ -1727,9 +1718,6 @@
>    const vd __builtin_vsx_xvcvuxddp_scale (vsll, const int<5>);
>      XVCVUXDDP_SCALE vsx_xvcvuxddp_scale {}
>  
> -  const vd __builtin_vsx_xvcvuxddp_uns (vull);
> -    XVCVUXDDP_UNS vsx_floatunsv2div2df2 {}
> -
>    const vf __builtin_vsx_xvcvuxdsp (vull);
>      XVCVUXDSP vsx_xvcvuxdsp {}
>  


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins
  2024-05-29 15:58 ` [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins Carl Love
@ 2024-06-04  7:19   ` Kewen.Lin
  2024-06-13 15:35     ` Carl Love
  0 siblings, 1 reply; 30+ messages in thread
From: Kewen.Lin @ 2024-06-04  7:19 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi,

on 2024/5/29 23:58, Carl Love wrote:
> Updated the patch per the feedback comments from the previous version.
> 
>                                  Carl 
> -------------------------------------------------------
> 
> rs6000, extend the current vec_{un,}signed{e,o} built-ins
> 
> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
> convert a vector of floats to signed/unsigned long long ints.  Extend the
> existing vec_{un,}signed{e,o} built-ins to handle the argument
> vector of floats to return the even/odd signed/unsigned integers.
> 
> The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
> vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
> built-ins.
> 
> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
> now for internal use only. They are not documented and they do not
> have testcases.
> 
> The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
> vec_signed{e,o}, remove.
> 
> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
> vec_unsigned{e,o}, remove.
> 
> The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
> vec_unsigned, remove.
> 
> The built-in __builtin_vsx_xvcvspuxws is redundant as it is covered by
> vec_unsigned, remove.

I prefer to move these removals into sub-patch 2/13 or split them out into
a new patch, since they don't match the subject of this patch.  Moving them
to sub-patch 2/13 looks good as they are all about vec_{un,}signed{,e,o}.

> 
> Add testcases and update documentation.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds_low,
> 	__builtin_vsx_xvcvspuxds_low): New built-in definitions.
> 	(__builtin_vsx_xvcvspuxds): Fix return type.
> 	(XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
> 	VEC_VUNSIGNEDE_V4SF respectively.
> 	(vsx_xvcvspsxds, vsx_xvcvspuxds): Renamed vsignede_v4sf,
> 	vunsignede_v4sf respectively.
> 	(__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws,
> 	__builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws): Removed.
> 	* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
> 	vec_unsignede, vec_unsignedo): Add new overloaded specifications.
> 	* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
> 	vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
> 	* doc/extend.texi (vec_signedo, vec_signede): Add documentation.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/builtins-3-runnable.c: New tests for the added
> 	overloaded built-ins.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         | 25 ++----
>  gcc/config/rs6000/rs6000-overload.def         |  8 ++
>  gcc/config/rs6000/vsx.md                      | 88 +++++++++++++++++++
>  gcc/doc/extend.texi                           | 10 +++
>  .../gcc.target/powerpc/builtins-3-runnable.c  | 51 +++++++++--
>  5 files changed, 157 insertions(+), 25 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index bf9a0ae22fc..cea2649b86c 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1688,32 +1688,23 @@
>    const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
>      XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
>  
> -  const vsi __builtin_vsx_xvcvdpsxws (vd);
> -    XVCVDPSXWS vsx_xvcvdpsxws {}
> -
> -  const vsll __builtin_vsx_xvcvdpuxds (vd);
> -    XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
> -
>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>  
> -  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
> -    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
> -
> -  const vsi __builtin_vsx_xvcvdpuxws (vd);
> -    XVCVDPUXWS vsx_xvcvdpuxws {}
> -
>    const vd __builtin_vsx_xvcvspdp (vf);
>      XVCVSPDP vsx_xvcvspdp {}
>  
>    const vsll __builtin_vsx_xvcvspsxds (vf);
> -    XVCVSPSXDS vsx_xvcvspsxds {}
> +    VEC_VSIGNEDE_V4SF vsignede_v4sf {}

We should rename __builtin_vsx_xvcvspsxds to
__builtin_vsx_vsignede_v4sf.  One reason is to align with
the other existing names; more importantly, it no longer
maps 1-1 to the xvcvspsxds instruction, so keeping that
mnemonic in the name can be misleading.

> +
> +  const vsll __builtin_vsx_xvcvspsxds_low (vf);

Ditto.

> +    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
>  
> -  const vsll __builtin_vsx_xvcvspuxds (vf);
> -    XVCVSPUXDS vsx_xvcvspuxds {}
> +  const vull __builtin_vsx_xvcvspuxds (vf);

Ditto.

> +    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
>  
> -  const vsi __builtin_vsx_xvcvspuxws (vf);
> -    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
> +  const vull __builtin_vsx_xvcvspuxds_low (vf);

Ditto.

> +    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
>  
>    const vd __builtin_vsx_xvcvsxddp (vsll);
>      XVCVSXDDP vsx_floatv2div2df2 {}
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index 84bd9ae6554..4d857bb1af3 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -3307,10 +3307,14 @@
>  [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
>    vsi __builtin_vec_vsignede (vd);
>      VEC_VSIGNEDE_V2DF
> +  vsll __builtin_vec_vsignede (vf);
> +    VEC_VSIGNEDE_V4SF
>  
>  [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
>    vsi __builtin_vec_vsignedo (vd);
>      VEC_VSIGNEDO_V2DF
> +  vsll __builtin_vec_vsignedo (vf);
> +    VEC_VSIGNEDO_V4SF
>  
>  [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
>    vsi __builtin_vec_signexti (vsc);
> @@ -4433,10 +4437,14 @@
>  [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
>    vui __builtin_vec_vunsignede (vd);
>      VEC_VUNSIGNEDE_V2DF
> +  vull __builtin_vec_vunsignede (vf);
> +    VEC_VUNSIGNEDE_V4SF
>  
>  [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
>    vui __builtin_vec_vunsignedo (vd);
>      VEC_VUNSIGNEDO_V2DF
> +  vull __builtin_vec_vunsignedo (vf);
> +    VEC_VUNSIGNEDO_V4SF
>  
>  [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
>    vui __builtin_vec_extract_exp (vf);
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index f135fa079bd..a8f3d459232 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -2704,6 +2704,94 @@ (define_expand "vsx_xvcvsp<su>xds"
>    DONE;
>  })
>  
> +;; Convert low vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit signed

Maybe:

;; Convert float vector even elements to {un,}signed long long vector

> +(define_expand "vsignede_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +       /* Shift left one word to put even word in correct location */
> +       rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +       rtx rtx_val = GEN_INT (4);
> +       emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +       emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
> +    }

I think this is wrong: on BE the even elements are words 0 and 2, so no
vector shift is required (similar to doublee<mode>2), while LE does need one.
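
A scalar sketch of the element selection described here, for illustration
only.  It assumes "even" means element indices 0 and 2 in vector element
order (by analogy with vec_doublee) and models only the signed case; the
helper names are hypothetical and this is not GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model, not GCC code: convert two float elements,
   starting at index START and stepping by two, to signed long long.
   The C cast truncates toward zero.  */
static void
model_signed_select (const float in[4], long long out[2], int start)
{
  assert (in != NULL && out != NULL);
  assert (start == 0 || start == 1);
  out[0] = (long long) in[start];
  out[1] = (long long) in[start + 2];
}

/* Even elements: indices 0 and 2 (assumed, by analogy with vec_doublee).  */
void
model_signede (const float in[4], long long out[2])
{
  model_signed_select (in, out, 0);
}

/* Odd elements: indices 1 and 3.  */
void
model_signedo (const float in[4], long long out[2])
{
  model_signed_select (in, out, 1);
}
```

Under that assumption the input {14.930, 834.49, -3.3, -5.4} gives {14, -3}
for the even case and {834, -5} for the odd case.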

> +  else
> +    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
> +
> +  DONE;
> +})
> +
> +;; Convert high vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit signed

Ditto.

> +(define_expand "vsignedo_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));

As above, this is for odd elements, so BE needs vector shifting while LE doesn't.

The vunsigned* expanders below need the corresponding fixes.

> +  else
> +    {
> +      /* Shift left one word to put even word in correct location */
> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
> +    }
> +
> +  DONE;
> +})
> +
> +;; Convert low vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit unsigned integers.
> +(define_expand "vunsignede_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      /* Shift left one word to put even word in correct location */
> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
> +    }
> +  else
> +    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
> +
> +  DONE;
> +})
> +
> +;; Convert high vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit unsigned integers.
> +(define_expand "vunsignedo_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
> +  else
> +    {
> +      /* Shift left one word to put even word in correct location */
> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +					  rtx_val));
> +      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
> +    }
> +
> +  DONE;
> +})
> +
>  ;; Generate float2 double
>  ;; convert two double to float
>  (define_expand "float2_v2df"
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 267fccd1512..b88e61641a2 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -22577,6 +22577,16 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
>  @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>  
> +@smallexample
> +vector signed signed long long vec_signedo (vector float);
> +vector signed signed long long vec_signede (vector float);
> +vector unsigned signed long long vec_signedo (vector float);
> +vector unsigned signed long long vec_signede (vector float);
> +@end smallexample

Nit: s/signed long/long/

BR,
Kewen

> +
> +The overloaded built-ins @code{vec_signedo} and @code{vec_signede} are
> +additional extensions to the built-ins as documented in the PVIPR.
> +
>  @node PowerPC AltiVec Built-in Functions Available on ISA 2.07
>  @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 2.07
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> index 5dcdfbee791..557befc9a4a 100644
> --- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> @@ -3,7 +3,7 @@
>  /* { dg-options "-maltivec -mvsx" } */
>  
>  #include <altivec.h> // vector
> -
> +#define DEBUG 1
>  #ifdef DEBUG
>  #include <stdio.h>
>  #endif
> @@ -81,14 +81,15 @@ void test_unsigned_int_result(int check, vector unsigned int vec_result,
>  }
>  
>  void test_ll_int_result(vector long long int vec_result,
> -			vector long long int vec_expected)
> +			vector long long int vec_expected,
> +			char *string)
>  {
>  	int i;
>  
>  	for (i = 0; i < 2; i++)
>  		if (vec_result[i] != vec_expected[i]) {
>  #ifdef DEBUG
> -			printf("Test_ll_int_result: ");
> +			printf("Test_ll_int_result %s: ", string);
>  			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>  			       i, vec_result[i], i, vec_expected[i]);
>  #else
> @@ -98,14 +99,15 @@ void test_ll_int_result(vector long long int vec_result,
>  }
>  
>  void test_ll_unsigned_int_result(vector long long unsigned int vec_result,
> -				 vector long long unsigned int vec_expected)
> +				 vector long long unsigned int vec_expected,
> +				 char *string)
>  {
>  	int i;
>  
>  	for (i = 0; i < 2; i++)
>  		if (vec_result[i] != vec_expected[i]) {
>  #ifdef DEBUG
> -			printf("Test_ll_unsigned_int_result: ");
> +			printf("Test_ll_unsigned_int_result %s: ", string);
>  			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>  			       i, vec_result[i], i, vec_expected[i]);
>  #else
> @@ -292,7 +294,8 @@ int main()
>  	vec_dble0 = (vector double){-124.930, 81234.49};
>  	vec_ll_int_expected = (vector long long signed int){-124, 81234};
>  	vec_ll_int_result = vec_signed (vec_dble0);
> -	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected);
> +	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
> +			    "vec_signed");
>  
>  	/* Convert double precision vector float to vector int, even words */
>  	vec_dble0 = (vector double){-124.930, 81234.49};
> @@ -321,12 +324,44 @@ int main()
>  	test_unsigned_int_result (ALL, vec_uns_int_result,
>  				  vec_uns_int_expected);
>  
> +	/* Convert single precision vector float, even args, to vector
> +	   signed long long int.  */
> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +	vec_ll_int_expected = (vector signed long long int){834, -5};
> +	vec_ll_int_result = vec_signede (vec_flt0);
> +	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
> +			    "vec_signede");
> +
> +	/* Convert single precision vector float, odd args, to vector
> +	   signed long long int.  */
> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +	vec_ll_int_expected = (vector signed long long int){14, -3};
> +	vec_ll_int_result = vec_signedo (vec_flt0);
> +	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
> +			    "vec_signedo");
> +
> +	/* Convert single precision vector float, even args, to vector
> +	   unsigned long long int.  */
> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +	vec_ll_uns_int_expected = (vector unsigned long long int){834, 0};
> +	vec_ll_uns_int_result = vec_unsignede (vec_flt0);
> +	test_ll_unsigned_int_result (vec_ll_uns_int_result,
> +				     vec_ll_uns_int_expected, "vec_unsignede");
> +
> +	/* Convert single precision vector float, odd args, to vector
> +	   unsigned long long int.  */
> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +	vec_ll_uns_int_expected = (vector unsigned long long int){14, 0};
> +	vec_ll_uns_int_result = vec_unsignedo (vec_flt0);
> +	test_ll_unsigned_int_result (vec_ll_uns_int_result,
> +				     vec_ll_uns_int_expected, "vec_unsignedo");
> +
>  	/* Convert double precision float to long long unsigned int */
>  	vec_dble0 = (vector double){124.930, 8134.49};
>  	vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
>  	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>  	test_ll_unsigned_int_result (vec_ll_uns_int_result,
> -				     vec_ll_uns_int_expected);
> +				     vec_ll_uns_int_expected, "vec_unsigned");
>  
>  	/* Convert double precision float to long long unsigned int. Negative
>  	   arguments.  */
> @@ -334,7 +369,7 @@ int main()
>  	vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
>  	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>  	test_ll_unsigned_int_result (vec_ll_uns_int_result,
> -				     vec_ll_uns_int_expected);
> +				     vec_ll_uns_int_expected, "vec_unsigned");
>  
>  	/* Convert double precision vector float to vector unsigned int,
>  	   even words.  Negative arguments */


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/13 ver 3] rs6000, Remove __builtin_vsx_cmple* builtins
  2024-06-04  6:00   ` [PATCH 1/13 ver 3] rs6000, " Kewen.Lin
@ 2024-06-05 22:25     ` Carl Love
  2024-06-06  2:40       ` Kewen.Lin
  0 siblings, 1 reply; 30+ messages in thread
From: Carl Love @ 2024-06-05 22:25 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: gcc-patches, Segher Boessenkool, bergner, Carl Love

Kewen:

On 6/3/24 23:00, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2024/5/29 23:52, Carl Love wrote:
>> This patch was approved in the previous series.  There are no changes to this patch.  Reposting for completeness. 
> I guess you can just push the approved ones, as there is no dependency
> between any two of them?  It can help to reduce the size of this series.

The patches do touch some similar files so they are not completely independent from a patch standpoint.  Functionally they are all independent.

I tried applying only the approved patches to the current mainline tree.  The approved patches were: 1, 3, 5 (with tweak), 6, 8, 9, 10, 12.  Patch 5 requires a little rebasing due to a little fuzz in the lines.  Not a big deal.  Patch 8 also doesn't apply cleanly with git.  The patch command got a little confused when I tried to use it, so I had to manually "recreate" the patch.  The changes are straightforward so that is fairly easy.  The rest of the patches applied cleanly with git.  I am guessing there will be some rebasing needed for the non-approved patches to apply them after the approved patches.

The main reason that I reposted everything was that the patch numbers changed and I wanted it to be fairly clear what was going on.  

I toyed with the idea of committing the 8 approved patches and then working on the remaining 5, but I think that is hard: either I would have to manually adjust the patch numbers to keep them lined up with version 3, or version 4 would renumber the patches 1 to 5 (i.e. a remapping of the version 3 patch numbers).  Either way I think it would be hard/confusing.

Given that separating out the approved and non-approved patches causes some re-basing issues, it is probably best to just update the 5 patches, posting them as version 4 and not re-post the whole series. I will just note in the header patch 0/13 the patches that have already been approved.  I hope that is ok?

                         Carl 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/13 ver 3] rs6000, Remove __builtin_vsx_cmple* builtins
  2024-06-05 22:25     ` Carl Love
@ 2024-06-06  2:40       ` Kewen.Lin
  0 siblings, 0 replies; 30+ messages in thread
From: Kewen.Lin @ 2024-06-06  2:40 UTC (permalink / raw)
  To: Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Hi Carl,

on 2024/6/6 06:25, Carl Love wrote:
> Kewen:
> 
> On 6/3/24 23:00, Kewen.Lin wrote:
>> Hi Carl,
>>
>> on 2024/5/29 23:52, Carl Love wrote:
>>> This patch was approved in the previous series.  There are no changes to this patch.  Reposting for completeness. 
>> I guess you can just push the approved ones, as there is no dependency
>> between any two of them?  It can help to reduce the size of this series.
> 
> The patches do touch some similar files so they are not completely independent from a patch standpoint.  Functionally they are all independent.
> 
> I tried applying only the approved patches to the current mainline tree.  The approved patches were: 1, 3, 5 (with tweak), 6, 8, 9, 10, 12.  Patch 5 requires a little rebasing due to a little fuzz in the lines.  Not a big deal.  Patch 8 also doesn't apply cleanly with git.  The patch command got a little confused when I tried to use it, so I had to manually "recreate" the patch.  The changes are straightforward so that is fairly easy.  The rest of the patches applied cleanly with git.  I am guessing there will be some rebasing needed for the non-approved patches to apply them after the approved patches.

IMHO, you can first reorder the patches in your WIP branch (e.g. so the approved ones go first) with git rebase -i,
and then rebase onto the latest trunk.  There may be some conflicts, but I'd expect not many.

> 
> The main reason that I reposted everything was that the patch numbers changed and I wanted it to be fairly clear what was going on.

OK, if you push them, you can also specify the commit hashes for the pushed ones in the cover letter.

> 
> I toyed with the idea of committing the 8 approved patches and then working on the remaining 5, but I think that is hard: either I would have to manually adjust the patch numbers to keep them lined up with version 3, or version 4 would renumber the patches 1 to 5 (i.e. a remapping of the version 3 patch numbers).  Either way I think it would be hard/confusing.
> 
> Given that separating out the approved and non-approved patches causes some re-basing issues, it is probably best to just update the 5 patches, posting them as version 4 and not re-post the whole series. I will just note in the header patch 0/13 the patches that have already been approved.  I hope that is ok?

Sure, I'm totally fine if you prefer this way. :)

BR,
Kewen



* Re: [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins
  2024-06-04  7:19   ` Kewen.Lin
@ 2024-06-13 15:35     ` Carl Love
  0 siblings, 0 replies; 30+ messages in thread
From: Carl Love @ 2024-06-13 15:35 UTC (permalink / raw)
  To: Kewen.Lin, Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Kewen:

On 6/4/24 00:19, Kewen.Lin wrote:
> Hi,
> 
> on 2024/5/29 23:58, Carl Love wrote:
>> Updated the patch per the feedback comments from the previous version.
>>
>>                                  Carl 
>> -------------------------------------------------------
>>
>> rs6000, extend the current vec_{un,}signed{e,o} built-ins
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
>> convert a vector of floats to signed/unsigned long long ints.  Extend the
>> existing vec_{un,}signed{e,o} built-ins to handle the argument
>> vector of floats to return the even/odd signed/unsigned integers.
>>
>> The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
>> vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
>> built-ins.
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
>> now for internal use only. They are not documented and they do not
>> have testcases.
>>
>> The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
>> vec_signed{e,o}, remove.
>>
>> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
>> vec_unsigned{e,o}, remove.
>>
>> The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
>> vec_unsigned, remove.
>>
>> The built-in __builtin_vsx_xvcvspuxws is redundant as it is covered by
>> vec_unsigned, remove.
> 
> I prefer to move these removals into sub-patch 2/13 or split them out into
> a new patch, since they don't match the subject of this patch.  Moving it
> to sub-patch 2/13 looks good as they are all about vec_{un,}signed{,e,o}.

Yes, we need to have all of the vec_unsigned in the same patch.  Moved 
__builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws to patch 2.
> 
>>
>> Add testcases and update documentation.
>>
>> gcc/ChangeLog:
>> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds_low,
>> 	__builtin_vsx_xvcvspuxds_low): New built-in definitions.
>> 	(__builtin_vsx_xvcvspuxds): Fix return type.
>> 	(XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
>> 	VEC_VUNSIGNEDE_V4SF respectively.
>> 	(vsx_xvcvspsxds, vsx_xvcvspuxds): Renamed vsignede_v4sf,
>> 	vunsignede_v4sf respectively.
>> 	(__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws,
>> 	__builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws): Removed.
>> 	* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
>> 	vec_unsignede,vec_unsignedo):  Add new overloaded specifications.
>> 	* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
>> 	vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
>> 	* doc/extend.texi (vec_signedo, vec_signede): Add documentation.
>>
>> gcc/testsuite/ChangeLog:
>> 	* gcc.target/powerpc/builtins-3-runnable.c: New tests for the added
>> 	overloaded built-ins.
>> ---
>>  gcc/config/rs6000/rs6000-builtins.def         | 25 ++----
>>  gcc/config/rs6000/rs6000-overload.def         |  8 ++
>>  gcc/config/rs6000/vsx.md                      | 88 +++++++++++++++++++
>>  gcc/doc/extend.texi                           | 10 +++
>>  .../gcc.target/powerpc/builtins-3-runnable.c  | 51 +++++++++--
>>  5 files changed, 157 insertions(+), 25 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>> index bf9a0ae22fc..cea2649b86c 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1688,32 +1688,23 @@
>>    const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
>>      XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
>>  
>> -  const vsi __builtin_vsx_xvcvdpsxws (vd);
>> -    XVCVDPSXWS vsx_xvcvdpsxws {}
>> -
>> -  const vsll __builtin_vsx_xvcvdpuxds (vd);
>> -    XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
>> -
>>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>>  
>> -  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
>> -    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
>> -
>> -  const vsi __builtin_vsx_xvcvdpuxws (vd);
>> -    XVCVDPUXWS vsx_xvcvdpuxws {}
>> -
>>    const vd __builtin_vsx_xvcvspdp (vf);
>>      XVCVSPDP vsx_xvcvspdp {}
>>  
>>    const vsll __builtin_vsx_xvcvspsxds (vf);
>> -    XVCVSPSXDS vsx_xvcvspsxds {}
>> +    VEC_VSIGNEDE_V4SF vsignede_v4sf {}
> 
> We should rename __builtin_vsx_xvcvspsxds to
> __builtin_vsx_vsignede_v4sf.  One reason is to align with
> the existing names; more importantly, the built-in does not
> map 1-1 to the xvcvspsxds instruction, so keeping that
> mnemonic in the name can be misleading.

Yes, that would be more consistent. Changed.

> 
>> +
>> +  const vsll __builtin_vsx_xvcvspsxds_low (vf);
> 
> Ditto.
Changed.

> 
>> +    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
>>  
>> -  const vsll __builtin_vsx_xvcvspuxds (vf);
>> -    XVCVSPUXDS vsx_xvcvspuxds {}
>> +  const vull __builtin_vsx_xvcvspuxds (vf);
> 
> Ditto.
Changed.

> 
>> +    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
>>  
>> -  const vsi __builtin_vsx_xvcvspuxws (vf);
>> -    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
>> +  const vull __builtin_vsx_xvcvspuxds_low (vf);
> 
> Ditto.
Changed. 

> 
>> +    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
>>  
>>    const vd __builtin_vsx_xvcvsxddp (vsll);
>>      XVCVSXDDP vsx_floatv2div2df2 {}
>> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
>> index 84bd9ae6554..4d857bb1af3 100644
>> --- a/gcc/config/rs6000/rs6000-overload.def
>> +++ b/gcc/config/rs6000/rs6000-overload.def
>> @@ -3307,10 +3307,14 @@
>>  [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
>>    vsi __builtin_vec_vsignede (vd);
>>      VEC_VSIGNEDE_V2DF
>> +  vsll __builtin_vec_vsignede (vf);
>> +    VEC_VSIGNEDE_V4SF
>>  
>>  [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
>>    vsi __builtin_vec_vsignedo (vd);
>>      VEC_VSIGNEDO_V2DF
>> +  vsll __builtin_vec_vsignedo (vf);
>> +    VEC_VSIGNEDO_V4SF
>>  
>>  [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
>>    vsi __builtin_vec_signexti (vsc);
>> @@ -4433,10 +4437,14 @@
>>  [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
>>    vui __builtin_vec_vunsignede (vd);
>>      VEC_VUNSIGNEDE_V2DF
>> +  vull __builtin_vec_vunsignede (vf);
>> +    VEC_VUNSIGNEDE_V4SF
>>  
>>  [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
>>    vui __builtin_vec_vunsignedo (vd);
>>      VEC_VUNSIGNEDO_V2DF
>> +  vull __builtin_vec_vunsignedo (vf);
>> +    VEC_VUNSIGNEDO_V4SF
>>  
>>  [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
>>    vui __builtin_vec_extract_exp (vf);
>> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
>> index f135fa079bd..a8f3d459232 100644
>> --- a/gcc/config/rs6000/vsx.md
>> +++ b/gcc/config/rs6000/vsx.md
>> @@ -2704,6 +2704,94 @@ (define_expand "vsx_xvcvsp<su>xds"
>>    DONE;
>>  })
>>  
>> +;; Convert low vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit signed
> 
> Maybe:
> 
> ;; Convert float vector even elements to {un,}signed long long vector

Changed the four comments to the suggested pattern.

> 
>> +(define_expand "vsignede_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    {
>> +       /* Shift left one word to put even word in correct location */
>> +       rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +       rtx rtx_val = GEN_INT (4);
>> +       emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +       emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
>> +    }
> 
> I think this is wrong: the even elements on BE are words 0 and 2, so BE doesn't
> require vector shifting (similar to doublee<mode>2), while LE does.

OK, I went through this again and used gdb to look at how things get laid out in the registers.  I agree it looks like I have the shifting backwards for LE/BE on the even/odd conversions.  Fixing this also requires fixing the expected results in the corresponding test case, as they are backwards too.
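
For reference, here is a small scalar sketch of the even/odd semantics discussed above.  It is only a model in plain C, not the rs6000 expander: the function names are made up for illustration, it assumes "even" means elements 0 and 2 in vector element order and "odd" means elements 1 and 3, and it only covers in-range inputs.

```c
#include <stdint.h>

/* Hypothetical scalar model; none of these names are GCC or altivec.h APIs.  */

/* Signed conversion: truncate toward zero (in-range inputs only).  */
static int64_t model_cvt_signed (float x)
{
  return (int64_t) x;
}

/* Unsigned conversion: negative inputs and NaN give 0.  A plain C cast of a
   negative float to an unsigned type is undefined, so clamp explicitly.  */
static uint64_t model_cvt_unsigned (float x)
{
  return !(x > 0.0f) ? 0 : (uint64_t) x;
}

/* Even elements (0 and 2) of a 4 x float vector to 2 x signed long long.  */
void model_signede (const float in[4], int64_t out[2])
{
  out[0] = model_cvt_signed (in[0]);
  out[1] = model_cvt_signed (in[2]);
}

/* Odd elements (1 and 3) of a 4 x float vector to 2 x signed long long.  */
void model_signedo (const float in[4], int64_t out[2])
{
  out[0] = model_cvt_signed (in[1]);
  out[1] = model_cvt_signed (in[3]);
}

/* Even elements to 2 x unsigned long long.  */
void model_unsignede (const float in[4], uint64_t out[2])
{
  out[0] = model_cvt_unsigned (in[0]);
  out[1] = model_cvt_unsigned (in[2]);
}

/* Odd elements to 2 x unsigned long long.  */
void model_unsignedo (const float in[4], uint64_t out[2])
{
  out[0] = model_cvt_unsigned (in[1]);
  out[1] = model_cvt_unsigned (in[3]);
}
```

With the test input {14.930, 834.49, -3.3, -5.4}, this model gives {14, -3} for the even case and {834, -5} for the odd case, i.e. the reverse of the expected values in the version 3 test.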

> 
>> +  else
>> +    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert high vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit signed
> 
> Ditto.

Changed

> 
>> +(define_expand "vsignedo_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));
> 
> As above, this is for odd elements, so BE needs vector shifting while LE doesn't.

Changed this file and the corresponding expected test case results.
> 
> The vunsigned* below need the corresponding fixes.
> 
>> +  else
>> +    {
>> +      /* Shift left one word to put even word in correct location */
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
>> +    }
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert low vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit unsigned integers.

Changed comment as suggested above.

>> +(define_expand "vunsignede_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    {
>> +      /* Shift left one word to put even word in correct location */
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
>> +    }
>> +  else
>> +    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert high vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit unsigned integers.

Changed comment as suggested above.

>> +(define_expand "vunsignedo_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
>> +  else
>> +    {
>> +      /* Shift left one word to put even word in correct location */
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +					  rtx_val));
>> +      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
>> +    }
>> +
>> +  DONE;
>> +})
>> +
>>  ;; Generate float2 double
>>  ;; convert two double to float
>>  (define_expand "float2_v2df"
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 267fccd1512..b88e61641a2 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -22577,6 +22577,16 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
>>  @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>>  
>> +@smallexample
>> +vector signed signed long long vec_signedo (vector float);
>> +vector signed signed long long vec_signede (vector float);
>> +vector unsigned signed long long vec_signedo (vector float);
>> +vector unsigned signed long long vec_signede (vector float);
>> +@end smallexample
> 
> Nit: s/signed long/long/

Yea, a little verbose there... :-)  Fixed.

> 
> BR,
> Kewen
> 
>> +
>> +The overloaded built-ins @code{vec_signedo} and @code{vec_signede} are
>> +additional extensions to the built-ins as documented in the PVIPR.
>> +
>>  @node PowerPC AltiVec Built-in Functions Available on ISA 2.07
>>  @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 2.07
>>  
>> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> index 5dcdfbee791..557befc9a4a 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> @@ -3,7 +3,7 @@
>>  /* { dg-options "-maltivec -mvsx" } */
>>  
>>  #include <altivec.h> // vector
>> -
>> +#define DEBUG 1
>>  #ifdef DEBUG
>>  #include <stdio.h>
>>  #endif
>> @@ -81,14 +81,15 @@ void test_unsigned_int_result(int check, vector unsigned int vec_result,
>>  }
>>  
>>  void test_ll_int_result(vector long long int vec_result,
>> -			vector long long int vec_expected)
>> +			vector long long int vec_expected,
>> +			char *string)
>>  {
>>  	int i;
>>  
>>  	for (i = 0; i < 2; i++)
>>  		if (vec_result[i] != vec_expected[i]) {
>>  #ifdef DEBUG
>> -			printf("Test_ll_int_result: ");
>> +			printf("Test_ll_int_result %s: ", string);
>>  			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>>  			       i, vec_result[i], i, vec_expected[i]);
>>  #else
>> @@ -98,14 +99,15 @@ void test_ll_int_result(vector long long int vec_result,
>>  }
>>  
>>  void test_ll_unsigned_int_result(vector long long unsigned int vec_result,
>> -				 vector long long unsigned int vec_expected)
>> +				 vector long long unsigned int vec_expected,
>> +				 char *string)
>>  {
>>  	int i;
>>  
>>  	for (i = 0; i < 2; i++)
>>  		if (vec_result[i] != vec_expected[i]) {
>>  #ifdef DEBUG
>> -			printf("Test_ll_unsigned_int_result: ");
>> +			printf("Test_ll_unsigned_int_result %s: ", string);
>>  			printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>>  			       i, vec_result[i], i, vec_expected[i]);
>>  #else
>> @@ -292,7 +294,8 @@ int main()
>>  	vec_dble0 = (vector double){-124.930, 81234.49};
>>  	vec_ll_int_expected = (vector long long signed int){-124, 81234};
>>  	vec_ll_int_result = vec_signed (vec_dble0);
>> -	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected);
>> +	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> +			    "vec_signed");
>>  
>>  	/* Convert double precision vector float to vector int, even words */
>>  	vec_dble0 = (vector double){-124.930, 81234.49};
>> @@ -321,12 +324,44 @@ int main()
>>  	test_unsigned_int_result (ALL, vec_uns_int_result,
>>  				  vec_uns_int_expected);
>>  
>> +	/* Convert single precision vector float, even args, to vector
>> +	   signed long long int.  */
>> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +	vec_ll_int_expected = (vector signed long long int){834, -5};
>> +	vec_ll_int_result = vec_signede (vec_flt0);
>> +	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> +			    "vec_signede");
>> +
>> +	/* Convert single precision vector float, odd args, to vector
>> +	   signed long long int.  */
>> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +	vec_ll_int_expected = (vector signed long long int){14, -3};
>> +	vec_ll_int_result = vec_signedo (vec_flt0);
>> +	test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> +			    "vec_signedo");
>> +
>> +	/* Convert single precision vector float, even args, to vector
>> +	   unsigned long long int.  */
>> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +	vec_ll_uns_int_expected = (vector unsigned long long int){834, 0};
>> +	vec_ll_uns_int_result = vec_unsignede (vec_flt0);
>> +	test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> +				     vec_ll_uns_int_expected, "vec_unsignede");
>> +
>> +	/* Convert single precision vector float, odd args, to vector
>> +	   unsigned long long int.  */
>> +	vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +	vec_ll_uns_int_expected = (vector unsigned long long int){14, 0};
>> +	vec_ll_uns_int_result = vec_unsignedo (vec_flt0);
>> +	test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> +				     vec_ll_uns_int_expected, "vec_unsignedo");
>> +
>>  	/* Convert double precision float to long long unsigned int */
>>  	vec_dble0 = (vector double){124.930, 8134.49};
>>  	vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
>>  	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>>  	test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> -				     vec_ll_uns_int_expected);
>> +				     vec_ll_uns_int_expected, "vec_unsigned");
>>  
>>  	/* Convert double precision float to long long unsigned int. Negative
>>  	   arguments.  */
>> @@ -334,7 +369,7 @@ int main()
>>  	vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
>>  	vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>>  	test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> -				     vec_ll_uns_int_expected);
>> +				     vec_ll_uns_int_expected, "vec_unsigned");
>>  
>>  	/* Convert double precision vector float to vector unsigned int,
>>  	   even words.  Negative arguments */
> 


* Re: [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments
  2024-06-04  5:58   ` Kewen.Lin
@ 2024-06-13 15:35     ` Carl Love
  0 siblings, 0 replies; 30+ messages in thread
From: Carl Love @ 2024-06-13 15:35 UTC (permalink / raw)
  To: Kewen.Lin, Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Kewen:

On 6/3/24 22:58, Kewen.Lin wrote:
> Hi,
> 
> on 2024/5/30 00:03, Carl Love wrote:
>> This was patch 6 in the previous series.  Updated the documentation file per the comments.  No functional changes to the patch.
>>
>>                           Carl 
>> ------------------------------------------------------------
>>
>> rs6000, add overloaded vec_sel with int128 arguments
>>
>> Extend the vec_sel built-in to take three signed/unsigned int128 arguments
>> and return a signed/unsigned int128 result.
>>
>> Extending the vec_sel built-in makes the existing built-ins
>> __builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
>> patch removes these built-ins.
>>
>> The patch adds documentation and test cases for the new overloaded vec_sel
>> built-ins.
>>
>> gcc/ChangeLog:
>> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
>> 	__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
>> 	* config/rs6000/rs6000-overload.def (vec_sel): Add new overloaded
>> 	definitions.
>> 	* doc/extend.texi: Add documentation for new vec_sel instances.
>>
>> gcc/testsuite/ChangeLog:
>> 	* gcc.target/powerpc/vec-sel-runnable-i128.c: New test file.
>> ---
>>  gcc/config/rs6000/rs6000-builtins.def         |   6 -
>>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>>  gcc/doc/extend.texi                           |  12 ++
>>  .../powerpc/vec-sel-runnable-i128.c           | 129 ++++++++++++++++++
>>  4 files changed, 145 insertions(+), 6 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>> index 13e36df008d..ea0da77f13e 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1904,12 +1904,6 @@
>>    const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
>>      XXSEL_16QI_UNS vector_select_v16qi_uns {}
>>  
>> -  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
>> -    XXSEL_1TI vector_select_v1ti {}
>> -
>> -  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
>> -    XXSEL_1TI_UNS vector_select_v1ti_uns {}
>> -
>>    const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
>>      XXSEL_2DF vector_select_v2df {}
>>  
>> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
>> index 4d857bb1af3..a210c5ad10d 100644
>> --- a/gcc/config/rs6000/rs6000-overload.def
>> +++ b/gcc/config/rs6000/rs6000-overload.def
>> @@ -3274,6 +3274,10 @@
>>      VSEL_2DF  VSEL_2DF_B
>>    vd __builtin_vec_sel (vd, vd, vull);
>>      VSEL_2DF  VSEL_2DF_U
>> +  vsq __builtin_vec_sel (vsq, vsq, vsq);
>> +    VSEL_1TI  VSEL_1TI_S
>> +  vuq __builtin_vec_sel (vuq, vuq, vuq);
>> +    VSEL_1TI_UNS  VSEL_1TI_U
> 
> I just noticed that for integral types, such as: signed/unsigned int, we have six instances:
> 
>   vsi __builtin_vec_sel (vsi, vsi, vbi);
>     VSEL_4SI  VSEL_4SI_B
>   vsi __builtin_vec_sel (vsi, vsi, vui);
>     VSEL_4SI  VSEL_4SI_U
>   vui __builtin_vec_sel (vui, vui, vbi);
>     VSEL_4SI_UNS  VSEL_4SI_UB
>   vui __builtin_vec_sel (vui, vui, vui);
>     VSEL_4SI_UNS  VSEL_4SI_UU
>   vbi __builtin_vec_sel (vbi, vbi, vbi);
>     VSEL_4SI_UNS  VSEL_4SI_BB
>   vbi __builtin_vec_sel (vbi, vbi, vui);
> 
> These assume the control vector can only be of unsigned or bool type, and also allow the
> return type to be bool.  This aligns with what the PVIPR defines, so here we should have:
> 
> vsq __builtin_vec_sel (vsq, vsq, vbq);
> vsq __builtin_vec_sel (vsq, vsq, vuq);
> vuq __builtin_vec_sel (vuq, vuq, vbq);
> vuq __builtin_vec_sel (vuq, vuq, vuq);
> vbq __builtin_vec_sel (vbq, vbq, vbq);
> vbq __builtin_vec_sel (vbq, vbq, vuq);
> 
> Sorry that I didn't find this in the previous review.

Yea, my bad I missed that as well.  Fixed to add all six instances.
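
As a sanity check on the expected values in the test, the select operation can be sketched as a 64-bit scalar model.  This is only an illustration: model_sel is a made-up name, and the model covers just the low 64 bits, which is all the test initializers set.

```c
#include <stdint.h>

/* Hypothetical scalar model of the bitwise select done by vec_sel:
   each result bit is taken from b where the control bit c is 1,
   and from a where the control bit is 0.  */
uint64_t model_sel (uint64_t a, uint64_t b, uint64_t c)
{
  return (a & ~c) | (b & c);
}
```

Fed the signed and unsigned inputs from the test file, the model gives 0x32147658ba9cfed0 and 0x0307CFCF47439A9A, matching the expected results.  Since the operation is purely bitwise, the signedness of the control vector does not change the result, only which overloaded instance is selected.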
> 
> 
>>  ; The following variants are deprecated.
>>    vsll __builtin_vec_sel (vsll, vsll, vsll);
>>      VSEL_2DI_B  VSEL_2DI_S
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index b88e61641a2..0756230b19e 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -21372,6 +21372,18 @@ Additional built-in functions are available for the 64-bit PowerPC
>>  family of processors, for efficient use of 128-bit floating point
>>  (@code{__float128}) values.
>>  
>> +Vector select
>> +
>> +@smallexample
>> +vector signed __int128 vec_sel (vector signed __int128,
>> +               vector signed __int128, vector signed __int128);
>> +vector unsigned __int128 vec_sel (vector unsigned __int128,
>> +               vector unsigned __int128, vector unsigned __int128);
>> +@end smallexample
> 
> As above, the documentation here has to consider vector bool __int128 and note that
> the control vector is of type either vector unsigned __int128 or vector bool __int128.
> 
>> +
>> +The instance is an extension of the existing overloaded built-in @code{vec_sel}
>> +that is documented in the PVIPR.
>> +
>>  @node Basic PowerPC Built-in Functions Available on ISA 2.06
>>  @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.06
>>  
>> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
>> new file mode 100644
>> index 00000000000..d82225cc847
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
>> @@ -0,0 +1,129 @@
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target vmx_hw } */
>> +/* { dg-options "-save-temps" } */
>> +/* { dg-final { scan-assembler-times "xxsel" 2 } } */
> 
> Nit: Can we rename this case to builtins-10.c and separate it into
> one dg-do compile and the other dg-do "run" (like some builtins-*.c
> which have *-x.c and *-x-runnable.c)?
> 
> It needs some more adjustments as the overloaded instances change.

Renamed the run test file to builtins-10-runnable.c.  Added tests for all six instances.

Created builtins-10.c as a compile-only test that checks for 6 xxsel instances.

> 
> BR,
> Kewen
> 
>> +
>> +#include <altivec.h>
>> +
>> +#define DEBUG 0
>> +
>> +#if DEBUG
>> +#include <stdio.h>
>> +void print_i128 (unsigned __int128 val)
>> +{
>> +  printf(" 0x%016llx%016llx",
>> +         (unsigned long long)(val >> 64),
>> +         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
>> +}
>> +#endif
>> +
>> +extern void abort (void);
>> +
>> +union convert_union {
>> +  vector signed __int128    s128;
>> +  vector unsigned __int128  u128;
>> +  char  val[16];
>> +} convert;
>> +
>> +int check_u128_result(vector unsigned __int128 vresult_u128,
>> +		      vector unsigned __int128 expected_vresult_u128)
>> +{
>> +  /* Use a for loop to check each byte manually so the test case will run
>> +     with ISA 2.06.
>> +
>> +     Return 1 if they match, 0 otherwise.  */
>> +
>> +  int i;
>> +
>> +  union convert_union result;
>> +  union convert_union expected;
>> +
>> +  result.u128 = vresult_u128;
>> +  expected.u128 = expected_vresult_u128;
>> +
>> +  /* Check if each byte of the result and expected match. */
>> +  for (i = 0; i < 16; i++)
>> +    {
>> +      if (result.val[i] != expected.val[i])
>> +	return 0;
>> +    }
>> +  return 1;
>> +}
>> +
>> +int check_s128_result(vector signed __int128 vresult_s128,
>> +		      vector signed __int128 expected_vresult_s128)
>> +{
>> +  /* Convert the arguments to unsigned, then check equality.  */
>> +  union convert_union result;
>> +  union convert_union expected;
>> +
>> +  result.s128 = vresult_s128;
>> +  expected.s128 = expected_vresult_s128;
>> +
>> +  return check_u128_result (result.u128, expected.u128);
>> +}
>> +
>> +
>> +int
>> +main (int argc, char *argv [])
>> +{
>> +  int i;
>> +  
>> +  vector signed __int128 src_va_s128;
>> +  vector signed __int128 src_vb_s128;
>> +  vector signed __int128 src_vc_s128;
>> +  vector signed __int128 vresult_s128;
>> +  vector signed __int128 expected_vresult_s128;
>> +
>> +  vector unsigned __int128 src_va_u128;
>> +  vector unsigned __int128 src_vb_u128;
>> +  vector unsigned __int128 src_vc_u128;
>> +  vector unsigned __int128 vresult_u128;
>> +  vector unsigned __int128 expected_vresult_u128;
>> +
>> +  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
>> +  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
>> +  src_vc_s128 = (vector signed __int128) {0x3333333333333333};
>> +  expected_vresult_s128 = (vector signed __int128) {0x32147658ba9cfed0};
>> +
>> +  /* Signed arguments.  */
>> +  vresult_s128 = vec_sel (src_va_s128, src_vb_s128, src_vc_s128);
>> +
>> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
>> +#if DEBUG
>> +    {
>> +      printf ("ERROR, vec_sel (src_va_s128, src_vb_s128, src_vc_s128) result does not match expected output.\n");
>> +      printf ("  Result:          ");
>> +      print_i128 ((unsigned __int128) vresult_s128);
>> +      printf ("\n  Expected result: ");
>> +      print_i128 ((unsigned __int128) expected_vresult_s128);
>> +      printf ("\n");
>> +    }
>> +#else
>> +    abort ();
>> +#endif
>> +
>> +  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
>> +  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
>> +  src_vc_u128 = (vector unsigned __int128) {0x5555555555555555};
>> +  expected_vresult_u128 = (vector unsigned __int128) {0x0307CFCF47439A9A};
>> +
>> +  /* Unsigned arguments.  */
>> +  vresult_u128 = vec_sel (src_va_u128, src_vb_u128, src_vc_u128);
>> +
>> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
>> +#if DEBUG
>> +    {
>> +      printf ("ERROR, vec_sel (src_va_u128, src_vb_u128, src_vc_u128) result does not match expected output.\n");
>> +      printf ("  Result:          ");
>> +      print_i128 ((unsigned __int128) vresult_u128);
>> +      printf ("\n  Expected result: ");
>> +      print_i128 ((unsigned __int128) expected_vresult_u128);
>> +      printf ("\n");
>> +    }
>> +#else
>> +    abort ();
>> +#endif
>> +
>> +    return 0;
>> +}
> 


* Re: [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args
  2024-06-04  5:58   ` Kewen.Lin
@ 2024-06-13 15:35     ` Carl Love
  0 siblings, 0 replies; 30+ messages in thread
From: Carl Love @ 2024-06-13 15:35 UTC (permalink / raw)
  To: Kewen.Lin, cel@us.ibm.com >> Carl Love IMAP
  Cc: gcc-patches, Segher Boessenkool, bergner

Kewen:

On 6/3/24 22:58, Kewen.Lin wrote:
> Hi,
> 
> on 2024/5/30 00:10, Carl Love wrote:
>>  This was patch 10 from the previous series.  The patch was updated to address feedback comments.
>>
>>                             Carl 
>> ---------------------------------------------------
>>
>> rs6000, extend vec_xxpermdi built-in for __int128 args
>>
>> Add new signed and unsigned overloaded instances for vec_xxpermdi:
>>
>>    __int128 vec_xxpermdi (__int128, __int128, const int);
>>    __uint128 vec_xxpermdi (__uint128, __uint128, const int);
>>
>> Update the documentation to include a reference to the new built-in
>> instances.
>>
>> Add test cases for the new overloaded instances.
>>
>> gcc/ChangeLog:
>> 	* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
>> 	overloaded built-in instances.
>> 	* doc/extend.texi:  Add documentation for new overloaded built-in
>> 	instances.
>>
>> gcc/testsuite/ChangeLog:
>> 	* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.
>> ---
>>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>>  gcc/doc/extend.texi                           |   2 +
>>  .../powerpc/vec_perm-runnable-i128.c          | 229 ++++++++++++++++++
>>  3 files changed, 235 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
>>
>> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
>> index a210c5ad10d..45000f161e4 100644
>> --- a/gcc/config/rs6000/rs6000-overload.def
>> +++ b/gcc/config/rs6000/rs6000-overload.def
>> @@ -4932,6 +4932,10 @@
>>      XXPERMDI_4SF  XXPERMDI_VF
>>    vd __builtin_vsx_xxpermdi (vd, vd, const int);
>>      XXPERMDI_2DF  XXPERMDI_VD
>> +  vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
>> +    XXPERMDI_1TI  XXPERMDI_1TI
>> +  vuq __builtin_vsx_xxpermdi (vuq, vuq, const int);
>> +    XXPERMDI_1TI  XXPERMDI_1TUI
> 
> Nits:
>   - Move them before "vf __builtin_vsx_xxpermdi (vf, vf, const int);" so
>     they are close to instances for other integral types.
>   - Following the existing naming convention, _{SQ,UQ} suffixes are better.
> 
>     vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
>        XXPERMDI_1TI  XXPERMDI_1SQ
>     vuq __builtin_vsx_xxpermdi (vuq, vuq, const int);
>        XXPERMDI_1TI  XXPERMDI_1UQ
> 

OK, moved the definitions up and changed the names.
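
For anyone following along, the doubleword selection done by the xxpermdi instruction can be sketched as a scalar model.  This is an assumption-laden illustration: model_v2dw and model_xxpermdi are made-up names, dw[0] stands for doubleword 0 in the ISA (big-endian) numbering, and the sketch says nothing about how the built-in maps element order on LE.

```c
#include <stdint.h>

/* Hypothetical model type: a 128-bit value viewed as two doublewords,
   with dw[0] being doubleword 0 in ISA (big-endian) numbering.  */
typedef struct
{
  uint64_t dw[2];
} model_v2dw;

/* Model of xxpermdi: the high bit of the 2-bit selector picks the
   doubleword taken from a, the low bit picks the doubleword taken
   from b.  */
model_v2dw model_xxpermdi (model_v2dw a, model_v2dw b, unsigned int dm)
{
  model_v2dw r;

  r.dw[0] = a.dw[(dm >> 1) & 1];
  r.dw[1] = b.dw[dm & 1];
  return r;
}
```

For a 128-bit operand this means the result is always assembled from one half of each source, so the four selector values give the four possible combinations of halves.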

>>  
>>  [VEC_XXSLDWI, vec_xxsldwi, __builtin_vsx_xxsldwi]
>>    vsc __builtin_vsx_xxsldwi (vsc, vsc, const int);
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 0756230b19e..edfef1bdab7 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -22555,6 +22555,8 @@ void vec_vsx_st (vector bool char, int, signed char *);
>>  vector double vec_xxpermdi (vector double, vector double, const int);
>>  vector float vec_xxpermdi (vector float, vector float, const int);
>>  vector long long vec_xxpermdi (vector long long, vector long long, const int);
> 
>> +vector __int128 vec_xxpermdi (vector __int128, vector __int128, const int);
>> +vector __int128 vec_xxpermdi (vector __uint128, vector __uint128, const int);
> 
> Nit: These two lines break the long long and unsigned long long lines, can you move
> them one line upward?  Also using the explicit "signed" and "unsigned" would be
> better than "__{u,}int128".
> 

Yup, I didn't get them in the right place.  Fixed.

>>  vector unsigned long long vec_xxpermdi (vector unsigned long long,
>>                                          vector unsigned long long, const int);
>>  vector int vec_xxpermdi (vector int, vector int, const int);
>> diff --git a/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
>> new file mode 100644
>> index 00000000000..2d5dce09404
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vec_perm-runnable-i128.c
>> @@ -0,0 +1,229 @@
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target vmx_hw } */
>> +/* { dg-options "-save-temps" } */
> 
> Nit: dg-options line isn't needed as it doesn't check assembly.

Removed the save-temps.

> 
> BR,
> Kewen
> 
>> +
>> +#include <altivec.h>
>> +
>> +#define DEBUG 0
>> +
>> +#if DEBUG
>> +#include <stdio.h>
>> +void print_i128 (unsigned __int128 val)
>> +{
>> +  printf(" 0x%016llx%016llx",
>> +         (unsigned long long)(val >> 64),
>> +         (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
>> +}
>> +#endif
>> +
>> +extern void abort (void);
>> +
>> +union convert_union {
>> +  vector signed __int128    s128;
>> +  vector unsigned __int128  u128;
>> +  char  val[16];
>> +} convert;
>> +
>> +int check_u128_result(vector unsigned __int128 vresult_u128,
>> +		      vector unsigned __int128 expected_vresult_u128)
>> +{
>> +  /* Use a for loop to check each byte manually so the test case will
>> +     run with ISA 2.06.
>> +
>> +     Return 1 if they match, 0 otherwise.  */
>> +
>> +  int i;
>> +
>> +  union convert_union result;
>> +  union convert_union expected;
>> +
>> +  result.u128 = vresult_u128;
>> +  expected.u128 = expected_vresult_u128;
>> +
>> +  /* Check if each byte of the result and expected match. */
>> +  for (i = 0; i < 16; i++)
>> +    {
>> +      if (result.val[i] != expected.val[i])
>> +	return 0;
>> +    }
>> +  return 1;
>> +}
>> +
>> +int check_s128_result(vector signed __int128 vresult_s128,
>> +		      vector signed __int128 expected_vresult_s128)
>> +{
>> +  /* Convert the arguments to unsigned, then check equality.  */
>> +  union convert_union result;
>> +  union convert_union expected;
>> +
>> +  result.s128 = vresult_s128;
>> +  expected.s128 = expected_vresult_s128;
>> +
>> +  return check_u128_result (result.u128, expected.u128);
>> +}
>> +
>> +
>> +int
>> +main (int argc, char *argv [])
>> +{
>> +  int i;
>> +  
>> +  vector signed __int128 src_va_s128;
>> +  vector signed __int128 src_vb_s128;
>> +  vector signed __int128 vresult_s128;
>> +  vector signed __int128 expected_vresult_s128;
>> +
>> +  vector unsigned __int128 src_va_u128;
>> +  vector unsigned __int128 src_vb_u128;
>> +  vector unsigned __int128 src_vc_u128;
>> +  vector unsigned __int128 vresult_u128;
>> +  vector unsigned __int128 expected_vresult_u128;
>> +
>> +  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
>> +  src_va_s128 = src_va_s128 << 64; 
>> +  src_va_s128 |= (vector signed __int128) {0x22446688AACCEE00};
>> +  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
>> +  src_vb_s128 = src_vb_s128 << 64;
>> +  src_vb_s128 |= (vector signed __int128) {0x3333333333333333};
>> +
>> +  src_va_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
>> +  src_va_u128 = src_va_u128 << 64;
>> +  src_va_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
>> +  src_vb_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
>> +  src_vb_u128 = src_vb_u128 << 64;
>> +  src_vb_u128 |= (vector unsigned __int128) {0x5555555555555555};
>> +
>> +
>> +  /* Signed 128-bit arguments.  */
>> +  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x1);
>> +
>> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
>> +  /* BE expected results  */
>> +  expected_vresult_s128 = (vector signed __int128) {0x123456789ABCDEF0};
>> +  expected_vresult_s128 = expected_vresult_s128 << 64;
>> +  expected_vresult_s128 |= (vector signed __int128) {0x3333333333333333};
>> +#else
>> +  /* LE expected results  */
>> +  expected_vresult_s128 = (vector signed __int128) {0xFEDCBA9876543210};
>> +  expected_vresult_s128 = expected_vresult_s128 << 64;
>> +  expected_vresult_s128 |= (vector signed __int128) {0x22446688AACCEE00};
>> +#endif
>> +
>> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
>> +#if DEBUG
>> +    {
>> +      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x1) result does not match expected output.\n");
>> +      printf ("  src_va_s128:     ");
>> +      print_i128 ((unsigned __int128) src_va_s128);
>> +      printf ("\n  src_vb_s128:     ");
>> +      print_i128 ((unsigned __int128) src_vb_s128);
>> +      printf ("\n  Result:          ");
>> +      print_i128 ((unsigned __int128) vresult_s128);
>> +      printf ("\n  Expected result: ");
>> +      print_i128 ((unsigned __int128) expected_vresult_s128);
>> +      printf ("\n");
>> +    }
>> +#else
>> +    abort ();
>> +#endif
>> +
>> +  vresult_s128 = vec_xxpermdi (src_va_s128, src_vb_s128, 0x2);
>> +
>> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
>> +  /* BE expected results  */
>> +  expected_vresult_s128 = (vector signed __int128) {0x22446688AACCEE00};
>> +  expected_vresult_s128 = expected_vresult_s128 << 64;
>> +  expected_vresult_s128 |= (vector signed __int128) {0xFEDCBA9876543210};
>> +#else
>> +  /* LE expected results  */
>> +  expected_vresult_s128 = (vector signed __int128) {0x3333333333333333};
>> +  expected_vresult_s128 = expected_vresult_s128 << 64;
>> +  expected_vresult_s128 |= (vector signed __int128) {0x123456789ABCDEF0};
>> +#endif
>> +
>> +  if (!check_s128_result (vresult_s128, expected_vresult_s128))
>> +#if DEBUG
>> +    {
>> +      printf ("ERROR, vec_xxpermdi (src_va_s128, src_vb_s128, 0x2) result does not match expected output.\n");
>> +      printf ("  src_va_s128:     ");
>> +      print_i128 ((unsigned __int128) src_va_s128);
>> +      printf ("\n  src_vb_s128:     ");
>> +      print_i128 ((unsigned __int128) src_vb_s128);
>> +      printf ("\n  Result:          ");
>> +      print_i128 ((unsigned __int128) vresult_s128);
>> +      printf ("\n  Expected result: ");
>> +      print_i128 ((unsigned __int128) expected_vresult_s128);
>> +      printf ("\n");
>> +    }
>> +#else
>> +    abort ();
>> +#endif
>> +
>> +  /* Unsigned arguments.  */
>> +  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x1);
>> +
>> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
>> +  /* BE expected results */
>> +  expected_vresult_u128 = (vector unsigned __int128) {0x13579ACE02468BDF};
>> +  expected_vresult_u128 = expected_vresult_u128 << 64;
>> +  expected_vresult_u128 |= (vector unsigned __int128) {0x5555555555555555};
>> +#else
>> +  /* LE expected results */
>> +  expected_vresult_u128 = (vector unsigned __int128) {0xA987654FEDCB3210};
>> +  expected_vresult_u128 = expected_vresult_u128 << 64;
>> +  expected_vresult_u128 |= (vector unsigned __int128) {0x1133557799BBDD00};
>> +#endif
>> +
>> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
>> +#if DEBUG
>> +    {
>> +      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x1) result does not match expected output.\n");
>> +      printf ("  src_va_u128:     ");
>> +      print_i128 ((unsigned __int128) src_va_u128);
>> +      printf ("\n  src_vb_u128:     ");
>> +      print_i128 ((unsigned __int128) src_vb_u128);
>> +      printf ("\n  Result:          ");
>> +      print_i128 ((unsigned __int128) vresult_u128);
>> +      printf ("\n  Expected result: ");
>> +      print_i128 ((unsigned __int128) expected_vresult_u128);
>> +      printf ("\n");
>> +    }
>> +#else
>> +    abort ();
>> +#endif
>> +
>> +  /* Unsigned arguments.  */
>> +  vresult_u128 = vec_xxpermdi (src_va_u128, src_vb_u128, 0x2);
>> +
>> +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
>> +  /* BE expected results */
>> +  expected_vresult_u128 = (vector unsigned __int128) {0x1133557799BBDD00};
>> +  expected_vresult_u128 = expected_vresult_u128 << 64;
>> +  expected_vresult_u128 |= (vector unsigned __int128) {0xA987654FEDCB3210};
>> +#else
>> +  /* LE expected results */
>> +  expected_vresult_u128 = (vector unsigned __int128) {0x5555555555555555};
>> +  expected_vresult_u128 = expected_vresult_u128 << 64;
>> +  expected_vresult_u128 |= (vector unsigned __int128) {0x13579ACE02468BDF};
>> +#endif
>> +  
>> +  if (!check_u128_result (vresult_u128, expected_vresult_u128))
>> +#if DEBUG
>> +    {
>> +      printf ("ERROR, vec_xxpermdi (src_va_u128, src_vb_u128, 0x2) result does not match expected output.\n");
>> +      printf ("  src_va_u128:     ");
>> +      print_i128 ((unsigned __int128) src_va_u128);
>> +      printf ("\n  src_vb_u128:     ");
>> +      print_i128 ((unsigned __int128) src_vb_u128);
>> +      printf ("\n  Result:          ");
>> +      print_i128 ((unsigned __int128) vresult_u128);
>> +      printf ("\n  Expected result: ");
>> +      print_i128 ((unsigned __int128) expected_vresult_u128);
>> +      printf ("\n");
>> +    }
>> +#else
>> +    abort ();
>> +#endif
>> +
>> +    return 0;
>> +}
> 
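
The BE and LE expected values in the test above can be cross-checked with a small portable model of the doubleword selection.  This is only a sketch, not the compiler's implementation: the names struct hi_lo and xxpermdi_model are hypothetical, and the model assumes that bit 1 of the 2-bit selector picks a doubleword element of the first operand and bit 0 picks one of the second, with element 0 being the most significant doubleword on big endian and the least significant doubleword on little endian.  Under that assumption it reproduces all four expected results used in the test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model type: a 128-bit value as high/low 64-bit halves.  */
struct hi_lo
{
  uint64_t hi;
  uint64_t lo;
};

/* Model of vec_xxpermdi (a, b, dm) on doubleword elements (assumed
   semantics, see the lead-in).  If big_endian is nonzero, element 0 is
   the high doubleword; otherwise element 0 is the low doubleword.  */
struct hi_lo
xxpermdi_model (struct hi_lo a, struct hi_lo b, int dm, int big_endian)
{
  uint64_t ea[2], eb[2], er[2];
  struct hi_lo r;

  assert (dm >= 0 && dm <= 3);

  /* Convert high/low halves to element order.  */
  ea[0] = big_endian ? a.hi : a.lo;
  ea[1] = big_endian ? a.lo : a.hi;
  eb[0] = big_endian ? b.hi : b.lo;
  eb[1] = big_endian ? b.lo : b.hi;

  /* Result element 0 comes from a, result element 1 comes from b.  */
  er[0] = ea[(dm >> 1) & 1];
  er[1] = eb[dm & 1];

  /* Convert element order back to high/low halves.  */
  r.hi = big_endian ? er[0] : er[1];
  r.lo = big_endian ? er[1] : er[0];
  return r;
}
```

Calling xxpermdi_model with the signed source values from the test and selector 0x1 yields the high half of the first operand and the low half of the second operand in the big endian case, which is the BE expected result in the test.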


* Re: [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins.
  2024-06-04  5:59   ` Kewen.Lin
@ 2024-06-13 15:35     ` Carl Love
  0 siblings, 0 replies; 30+ messages in thread
From: Carl Love @ 2024-06-13 15:35 UTC (permalink / raw)
  To: Kewen.Lin, Carl Love; +Cc: gcc-patches, Segher Boessenkool, bergner

Kewen:

On 6/3/24 22:59, Kewen.Lin wrote:
> Hi,
> 
> on 2024/5/30 00:16, Carl Love wrote:
>> This was patch 13 from the previous series.  Note the previous series patch 12 was dropped.  This patch is the same as the previous version.  The additional work to replace __builtin_vec_set_v1ti, __builtin_vec_set_v2di and __builtin_vec_set_v2df with equivalent gimple code, per the feedback comments, is being deferred to a future patch.  The goal of this series was simply to remove duplicated built-ins, extending overloaded built-ins as needed.  Adding the needed gimple code to remove the additional built-ins is beyond the goal of this patch series.
>>
>>                              Carl 
>> -------------------------------------------------------
>>
>> rs6000, remove vector set and vector init built-ins.
>>
>> The vector init built-ins:
>>
>>   __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
>>   __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
>>   __builtin_vec_init_v2di, __builtin_vec_init_v2df,
>>   __builtin_vec_set_v1ti
> 
> Typo here, s/__builtin_vec_set_v1ti/__builtin_vec_init_v1ti/

Fixed.

> 
>>
>> perform the same operation as initializing the vector in C code.  For
>> example:
>>
>>   result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
>>   result_v4si = {1, 2, 3, 4};
>>
>> These two constructs were tested and verified to generate identical
>> assembly instructions both with no optimization and with -O3.
>>
>> The vector set built-ins:
>>
>>   __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
>>   __builtin_vec_set_v4si, __builtin_vec_set_v4sf
> 
> Please also add the reserved ones (...v1ti/v2di/v2df), as they are the 
> same too, temporarily reserving them for the uses in resolve_vec_insert()
> doesn't affect this.

Added the three additional built-ins to the list.

> 
>>
>> perform the same operation as setting a specific element in the vector in
>> C code.  For example:
>>
>>   src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
>>   src_v4si[index] = int_val;
>>
>> With no optimization the built-in actually generates more instructions
>> than the inline C code; with -O3 the generated code is identical.
>>
>> All of the above built-ins that are removed do not have test cases and
>> are not documented.
>>
>> Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
>> __builtin_vec_set_v2df are not removed as they are used in function
>> resolve_vec_insert() in file rs6000-c.cc.
>>
>> The built-ins are removed as they don't provide any benefit over just
>> using C code.
>>
>> gcc/ChangeLog:
>> 	* config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
>> 	__builtin_vec_init_v8hi, __builtin_vec_init_v4si,
>> 	__builtin_vec_init_v4sf, __builtin_vec_init_v2di,
>> 	__builtin_vec_init_v2df, __builtin_vec_set_v1ti,
> 
> Typo, s/__builtin_vec_set_v1ti/__builtin_vec_init_v1ti/

Fixed

> 
>> 	__builtin_vec_set_v16qi, __builtin_vec_set_v8hi.
>> 	__builtin_vec_set_v4si, __builtin_vec_set_v4sf,
>> 	__builtin_vec_set_v2di, __builtin_vec_set_v2df,
>> 	__builtin_vec_set_v1ti): Remove built-in definitions.
> 
> The last three ones are not actually removed.

OK, fixed.

> 
>> ---
>>  gcc/config/rs6000/rs6000-builtins.def | 42 ++-------------------------
>>  1 file changed, 2 insertions(+), 40 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>> index 48ebc018a8d..8349d45169f 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1118,37 +1118,6 @@
>>    const signed short __builtin_vec_ext_v8hi (vss, signed int);
>>      VEC_EXT_V8HI nothing {extract}
>>  
>> -  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
>> -            signed char, signed char, signed char, signed char, signed char, \
>> -            signed char, signed char, signed char, signed char, signed char, \
>> -            signed char, signed char, signed char);
>> -    VEC_INIT_V16QI nothing {init}
>> -
>> -  const vf __builtin_vec_init_v4sf (float, float, float, float);
>> -    VEC_INIT_V4SF nothing {init}
>> -
>> -  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, \
>> -                                     signed int);
>> -    VEC_INIT_V4SI nothing {init}
>> -
>> -  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short,\
>> -             signed short, signed short, signed short, signed short, \
>> -             signed short);
>> -    VEC_INIT_V8HI nothing {init}
>> -
>> -  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
>> -    VEC_SET_V16QI nothing {set}
>> -
>> -  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
>> -    VEC_SET_V4SF nothing {set}
>> -
>> -  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
>> -    VEC_SET_V4SI nothing {set}
>> -
>> -  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
>> -    VEC_SET_V8HI nothing {set}
>> -
>> -
>>  ; Cell builtins.
>>  [cell]
>>    pure vsc __builtin_altivec_lvlx (signed long, const void *);
>> @@ -1295,15 +1264,8 @@
>>    const signed long long __builtin_vec_ext_v2di (vsll, signed int);
>>      VEC_EXT_V2DI nothing {extract}
>>  
>> -  const vsq __builtin_vec_init_v1ti (signed __int128);
>> -    VEC_INIT_V1TI nothing {init}
>> -
>> -  const vd __builtin_vec_init_v2df (double, double);
>> -    VEC_INIT_V2DF nothing {init}
>> -
>> -  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
>> -    VEC_INIT_V2DI nothing {init}
>> -
>> +;; VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI are used in
>> +;; resolve_vec_insert(), rs6000-c.cc
> 
> It would be good to place one TODO here, something like:

Added comment.

> 
> ;; TODO: Remove VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI once the uses
> ;; in resolve_vec_insert are replaced by the equivalent gimple statements.
> 
>>    const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
>>      VEC_SET_V1TI nothing {set}
>>  
> 
> BR,
> Kewen
> 
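
As an illustration of the "just use C code" replacements described in the commit message, here is a minimal sketch.  It assumes GCC's generic vector extension (vector_size) as a stand-in for "vector signed int" so that it builds without altivec.h; the type name v4si_t and the helpers init_v4si and set_v4si are hypothetical and are not part of the patch.

```c
#include <assert.h>

/* Generic GCC vector of four ints; assumed stand-in for
   "vector signed int" on PowerPC.  */
typedef int v4si_t __attribute__ ((vector_size (16)));

/* Plain C equivalent of __builtin_vec_init_v4si (a, b, c, d).  */
v4si_t
init_v4si (int a, int b, int c, int d)
{
  v4si_t v = {a, b, c, d};
  return v;
}

/* Plain C equivalent of __builtin_vec_set_v4si (v, val, index).
   The removed built-in required a constant index in the range 0..3;
   the subscript form also accepts a run-time index.  */
v4si_t
set_v4si (v4si_t v, int val, int index)
{
  assert (index >= 0 && index < 4);
  v[index] = val;
  return v;
}
```

For example, set_v4si (init_v4si (1, 2, 3, 4), 42, 2) gives a vector whose elements are 1, 2, 42 and 4.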


Thread overview: 30+ messages
2024-05-29 15:48 [PATCH 0/13 ver 3] rs6000, built-in cleanup patch series Carl Love
2024-05-29 15:52 ` [PATCH 1/13 ver 3] s6000, Remove __builtin_vsx_cmple* builtins Carl Love
2024-06-04  6:00   ` [PATCH 1/13 ver 3] rs6000, " Kewen.Lin
2024-06-05 22:25     ` Carl Love
2024-06-06  2:40       ` Kewen.Lin
2024-05-29 15:55 ` [PATCH 2/13 ver 3] rs6000, Remove __builtin_vsx_xvcvspsxws built-in Carl Love
2024-05-29 15:56 ` [PATCH 3/13 ver 3] rs6000, fix error in unsigned vector float to unsigned int built-in definition Carl Love
2024-06-04  5:58   ` Kewen.Lin
2024-05-29 15:58 ` [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins Carl Love
2024-06-04  7:19   ` Kewen.Lin
2024-06-13 15:35     ` Carl Love
2024-05-29 16:00 ` [PATCH 5/13 ver 3] rs6000, Remove redundant float/double type conversions Carl Love
2024-06-04  6:20   ` Kewen.Lin
2024-05-29 16:01 ` [PATCH 6/13 ver 3] rs6000, remove duplicated built-ins of vecmergl and, vec_mergeh Carl Love
2024-05-29 16:03 ` [PATCH 7/13 ver 3] rs6000, add overloaded vec_sel with int128 arguments Carl Love
2024-06-04  5:58   ` Kewen.Lin
2024-06-13 15:35     ` Carl Love
2024-05-29 16:05 ` [PATCH 8/13 ver 3] rs6000, remove the vec_xxsel built-ins, they are, duplicates Carl Love
2024-06-04  5:58   ` Kewen.Lin
2024-05-29 16:06 ` [PATCH 9/13 ver 3] rs6000, remove __builtin_vsx_vperm_* built-ins Carl Love
2024-06-04  5:58   ` Kewen.Lin
2024-05-29 16:08 ` [PATCH 10/13 ver 3] rs6000, remove __builtin_vsx_xvnegdp and, __builtin_vsx_xvnegsp built-ins Carl Love
2024-05-29 16:10 ` [PATCH 11/13 ver 3] rs6000, extend vec_xxpermdi built-in for __int128 args Carl Love
2024-06-04  5:58   ` Kewen.Lin
2024-06-13 15:35     ` Carl Love
2024-05-29 16:11 ` [PATCH 12/13 ver 3] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in Carl Love
2024-06-04  5:59   ` Kewen.Lin
2024-05-29 16:16 ` [PATCH 13/13 ver 3] rs6000, remove vector set and vector init built-ins Carl Love
2024-06-04  5:59   ` Kewen.Lin
2024-06-13 15:35     ` Carl Love
