public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
@ 2023-05-10 18:06 Carl Love
  2023-05-18 21:28 ` Peter Bergner
  2023-05-22  9:04 ` Kewen.Lin
  0 siblings, 2 replies; 12+ messages in thread
From: Carl Love @ 2023-05-10 18:06 UTC (permalink / raw)
  To: Segher Boessenkool, gcc-patches; +Cc: Peter Bergner, cel

GCC maintainers:

The following patch fixes errors in the arguments in the
__builtin_altivec_tr_stxvrhx,	__builtin_altivec_tr_stxvrwx builtin
definitions.  Note, these builtins are used by the overloaded
__builtin_vec_xst_trunc builtin.

The patch adds a new overloaded builtin definition for
__builtin_vec_xst_trunc for the third argument to be unsigned and
signed long int.

A new testcase is added for the various overloaded versions of
__builtin_vec_xst_trunc.

The patch has been tested on Power 10 with no new regressions.

Please let me know if the patch is acceptable for mainline.  Thanks.

                    Carl

-------------------------------------------
rs6000: Fix __builtin_vec_xst_trunc definition

Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
and __builtin_altivec_tr_stxvrwx to handle the short and word cases.  The
arguments for these two builtins are wrong.  This patch fixes the wrong
arguments for the builtins.

Additionally, the patch adds a new __builtin_vec_xst_trunc overloaded
version for the destination being signed or unsigned long int.

A runnable test case is added to test each of the overloaded definitions
of __builtin_vec_xst_tru

gcc/
	* config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
	__builtin_altivec_tr_stxvrwx): Fix type of second argument.
	Add, definition for send argument to be signed long.
	* config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
	add definition with thrird arument signed and unsigned long.
	* doc/extend.texi (__builtin_vec_xst_trunc): Add documentation for
	new unsinged long and signed long versions.

gcc/testsuite/
	* gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
	for __builtin_vec_xst_trunc builtin.
---
 gcc/config/rs6000/rs6000-builtins.def         |   7 +-
 gcc/config/rs6000/rs6000-overload.def         |   4 +
 gcc/doc/extend.texi                           |   2 +
 .../powerpc/vsx-builtin-vec_xst_trunc.c       | 217 ++++++++++++++++++
 4 files changed, 228 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 638d0bc72ca..a378491b358 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3161,12 +3161,15 @@
   void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
     TR_STXVRBX vsx_stxvrbx {stvec}
 
-  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
+  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
     TR_STXVRHX vsx_stxvrhx {stvec}
 
-  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
+  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
     TR_STXVRWX vsx_stxvrwx {stvec}
 
+  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long *);
+    TR_STXVRLX vsx_stxvrdx {stvec}
+
   void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
     TR_STXVRDX vsx_stxvrdx {stvec}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index c582490c084..54b7ae5e51b 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4872,6 +4872,10 @@
     TR_STXVRWX  TR_STXVRWX_S
   void __builtin_vec_xst_trunc (vuq, signed long long, unsigned int *);
     TR_STXVRWX  TR_STXVRWX_U
+  void __builtin_vec_xst_trunc (vsq, signed long long, signed long *);
+    TR_STXVRLX  TR_STXVRLX_S
+  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long *);
+    TR_STXVRLX  TR_STXVRLX_U
   void __builtin_vec_xst_trunc (vsq, signed long long, signed long long *);
     TR_STXVRDX  TR_STXVRDX_S
   void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long long *);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e426a2eb7d8..7e2ae790ab3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18570,10 +18570,12 @@ instructions.
 @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed long long, signed char *)}
 @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed short *)}
 @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed int *)}
+@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed long *)}
 @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed long long *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned char *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned short *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned int *)}
+@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned long *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned long long *)}
 
 Truncate and store the rightmost element of a vector, as if implemented by the
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
new file mode 100644
index 00000000000..7108109560d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
@@ -0,0 +1,217 @@
+/* Test of __builtin_vec_xst_trunc  */
+
+/* { dg-do run { target power10_hw } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+
+#include <altivec.h>
+#include <stdio.h>
+#include <inttypes.h>
+#include <string.h>
+#include <stdlib.h>
+
+#define DEBUG 0
+#define TRUE  1
+#define FALSE 0
+#define SIZE  4
+
+vector signed __int128 zero_vsint128 = {0x0};
+
+vector signed __int128 store_data[SIZE] = {
+{ (__int128) 0x79BD000000000000 << 64 | (__int128) 0x123456789abcdef8ULL},
+{ (__int128) 0x8ACE000000000000 << 64 | (__int128) 0xfedcba9876543217ULL},
+{ (__int128) 0x1357000000000000 << 64 | (__int128) 0xccccccccccccccccULL},
+{ (__int128) 0xf000000000000000 << 64 | (__int128) 0xaaaaaaaaaaaaaaaaULL}
+};
+
+signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC, 0xAA};
+signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217, 0xcccc, 0xaaaa, };
+signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217, 0xCCCCCCCC,
+					0xAAAAAAAA};
+signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8,
+							0xFEDCBA9876543217ULL,
+							0xCCCCCCCCCCCCCCCCULL,
+							0xAAAAAAAAAAAAAAAAULL};
+signed long long int signed_long_long_expected[SIZE] = {0x123456789ABCDEF8ULL,
+							0xFEDCBA9876543217ULL,
+							0xCCCCCCCCCCCCCCCCULL,
+							0xAAAAAAAAAAAAAAAAULL};
+
+union conv_t {
+  vector signed __int128 vsi128;
+  unsigned long long ull[2];
+  signed char schar[16];
+  signed __int128 s128;
+} conv;
+
+int check_expected_byte (signed char expected,
+			 signed char actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected half values don't match. \n");
+      printf ("  Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n",
+	      expected & 0xFF, actual & 0xFF);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_half (signed short int expected,
+			 signed short int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected short values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected & 0xFFFF, actual & 0xFFFF);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_int (signed int expected,
+			signed int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected int values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected, actual);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_long (signed long int expected,
+			 signed long int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected long values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected, actual);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_long_long (signed long long int expected,
+			      signed long long int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected long long values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected, actual);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+void print_store_data (vector signed __int128 *store_data, int size)
+{
+#if DEBUG
+  union conv_t val;
+  int i;
+  
+  for  (i = 0; i < size; i++)
+    {
+      val.vsi128 = store_data[i];
+      printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
+    }
+#endif
+}
+
+
+void print_raw_buffer (vector signed __int128 *rawbuffer, int size)
+{
+#if DEBUG
+  union conv_t val;
+  int i;
+  
+  for  (i = 0; i < size; i++)
+    {
+      val.vsi128 = rawbuffer[i];
+      printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
+    }
+#endif
+}
+
+int
+main () {
+  int i;
+  
+  vector signed __int128 rawbuffer[SIZE];
+  signed char * vsbuffer_char = (signed char *)rawbuffer;
+  signed short int * vsbuffer_short = (signed short int *)rawbuffer;
+  signed int * vsbuffer_int = (signed int *)rawbuffer;
+  signed long int * vsbuffer_long = (signed long *)rawbuffer;
+  signed long long int * vsbuffer_long_long = (signed long long *)rawbuffer;
+
+  for  (i = 0; i < SIZE; i++)
+    rawbuffer[i] = zero_vsint128;
+
+  print_store_data (store_data, SIZE);
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(char),
+			       vsbuffer_char);
+      check_expected_byte (signed_char_expected[i], vsbuffer_char[i]);
+    }
+  
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int),
+			       vsbuffer_short);
+      check_expected_half (signed_short_expected[i], vsbuffer_short[i]);
+    }
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(int),
+			       vsbuffer_int);
+      check_expected_int (signed_int_expected[i], vsbuffer_int[i]);
+    }
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int),
+      			       vsbuffer_long);
+      check_expected_long (signed_long_long_expected[i],
+			   vsbuffer_long[i]);
+    }
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long int),
+			       vsbuffer_long_long);
+      check_expected_long_long (signed_long_long_expected[i],
+				vsbuffer_long_long[i]);
+    }
+
+  print_raw_buffer (rawbuffer, SIZE);
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\mstxvrbx\M} } } */
+/* { dg-final { scan-assembler {\mstxvrhx\M} } } */
+/* { dg-final { scan-assembler {\mstxvrwx\M} } } */
+/* { dg-final { scan-assembler {\mstxvrdx\M} } } */
-- 
2.37.2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-10 18:06 [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition Carl Love
@ 2023-05-18 21:28 ` Peter Bergner
  2023-05-18 21:44   ` Carl Love
  2023-05-22  9:04 ` Kewen.Lin
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Bergner @ 2023-05-18 21:28 UTC (permalink / raw)
  To: Carl Love, Segher Boessenkool, gcc-patches

On 5/10/23 1:06 PM, Carl Love wrote:
> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>      TR_STXVRHX vsx_stxvrhx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>      TR_STXVRWX vsx_stxvrwx {stvec}

In my estimation, these two changes are "obvious" fixes.




> +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long *);
> +    TR_STXVRLX  TR_STXVRLX_S
> +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long *);
> +    TR_STXVRLX  TR_STXVRLX_U

Not a comment on these two changes, and not a request to expand this
specific patch, but I believe I saw other built-ins that were missing
signed long */unsigned long * versions where they could/should accept
them.  Can you double-check whether there are other built-ins that
need similar changes and if so, please post a separate patch to fix
those as well?  Thanks.

Peter


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-18 21:28 ` Peter Bergner
@ 2023-05-18 21:44   ` Carl Love
  0 siblings, 0 replies; 12+ messages in thread
From: Carl Love @ 2023-05-18 21:44 UTC (permalink / raw)
  To: Peter Bergner, Segher Boessenkool, gcc-patches; +Cc: cel

Peter:

On Thu, 2023-05-18 at 16:28 -0500, Peter Bergner wrote:
> 

<snip>

> 
> > +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long
> > *);
> > +    TR_STXVRLX  TR_STXVRLX_S
> > +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
> > long *);
> > +    TR_STXVRLX  TR_STXVRLX_U
> 
> Not a comment on these two changes, and not a request to expand this
> specific patch, but I believe I saw other built-ins that were missing
> signed long */unsigned long * versions where they could/should accept
> them.  Can you double-check whether there are other built-ins that
> need similar changes and if so, please post a separate patch to fix
> those as well?  Thanks.

OK, I will put that on my to do list to go look for that in other
builtins.  

                     Carl 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-10 18:06 [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition Carl Love
  2023-05-18 21:28 ` Peter Bergner
@ 2023-05-22  9:04 ` Kewen.Lin
  2023-05-22 19:50   ` Carl Love
                     ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Kewen.Lin @ 2023-05-22  9:04 UTC (permalink / raw)
  To: Carl Love; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches

Hi Carl,

on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
> GCC maintainers:
> 
> The following patch fixes errors in the arguments in the
> __builtin_altivec_tr_stxvrhx,	__builtin_altivec_tr_stxvrwx builtin
> definitions.  Note, these builtins are used by the overloaded
> __builtin_vec_xst_trunc builtin.
> 
> The patch adds a new overloaded builtin definition for
> __builtin_vec_xst_trunc for the third argument to be unsigned and
> signed long int.
> 
> A new testcase is added for the various overloaded versions of
> __builtin_vec_xst_trunc.
> 
> The patch has been tested on Power 10 with no new regressions.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                     Carl
> 
> -------------------------------------------
> rs6000: Fix __builtin_vec_xst_trunc definition
> 
> Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
> and __builtin_altivec_tr_stxvrwx to handle the short and word cases.  The
> arguments for these two builtins are wrong.  This patch fixes the wrong
> arguments for the builtins.
> 
> Additionally, the patch adds a new __builtin_vec_xst_trunc overloaded
> version for the destination being signed or unsigned long int.
> 
> A runnable test case is added to test each of the overloaded definitions
> of __builtin_vec_xst_tru
> 
> gcc/
> 	* config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
> 	__builtin_altivec_tr_stxvrwx): Fix type of second argument.
> 	Add, definition for send argument to be signed long.
> 	* config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
> 	add definition with thrird arument signed and unsigned long.
> 	* doc/extend.texi (__builtin_vec_xst_trunc): Add documentation for
> 	new unsinged long and signed long versions.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
> 	for __builtin_vec_xst_trunc builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   7 +-
>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>  gcc/doc/extend.texi                           |   2 +
>  .../powerpc/vsx-builtin-vec_xst_trunc.c       | 217 ++++++++++++++++++
>  4 files changed, 228 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..a378491b358 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -3161,12 +3161,15 @@
>    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
>      TR_STXVRBX vsx_stxvrbx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>      TR_STXVRHX vsx_stxvrhx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>      TR_STXVRWX vsx_stxvrwx {stvec}

Good catching!

>  
> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long *);
> +    TR_STXVRLX vsx_stxvrdx {stvec}
> +

This is mapped to the one used for type long long, it's a hard mapping,
IMHO it's wrong and not consistent with what the users expect, since on Power
the size of type long int is 4 bytes at -m32 while 8 bytes at -m64, this
implementation binding to 8 bytes can cause trouble in 32-bit.  I wonder if
it's a good idea to add one overloaded version for type long int, for now
openxl also emits error message for long int type pointer (see its doc [1]),
users can use casting to make it to the acceptable pointer types (long long
or int as its size).

[1] https://www.ibm.com/docs/en/openxl-c-and-cpp-lop/17.1.1?topic=functions-vec-xst-trunc


>    void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
>      TR_STXVRDX vsx_stxvrdx {stvec}
>  
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index c582490c084..54b7ae5e51b 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -4872,6 +4872,10 @@
>      TR_STXVRWX  TR_STXVRWX_S
>    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned int *);
>      TR_STXVRWX  TR_STXVRWX_U
> +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long *);
> +    TR_STXVRLX  TR_STXVRLX_S
> +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long *);
> +    TR_STXVRLX  TR_STXVRLX_U
>    void __builtin_vec_xst_trunc (vsq, signed long long, signed long long *);
>      TR_STXVRDX  TR_STXVRDX_S
>    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long long *);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e426a2eb7d8..7e2ae790ab3 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18570,10 +18570,12 @@ instructions.
>  @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed long long, signed char *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed short *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed int *)}
> +@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed long *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed long long *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned char *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned short *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned int *)}
> +@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned long *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned long long *)}
>  
>  Truncate and store the rightmost element of a vector, as if implemented by the
> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> new file mode 100644
> index 00000000000..7108109560d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> @@ -0,0 +1,217 @@
> +/* Test of __builtin_vec_xst_trunc  */
> +
> +/* { dg-do run { target power10_hw } } */
> +/* { dg-require-effective-target power10_ok } */

Normally _hw is stricter than _ok, you already check power10_hw,
I think power10_ok is useless here.

> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> +
> +#include <altivec.h>
> +#include <stdio.h>
> +#include <inttypes.h>
> +#include <string.h>
> +#include <stdlib.h>
> +
> +#define DEBUG 0
> +#define TRUE  1
> +#define FALSE 0
> +#define SIZE  4
> +
> +vector signed __int128 zero_vsint128 = {0x0};
> +
> +vector signed __int128 store_data[SIZE] = {
> +{ (__int128) 0x79BD000000000000 << 64 | (__int128) 0x123456789abcdef8ULL},
> +{ (__int128) 0x8ACE000000000000 << 64 | (__int128) 0xfedcba9876543217ULL},
> +{ (__int128) 0x1357000000000000 << 64 | (__int128) 0xccccccccccccccccULL},
> +{ (__int128) 0xf000000000000000 << 64 | (__int128) 0xaaaaaaaaaaaaaaaaULL}
> +};
> +
> +signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC, 0xAA};
> +signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217, 0xcccc, 0xaaaa, };
> +signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217, 0xCCCCCCCC,
> +					0xAAAAAAAA};
> +signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8,
> +							0xFEDCBA9876543217ULL,
> +							0xCCCCCCCCCCCCCCCCULL,
> +							0xAAAAAAAAAAAAAAAAULL};
> +signed long long int signed_long_long_expected[SIZE] = {0x123456789ABCDEF8ULL,
> +							0xFEDCBA9876543217ULL,
> +							0xCCCCCCCCCCCCCCCCULL,
> +							0xAAAAAAAAAAAAAAAAULL};
> +
> +union conv_t {
> +  vector signed __int128 vsi128;
> +  unsigned long long ull[2];
> +  signed char schar[16];
> +  signed __int128 s128;
> +} conv;
> +
> +int check_expected_byte (signed char expected,
> +			 signed char actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected half values don't match. \n");
> +      printf ("  Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n",
> +	      expected & 0xFF, actual & 0xFF);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_half (signed short int expected,
> +			 signed short int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected short values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +	      expected & 0xFFFF, actual & 0xFFFF);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_int (signed int expected,
> +			signed int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected int values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +	      expected, actual);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_long (signed long int expected,
> +			 signed long int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected long values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +	      expected, actual);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_long_long (signed long long int expected,
> +			      signed long long int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected long long values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +	      expected, actual);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +void print_store_data (vector signed __int128 *store_data, int size)
> +{
> +#if DEBUG
> +  union conv_t val;
> +  int i;
> +  
> +  for  (i = 0; i < size; i++)
> +    {
> +      val.vsi128 = store_data[i];
> +      printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
> +    }
> +#endif
> +}
> +
> +
> +void print_raw_buffer (vector signed __int128 *rawbuffer, int size)
> +{
> +#if DEBUG
> +  union conv_t val;
> +  int i;
> +  
> +  for  (i = 0; i < size; i++)
> +    {
> +      val.vsi128 = rawbuffer[i];
> +      printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
> +    }
> +#endif
> +}
> +
> +int
> +main () {
> +  int i;
> +  
> +  vector signed __int128 rawbuffer[SIZE];
> +  signed char * vsbuffer_char = (signed char *)rawbuffer;
> +  signed short int * vsbuffer_short = (signed short int *)rawbuffer;
> +  signed int * vsbuffer_int = (signed int *)rawbuffer;
> +  signed long int * vsbuffer_long = (signed long *)rawbuffer;
> +  signed long long int * vsbuffer_long_long = (signed long long *)rawbuffer;
> +
> +  for  (i = 0; i < SIZE; i++)
> +    rawbuffer[i] = zero_vsint128;
> +
> +  print_store_data (store_data, SIZE);
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(char),
> +			       vsbuffer_char);
> +      check_expected_byte (signed_char_expected[i], vsbuffer_char[i]);
> +    }
> +  
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int),
> +			       vsbuffer_short);
> +      check_expected_half (signed_short_expected[i], vsbuffer_short[i]);
> +    }
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(int),
> +			       vsbuffer_int);
> +      check_expected_int (signed_int_expected[i], vsbuffer_int[i]);
> +    }
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int),
> +      			       vsbuffer_long);
> +      check_expected_long (signed_long_long_expected[i],
> +			   vsbuffer_long[i]);
> +    }
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long int),
> +			       vsbuffer_long_long);
> +      check_expected_long_long (signed_long_long_expected[i],
> +				vsbuffer_long_long[i]);
> +    }
> +
> +  print_raw_buffer (rawbuffer, SIZE);
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\mstxvrbx\M} } } */
> +/* { dg-final { scan-assembler {\mstxvrhx\M} } } */
> +/* { dg-final { scan-assembler {\mstxvrwx\M} } } */
> +/* { dg-final { scan-assembler {\mstxvrdx\M} } } */

Could you check the exact times instead?

BR,
Kewen


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-22  9:04 ` Kewen.Lin
@ 2023-05-22 19:50   ` Carl Love
  2023-05-23  3:36     ` Kewen.Lin
  2023-05-22 19:50   ` [PATCH ver 2] " Carl Love
  2023-05-31 17:59   ` [PATCH] " Peter Bergner
  2 siblings, 1 reply; 12+ messages in thread
From: Carl Love @ 2023-05-22 19:50 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches, cel

On Mon, 2023-05-22 at 17:04 +0800, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
> > GCC maintainers:
> > 
> > The following patch fixes errors in the arguments in the
> > __builtin_altivec_tr_stxvrhx,	__builtin_altivec_tr_stxvrwx
> > builtin
> > definitions.  Note, these builtins are used by the overloaded
> > __builtin_vec_xst_trunc builtin.
> > 
> > The patch adds a new overloaded builtin definition for
> > __builtin_vec_xst_trunc for the third argument to be unsigned and
> > signed long int.
> > 
> > A new testcase is added for the various overloaded versions of
> > __builtin_vec_xst_trunc.
> > 
> > The patch has been tested on Power 10 with no new regressions.
> > 
> > Please let me know if the patch is acceptable for
> > mainline.  Thanks.
> > 
> >                     Carl
> > 
> > -------------------------------------------
> > rs6000: Fix __builtin_vec_xst_trunc definition
> > 
> > Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
> > and __builtin_altivec_tr_stxvrwx to handle the short and word
> > cases.  The
> > arguments for these two builtins are wrong.  This patch fixes the
> > wrong
> > arguments for the builtins.
> > 
> > Additionally, the patch adds a new __builtin_vec_xst_trunc
> > overloaded
> > version for the destination being signed or unsigned long int.
> > 
> > A runnable test case is added to test each of the overloaded
> > definitions
> > of __builtin_vec_xst_tru
> > 
> > gcc/
> > 	* config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
> > 	__builtin_altivec_tr_stxvrwx): Fix type of second argument.
> > 	Add, definition for send argument to be signed long.
> > 	* config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
> > 	add definition with thrird arument signed and unsigned long.
> > 	* doc/extend.texi (__builtin_vec_xst_trunc): Add documentation
> > for
> > 	new unsinged long and signed long versions.
> > 
> > gcc/testsuite/
> > 	* gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
> > 	for __builtin_vec_xst_trunc builtin.
> > ---
> >  gcc/config/rs6000/rs6000-builtins.def         |   7 +-
> >  gcc/config/rs6000/rs6000-overload.def         |   4 +
> >  gcc/doc/extend.texi                           |   2 +
> >  .../powerpc/vsx-builtin-vec_xst_trunc.c       | 217
> > ++++++++++++++++++
> >  4 files changed, 228 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-builtin-
> > vec_xst_trunc.c
> > 
> > diff --git a/gcc/config/rs6000/rs6000-builtins.def
> > b/gcc/config/rs6000/rs6000-builtins.def
> > index 638d0bc72ca..a378491b358 100644
> > --- a/gcc/config/rs6000/rs6000-builtins.def
> > +++ b/gcc/config/rs6000/rs6000-builtins.def
> > @@ -3161,12 +3161,15 @@
> >    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char
> > *);
> >      TR_STXVRBX vsx_stxvrbx {stvec}
> >  
> > -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int
> > *);
> > +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
> > short *);
> >      TR_STXVRHX vsx_stxvrhx {stvec}
> >  
> > -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
> > short *);
> > +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int
> > *);
> >      TR_STXVRWX vsx_stxvrwx {stvec}
> 
> Good catching!
> 
> >  
> > +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long
> > *);
> > +    TR_STXVRLX vsx_stxvrdx {stvec}
> > +
> 
> This is mapped to the one used for type long long, it's a hard
> mapping,
> IMHO it's wrong and not consistent with what the users expect, since
> on Power
> the size of type long int is 4 bytes at -m32 while 8 bytes at -m64,
> this
> implementation binding to 8 bytes can cause trouble in 32-bit.  I
> wonder if
> it's a good idea to add one overloaded version for type long int, for
> now
> openxl also emits error message for long int type pointer (see its
> doc [1]),
> users can use casting to make it to the acceptable pointer types
> (long long
> or int as its size).
> 
> [1] 
> https://www.ibm.com/docs/en/openxl-c-and-cpp-lop/17.1.1?topic=functions-vec-xst-trunc
> 
> 

If I understand this correctly, the "signed long" is mapped to type
"long long int"?  Just curious, where is the mapping done?

So I believe you would like to have an additional overloaded
definition:

 void __builtin_vec_xst_trunc (vuq, signed long long, long *);

Am I understanding this correctly?  I added the above definition.

> >    void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long
> > long *);
> >      TR_STXVRDX vsx_stxvrdx {stvec}
> >  
> > diff --git a/gcc/config/rs6000/rs6000-overload.def
> > b/gcc/config/rs6000/rs6000-overload.def
> > index c582490c084..54b7ae5e51b 100644
> > --- a/gcc/config/rs6000/rs6000-overload.def
> > +++ b/gcc/config/rs6000/rs6000-overload.def
> > @@ -4872,6 +4872,10 @@
> >      TR_STXVRWX  TR_STXVRWX_S
> >    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
> > int *);
> >      TR_STXVRWX  TR_STXVRWX_U
> > +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long
> > *);
> > +    TR_STXVRLX  TR_STXVRLX_S
> > +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
> > long *);
> > +    TR_STXVRLX  TR_STXVRLX_U
> >    void __builtin_vec_xst_trunc (vsq, signed long long, signed long
> > long *);
> >      TR_STXVRDX  TR_STXVRDX_S
> >    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
> > long long *);
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index e426a2eb7d8..7e2ae790ab3 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -18570,10 +18570,12 @@ instructions.
> >  @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed char *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed short *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed int *)}
> > +@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed long *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed long long *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned char *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned short *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned int *)}
> > +@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned long *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned long long *)}
> >  
> >  Truncate and store the rightmost element of a vector, as if
> > implemented by the
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-
> > vec_xst_trunc.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-
> > vec_xst_trunc.c
> > new file mode 100644
> > index 00000000000..7108109560d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> > @@ -0,0 +1,217 @@
> > +/* Test of __builtin_vec_xst_trunc  */
> > +
> > +/* { dg-do run { target power10_hw } } */
> > +/* { dg-require-effective-target power10_ok } */
> 
> Normally _hw is stricter than _ok, you already check power10_hw,
> I think power10_ok is useless here.

OK, removed the  dg-require-effective-target power10_ok line.

> 
> > +/* { dg-require-effective-target int128 } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> > +
> > +#include <altivec.h>
> > +#include <stdio.h>
> > +#include <inttypes.h>
> > +#include <string.h>
> > +#include <stdlib.h>
> > +
> > +#define DEBUG 0
> > +#define TRUE  1
> > +#define FALSE 0
> > +#define SIZE  4
> > +
> > +vector signed __int128 zero_vsint128 = {0x0};
> > +
> > +vector signed __int128 store_data[SIZE] = {
> > +{ (__int128) 0x79BD000000000000 << 64 | (__int128)
> > 0x123456789abcdef8ULL},
> > +{ (__int128) 0x8ACE000000000000 << 64 | (__int128)
> > 0xfedcba9876543217ULL},
> > +{ (__int128) 0x1357000000000000 << 64 | (__int128)
> > 0xccccccccccccccccULL},
> > +{ (__int128) 0xf000000000000000 << 64 | (__int128)
> > 0xaaaaaaaaaaaaaaaaULL}
> > +};
> > +
> > +signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC,
> > 0xAA};
> > +signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217,
> > 0xcccc, 0xaaaa, };
> > +signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217,
> > 0xCCCCCCCC,
> > +					0xAAAAAAAA};
> > +signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8,
> > +							0xFEDCBA9876543
> > 217ULL,
> > +							0xCCCCCCCCCCCCC
> > CCCULL,
> > +							0xAAAAAAAAAAAAA
> > AAAULL};
> > +signed long long int signed_long_long_expected[SIZE] =
> > {0x123456789ABCDEF8ULL,
> > +							0xFEDCBA9876543
> > 217ULL,
> > +							0xCCCCCCCCCCCCC
> > CCCULL,
> > +							0xAAAAAAAAAAAAA
> > AAAULL};
> > +
> > +union conv_t {
> > +  vector signed __int128 vsi128;
> > +  unsigned long long ull[2];
> > +  signed char schar[16];
> > +  signed __int128 s128;
> > +} conv;
> > +
> > +int check_expected_byte (signed char expected,
> > +			 signed char actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected half values don't match. \n");
> > +      printf ("  Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n",
> > +	      expected & 0xFF, actual & 0xFF);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_half (signed short int expected,
> > +			 signed short int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected short values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +	      expected & 0xFFFF, actual & 0xFFFF);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_int (signed int expected,
> > +			signed int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected int values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +	      expected, actual);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_long (signed long int expected,
> > +			 signed long int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected long values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +	      expected, actual);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_long_long (signed long long int expected,
> > +			      signed long long int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected long long values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +	      expected, actual);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +void print_store_data (vector signed __int128 *store_data, int
> > size)
> > +{
> > +#if DEBUG
> > +  union conv_t val;
> > +  int i;
> > +  
> > +  for  (i = 0; i < size; i++)
> > +    {
> > +      val.vsi128 = store_data[i];
> > +      printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1],
> > val.ull[0]);
> > +    }
> > +#endif
> > +}
> > +
> > +
> > +void print_raw_buffer (vector signed __int128 *rawbuffer, int
> > size)
> > +{
> > +#if DEBUG
> > +  union conv_t val;
> > +  int i;
> > +  
> > +  for  (i = 0; i < size; i++)
> > +    {
> > +      val.vsi128 = rawbuffer[i];
> > +      printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1],
> > val.ull[0]);
> > +    }
> > +#endif
> > +}
> > +
> > +int
> > +main () {
> > +  int i;
> > +  
> > +  vector signed __int128 rawbuffer[SIZE];
> > +  signed char * vsbuffer_char = (signed char *)rawbuffer;
> > +  signed short int * vsbuffer_short = (signed short int
> > *)rawbuffer;
> > +  signed int * vsbuffer_int = (signed int *)rawbuffer;
> > +  signed long int * vsbuffer_long = (signed long *)rawbuffer;
> > +  signed long long int * vsbuffer_long_long = (signed long long
> > *)rawbuffer;
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    rawbuffer[i] = zero_vsint128;
> > +
> > +  print_store_data (store_data, SIZE);
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(char),
> > +			       vsbuffer_char);
> > +      check_expected_byte (signed_char_expected[i],
> > vsbuffer_char[i]);
> > +    }
> > +  
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int),
> > +			       vsbuffer_short);
> > +      check_expected_half (signed_short_expected[i],
> > vsbuffer_short[i]);
> > +    }
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(int),
> > +			       vsbuffer_int);
> > +      check_expected_int (signed_int_expected[i],
> > vsbuffer_int[i]);
> > +    }
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int),
> > +      			       vsbuffer_long);
> > +      check_expected_long (signed_long_long_expected[i],
> > +			   vsbuffer_long[i]);
> > +    }
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long
> > int),
> > +			       vsbuffer_long_long);
> > +      check_expected_long_long (signed_long_long_expected[i],
> > +				vsbuffer_long_long[i]);
> > +    }
> > +
> > +  print_raw_buffer (rawbuffer, SIZE);
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-assembler {\mstxvrbx\M} } } */
> > +/* { dg-final { scan-assembler {\mstxvrhx\M} } } */
> > +/* { dg-final { scan-assembler {\mstxvrwx\M} } } */
> > +/* { dg-final { scan-assembler {\mstxvrdx\M} } } */
> 
> Could you check the exact times instead?

OK, added check for the instruction counts.
> 
> BR,
> Kewen
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH ver 2] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-22  9:04 ` Kewen.Lin
  2023-05-22 19:50   ` Carl Love
@ 2023-05-22 19:50   ` Carl Love
  2023-05-31 17:59   ` [PATCH] " Peter Bergner
  2 siblings, 0 replies; 12+ messages in thread
From: Carl Love @ 2023-05-22 19:50 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches, cel

Kewen, GCC maintainers:

Version 2, addressed comments from Kewen. Added an additional
overloaded builtin:  
   void __builtin_vec_xst_trunc (vuq, signed long long, long *);


The following patch fixes errors in the arguments in the
__builtin_altivec_tr_stxvrhx,   __builtin_altivec_tr_stxvrwx builtin
definitions.  Note, these builtins are used by the overloaded
__builtin_vec_xst_trunc builtin.

The patch adds a new overloaded builtin definition for
__builtin_vec_xst_trunc for the third argument to be unsigned and
signed long int.

A new testcase is added for the various overloaded versions of
__builtin_vec_xst_trunc.

The patch has been tested on Power 10 with no new regressions.

Please let me know if the patch is acceptable for mainline.  Thanks.

                    Carl

-----------------------------------------------------------------
rs6000: Fix __builtin_vec_xst_trunc definition

Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
and __builtin_altivec_tr_stxvrwx to handle the short and word cases.  The
arguments for these two builtins are wrong.  This patch fixes the wrong
arguments for the builtins.

Additionally, the patch adds a new __builtin_vec_xst_trunc overloaded
version for the destination being signed or unsigned long int.

A runnable test case is added to test each of the overloaded definitions
of __builtin_vec_xst_tru

gcc/
	* config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
	__builtin_altivec_tr_stxvrwx): Fix type of second argument.
	Add, definition for send argument to be signed long.
	* config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
	add definition with thrird arument signed and unsigned long.
	* doc/extend.texi (__builtin_vec_xst_trunc): Add documentation for
	new unsinged long and signed long versions.

gcc/testsuite/
	* gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
	for __builtin_vec_xst_trunc builtin.
---
 gcc/config/rs6000/rs6000-builtins.def         |   7 +-
 gcc/config/rs6000/rs6000-overload.def         |   6 +
 gcc/doc/extend.texi                           |   2 +
 .../powerpc/vsx-builtin-vec_xst_trunc.c       | 241 ++++++++++++++++++
 4 files changed, 254 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 638d0bc72ca..a378491b358 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3161,12 +3161,15 @@
   void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
     TR_STXVRBX vsx_stxvrbx {stvec}
 
-  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
+  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
     TR_STXVRHX vsx_stxvrhx {stvec}
 
-  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
+  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
     TR_STXVRWX vsx_stxvrwx {stvec}
 
+  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long *);
+    TR_STXVRLX vsx_stxvrdx {stvec}
+
   void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
     TR_STXVRDX vsx_stxvrdx {stvec}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index c582490c084..fd47f5b24e8 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4872,6 +4872,12 @@
     TR_STXVRWX  TR_STXVRWX_S
   void __builtin_vec_xst_trunc (vuq, signed long long, unsigned int *);
     TR_STXVRWX  TR_STXVRWX_U
+  void __builtin_vec_xst_trunc (vsq, signed long long, signed long *);
+    TR_STXVRLX  TR_STXVRLX_S
+  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long *);
+    TR_STXVRLX  TR_STXVRLX_U
+  void __builtin_vec_xst_trunc (vuq, signed long long, long *);
+    TR_STXVRLX  TR_STXVRLX_I
   void __builtin_vec_xst_trunc (vsq, signed long long, signed long long *);
     TR_STXVRDX  TR_STXVRDX_S
   void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long long *);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e426a2eb7d8..7e2ae790ab3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18570,10 +18570,12 @@ instructions.
 @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed long long, signed char *)}
 @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed short *)}
 @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed int *)}
+@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed long *)}
 @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, signed long long *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned char *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned short *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned int *)}
+@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned long *)}
 @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long long, unsigned long long *)}
 
 Truncate and store the rightmost element of a vector, as if implemented by the
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
new file mode 100644
index 00000000000..99bd0dbd61b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
@@ -0,0 +1,241 @@
+/* Test of __builtin_vec_xst_trunc  */
+
+/* { dg-do run { target power10_hw } } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+
+#include <altivec.h>
+#include <stdio.h>
+#include <inttypes.h>
+#include <string.h>
+#include <stdlib.h>
+
+#define DEBUG 0
+#define TRUE  1
+#define FALSE 0
+#define SIZE  4
+
+vector signed __int128 zero_vsint128 = {0x0};
+
+vector signed __int128 store_data[SIZE] = {
+{ (__int128) 0x79BD000000000000 << 64 | (__int128) 0x123456789abcdef8ULL},
+{ (__int128) 0x8ACE000000000000 << 64 | (__int128) 0xfedcba9876543217ULL},
+{ (__int128) 0x1357000000000000 << 64 | (__int128) 0xccccccccccccccccULL},
+{ (__int128) 0xf000000000000000 << 64 | (__int128) 0xaaaaaaaaaaaaaaaaULL}
+};
+
+signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC, 0xAA};
+signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217, 0xcccc, 0xaaaa, };
+signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217, 0xCCCCCCCC,
+					0xAAAAAAAA};
+signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8,
+							0xFEDCBA9876543217ULL,
+							0xCCCCCCCCCCCCCCCCULL,
+							0xAAAAAAAAAAAAAAAAULL};
+signed long long int signed_long_long_expected[SIZE] = {0x123456789ABCDEF8ULL,
+							0xFEDCBA9876543217ULL,
+							0xCCCCCCCCCCCCCCCCULL,
+							0xAAAAAAAAAAAAAAAAULL};
+
+union conv_t {
+  vector signed __int128 vsi128;
+  unsigned long long ull[2];
+  signed char schar[16];
+  signed __int128 s128;
+} conv;
+
+int check_expected_byte (signed char expected,
+			 signed char actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected half values don't match. \n");
+      printf ("  Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n",
+	      expected & 0xFF, actual & 0xFF);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_half (signed short int expected,
+			 signed short int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected short values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected & 0xFFFF, actual & 0xFFFF);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_int (signed int expected,
+			signed int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected int values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected, actual);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_long (signed long int expected,
+			 signed long int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected long values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected, actual);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_long_long (signed long long int expected,
+			      signed long long int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected long long values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected, actual);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+int check_expected_long_int (signed long long int expected,
+			     long int  actual)
+{
+  /* Return TRUE if expected and actual values all match.  */
+  if (expected != actual)
+    {
+#if DEBUG
+      printf ("ERROR: Expected long int values don't match. \n");
+      printf ("  Expected 0x%x, actual 0x%x\n",
+	      expected, actual);
+#endif
+      return FALSE;
+    }
+  return TRUE;
+}
+
+void print_store_data (vector signed __int128 *store_data, int size)
+{
+#if DEBUG
+  union conv_t val;
+  int i;
+  
+  for  (i = 0; i < size; i++)
+    {
+      val.vsi128 = store_data[i];
+      printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
+    }
+#endif
+}
+
+
+void print_raw_buffer (vector signed __int128 *rawbuffer, int size)
+{
+#if DEBUG
+  union conv_t val;
+  int i;
+  
+  for  (i = 0; i < size; i++)
+    {
+      val.vsi128 = rawbuffer[i];
+      printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
+    }
+#endif
+}
+
+int
+main () {
+  int i;
+  
+  vector signed __int128 rawbuffer[SIZE];
+  signed char * vsbuffer_char = (signed char *)rawbuffer;
+  signed short int * vsbuffer_short = (signed short int *)rawbuffer;
+  signed int * vsbuffer_int = (signed int *)rawbuffer;
+  signed long int * vsbuffer_long = (signed long *)rawbuffer;
+  signed long long int * vsbuffer_long_long = (signed long long *)rawbuffer;
+  long int * vsbuffer_long_int = (long int*)rawbuffer;
+
+  for  (i = 0; i < SIZE; i++)
+    rawbuffer[i] = zero_vsint128;
+
+  print_store_data (store_data, SIZE);
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(char),
+			       vsbuffer_char);
+      check_expected_byte (signed_char_expected[i], vsbuffer_char[i]);
+    }
+  
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int),
+			       vsbuffer_short);
+      check_expected_half (signed_short_expected[i], vsbuffer_short[i]);
+    }
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(int),
+			       vsbuffer_int);
+      check_expected_int (signed_int_expected[i], vsbuffer_int[i]);
+    }
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int),
+      			       vsbuffer_long);
+      check_expected_long (signed_long_long_expected[i],
+			   vsbuffer_long[i]);
+    }
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long int),
+			       vsbuffer_long_long);
+      check_expected_long_long (signed_long_long_expected[i],
+				vsbuffer_long_long[i]);
+    }
+
+  for  (i = 0; i < SIZE; i++)
+    {
+      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long int),
+			       vsbuffer_long_long);
+      check_expected_long_int (signed_long_long_expected[i],
+			       vsbuffer_long_int[i]);
+    }
+
+  print_raw_buffer (rawbuffer, SIZE);
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {\mstxvrbx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvrhx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvrwx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvrdx\M} 3 } } */
-- 
2.37.2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-22 19:50   ` Carl Love
@ 2023-05-23  3:36     ` Kewen.Lin
  0 siblings, 0 replies; 12+ messages in thread
From: Kewen.Lin @ 2023-05-23  3:36 UTC (permalink / raw)
  To: Carl Love; +Cc: Peter Bergner, Segher Boessenkool, gcc-patches

on 2023/5/23 03:50, Carl Love wrote:
> On Mon, 2023-05-22 at 17:04 +0800, Kewen.Lin wrote:
>> Hi Carl,
>>
>> on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
>>> GCC maintainers:
>>>
>>> The following patch fixes errors in the arguments in the
>>> __builtin_altivec_tr_stxvrhx,	__builtin_altivec_tr_stxvrwx
>>> builtin
>>> definitions.  Note, these builtins are used by the overloaded
>>> __builtin_vec_xst_trunc builtin.
>>>
>>> The patch adds a new overloaded builtin definition for
>>> __builtin_vec_xst_trunc for the third argument to be unsigned and
>>> signed long int.
>>>
>>> A new testcase is added for the various overloaded versions of
>>> __builtin_vec_xst_trunc.
>>>
>>> The patch has been tested on Power 10 with no new regressions.
>>>
>>> Please let me know if the patch is acceptable for
>>> mainline.  Thanks.
>>>
>>>                     Carl
>>>
>>> -------------------------------------------
>>> rs6000: Fix __builtin_vec_xst_trunc definition
>>>
>>> Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
>>> and __builtin_altivec_tr_stxvrwx to handle the short and word
>>> cases.  The
>>> arguments for these two builtins are wrong.  This patch fixes the
>>> wrong
>>> arguments for the builtins.
>>>
>>> Additionally, the patch adds a new __builtin_vec_xst_trunc
>>> overloaded
>>> version for the destination being signed or unsigned long int.
>>>
>>> A runnable test case is added to test each of the overloaded
>>> definitions
>>> of __builtin_vec_xst_tru
>>>
>>> gcc/
>>> 	* config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
>>> 	__builtin_altivec_tr_stxvrwx): Fix type of second argument.
>>> 	Add, definition for send argument to be signed long.
>>> 	* config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
>>> 	add definition with thrird arument signed and unsigned long.
>>> 	* doc/extend.texi (__builtin_vec_xst_trunc): Add documentation
>>> for
>>> 	new unsinged long and signed long versions.
>>>
>>> gcc/testsuite/
>>> 	* gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
>>> 	for __builtin_vec_xst_trunc builtin.
>>> ---
>>>  gcc/config/rs6000/rs6000-builtins.def         |   7 +-
>>>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>>>  gcc/doc/extend.texi                           |   2 +
>>>  .../powerpc/vsx-builtin-vec_xst_trunc.c       | 217
>>> ++++++++++++++++++
>>>  4 files changed, 228 insertions(+), 2 deletions(-)
>>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-builtin-
>>> vec_xst_trunc.c
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def
>>> b/gcc/config/rs6000/rs6000-builtins.def
>>> index 638d0bc72ca..a378491b358 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -3161,12 +3161,15 @@
>>>    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char
>>> *);
>>>      TR_STXVRBX vsx_stxvrbx {stvec}
>>>  
>>> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int
>>> *);
>>> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
>>> short *);
>>>      TR_STXVRHX vsx_stxvrhx {stvec}
>>>  
>>> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
>>> short *);
>>> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int
>>> *);
>>>      TR_STXVRWX vsx_stxvrwx {stvec}
>>
>> Good catching!
>>
>>>  
>>> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long
>>> *);
>>> +    TR_STXVRLX vsx_stxvrdx {stvec}
>>> +
>>
>> This is mapped to the one used for type long long, it's a hard
>> mapping,
>> IMHO it's wrong and not consistent with what the users expect, since
>> on Power
>> the size of type long int is 4 bytes at -m32 while 8 bytes at -m64,
>> this
>> implementation binding to 8 bytes can cause trouble in 32-bit.  I
>> wonder if
>> it's a good idea to add one overloaded version for type long int, for
>> now
>> openxl also emits error message for long int type pointer (see its
>> doc [1]),
>> users can use casting to make it to the acceptable pointer types
>> (long long
>> or int as its size).
>>
>> [1] 
>> https://www.ibm.com/docs/en/openxl-c-and-cpp-lop/17.1.1?topic=functions-vec-xst-trunc
>>
>>
> 
> If I understand this correctly, the "signed long" is mapped to type
> "long long int"?  Just curious, where is the mapping done?

Sorry for the confusion, the mapping here is for your implementation,

>>> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long
>>> *);
>>> +    TR_STXVRLX vsx_stxvrdx {stvec}

you used the one **vsx_stxvrdx** which is for "signed long long int"
(doubleword, exactly 8 bytes size element), so I said you have a hard
mapping, takes "signed long int" as "signed long long int".

But again type signed long int isn't guaranteed to be with the same size
as type signed long long int, on Power it has the same size as int (4 bytes)
on 32-bit env, while has the same size as long long (8 bytes) on 64-bit
env, since your test case is guarded with power10, you didn't get a chance
to run it on 32-bit env yet, that's why you didn't spot any failures.

> 
> So I believe you would like to have an additional overloaded
> definition:
> 
>  void __builtin_vec_xst_trunc (vuq, signed long long, long *);
> 
> Am I understanding this correctly?  I added the above definition.
> 

No, I questioned that if it's really a good idea to add one overloaded
for signed long, openxl also only supports int and long long types and
emits error message for long type, users can just do the pointer type
casting when they have signed long type pointer.  Do we have any
requirements for this long type, could we just keep supporting int and
long long, but not long (not adding this new overloaded definition)?

BR,
Kewen

>>>    void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long
>>> long *);
>>>      TR_STXVRDX vsx_stxvrdx {stvec}
>>>  
>>> diff --git a/gcc/config/rs6000/rs6000-overload.def
>>> b/gcc/config/rs6000/rs6000-overload.def
>>> index c582490c084..54b7ae5e51b 100644
>>> --- a/gcc/config/rs6000/rs6000-overload.def
>>> +++ b/gcc/config/rs6000/rs6000-overload.def
>>> @@ -4872,6 +4872,10 @@
>>>      TR_STXVRWX  TR_STXVRWX_S
>>>    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
>>> int *);
>>>      TR_STXVRWX  TR_STXVRWX_U
>>> +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long
>>> *);
>>> +    TR_STXVRLX  TR_STXVRLX_S
>>> +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
>>> long *);
>>> +    TR_STXVRLX  TR_STXVRLX_U
>>>    void __builtin_vec_xst_trunc (vsq, signed long long, signed long
>>> long *);
>>>      TR_STXVRDX  TR_STXVRDX_S
>>>    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
>>> long long *);
>>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>>> index e426a2eb7d8..7e2ae790ab3 100644
>>> --- a/gcc/doc/extend.texi
>>> +++ b/gcc/doc/extend.texi
>>> @@ -18570,10 +18570,12 @@ instructions.
>>>  @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed
>>> long long, signed char *)}
>>>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
>>> long long, signed short *)}
>>>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
>>> long long, signed int *)}
>>> +@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
>>> long long, signed long *)}
>>>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
>>> long long, signed long long *)}
>>>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
>>> signed long long, unsigned char *)}
>>>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
>>> signed long long, unsigned short *)}
>>>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
>>> signed long long, unsigned int *)}
>>> +@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
>>> signed long long, unsigned long *)}
>>>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
>>> signed long long, unsigned long long *)}
>>>  
>>>  Truncate and store the rightmost element of a vector, as if
>>> implemented by the
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-
>>> vec_xst_trunc.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-
>>> vec_xst_trunc.c
>>> new file mode 100644
>>> index 00000000000..7108109560d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
>>> @@ -0,0 +1,217 @@
>>> +/* Test of __builtin_vec_xst_trunc  */
>>> +
>>> +/* { dg-do run { target power10_hw } } */
>>> +/* { dg-require-effective-target power10_ok } */
>>
>> Normally _hw is stricter than _ok, you already check power10_hw,
>> I think power10_ok is useless here.
> 
> OK, removed the  dg-require-effective-target power10_ok line.
> 
>>
>>> +/* { dg-require-effective-target int128 } */
>>> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
>>> +
>>> +#include <altivec.h>
>>> +#include <stdio.h>
>>> +#include <inttypes.h>
>>> +#include <string.h>
>>> +#include <stdlib.h>
>>> +
>>> +#define DEBUG 0
>>> +#define TRUE  1
>>> +#define FALSE 0
>>> +#define SIZE  4
>>> +
>>> +vector signed __int128 zero_vsint128 = {0x0};
>>> +
>>> +vector signed __int128 store_data[SIZE] = {
>>> +{ (__int128) 0x79BD000000000000 << 64 | (__int128)
>>> 0x123456789abcdef8ULL},
>>> +{ (__int128) 0x8ACE000000000000 << 64 | (__int128)
>>> 0xfedcba9876543217ULL},
>>> +{ (__int128) 0x1357000000000000 << 64 | (__int128)
>>> 0xccccccccccccccccULL},
>>> +{ (__int128) 0xf000000000000000 << 64 | (__int128)
>>> 0xaaaaaaaaaaaaaaaaULL}
>>> +};
>>> +
>>> +signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC,
>>> 0xAA};
>>> +signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217,
>>> 0xcccc, 0xaaaa, };
>>> +signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217,
>>> 0xCCCCCCCC,
>>> +					0xAAAAAAAA};
>>> +signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8,
>>> +							0xFEDCBA9876543
>>> 217ULL,
>>> +							0xCCCCCCCCCCCCC
>>> CCCULL,
>>> +							0xAAAAAAAAAAAAA
>>> AAAULL};
>>> +signed long long int signed_long_long_expected[SIZE] =
>>> {0x123456789ABCDEF8ULL,
>>> +							0xFEDCBA9876543
>>> 217ULL,
>>> +							0xCCCCCCCCCCCCC
>>> CCCULL,
>>> +							0xAAAAAAAAAAAAA
>>> AAAULL};
>>> +
>>> +union conv_t {
>>> +  vector signed __int128 vsi128;
>>> +  unsigned long long ull[2];
>>> +  signed char schar[16];
>>> +  signed __int128 s128;
>>> +} conv;
>>> +
>>> +int check_expected_byte (signed char expected,
>>> +			 signed char actual)
>>> +{
>>> +  /* Return TRUE if expected and actual values all match.  */
>>> +  if (expected != actual)
>>> +    {
>>> +#if DEBUG
>>> +      printf ("ERROR: Expected half values don't match. \n");
>>> +      printf ("  Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n",
>>> +	      expected & 0xFF, actual & 0xFF);
>>> +#endif
>>> +      return FALSE;
>>> +    }
>>> +  return TRUE;
>>> +}
>>> +
>>> +int check_expected_half (signed short int expected,
>>> +			 signed short int  actual)
>>> +{
>>> +  /* Return TRUE if expected and actual values all match.  */
>>> +  if (expected != actual)
>>> +    {
>>> +#if DEBUG
>>> +      printf ("ERROR: Expected short values don't match. \n");
>>> +      printf ("  Expected 0x%x, actual 0x%x\n",
>>> +	      expected & 0xFFFF, actual & 0xFFFF);
>>> +#endif
>>> +      return FALSE;
>>> +    }
>>> +  return TRUE;
>>> +}
>>> +
>>> +int check_expected_int (signed int expected,
>>> +			signed int  actual)
>>> +{
>>> +  /* Return TRUE if expected and actual values all match.  */
>>> +  if (expected != actual)
>>> +    {
>>> +#if DEBUG
>>> +      printf ("ERROR: Expected int values don't match. \n");
>>> +      printf ("  Expected 0x%x, actual 0x%x\n",
>>> +	      expected, actual);
>>> +#endif
>>> +      return FALSE;
>>> +    }
>>> +  return TRUE;
>>> +}
>>> +
>>> +int check_expected_long (signed long int expected,
>>> +			 signed long int  actual)
>>> +{
>>> +  /* Return TRUE if expected and actual values all match.  */
>>> +  if (expected != actual)
>>> +    {
>>> +#if DEBUG
>>> +      printf ("ERROR: Expected long values don't match. \n");
>>> +      printf ("  Expected 0x%x, actual 0x%x\n",
>>> +	      expected, actual);
>>> +#endif
>>> +      return FALSE;
>>> +    }
>>> +  return TRUE;
>>> +}
>>> +
>>> +int check_expected_long_long (signed long long int expected,
>>> +			      signed long long int  actual)
>>> +{
>>> +  /* Return TRUE if expected and actual values all match.  */
>>> +  if (expected != actual)
>>> +    {
>>> +#if DEBUG
>>> +      printf ("ERROR: Expected long long values don't match. \n");
>>> +      printf ("  Expected 0x%x, actual 0x%x\n",
>>> +	      expected, actual);
>>> +#endif
>>> +      return FALSE;
>>> +    }
>>> +  return TRUE;
>>> +}
>>> +
>>> +void print_store_data (vector signed __int128 *store_data, int
>>> size)
>>> +{
>>> +#if DEBUG
>>> +  union conv_t val;
>>> +  int i;
>>> +  
>>> +  for  (i = 0; i < size; i++)
>>> +    {
>>> +      val.vsi128 = store_data[i];
>>> +      printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1],
>>> val.ull[0]);
>>> +    }
>>> +#endif
>>> +}
>>> +
>>> +
>>> +void print_raw_buffer (vector signed __int128 *rawbuffer, int
>>> size)
>>> +{
>>> +#if DEBUG
>>> +  union conv_t val;
>>> +  int i;
>>> +  
>>> +  for  (i = 0; i < size; i++)
>>> +    {
>>> +      val.vsi128 = rawbuffer[i];
>>> +      printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1],
>>> val.ull[0]);
>>> +    }
>>> +#endif
>>> +}
>>> +
>>> +int
>>> +main () {
>>> +  int i;
>>> +  
>>> +  vector signed __int128 rawbuffer[SIZE];
>>> +  signed char * vsbuffer_char = (signed char *)rawbuffer;
>>> +  signed short int * vsbuffer_short = (signed short int
>>> *)rawbuffer;
>>> +  signed int * vsbuffer_int = (signed int *)rawbuffer;
>>> +  signed long int * vsbuffer_long = (signed long *)rawbuffer;
>>> +  signed long long int * vsbuffer_long_long = (signed long long
>>> *)rawbuffer;
>>> +
>>> +  for  (i = 0; i < SIZE; i++)
>>> +    rawbuffer[i] = zero_vsint128;
>>> +
>>> +  print_store_data (store_data, SIZE);
>>> +
>>> +  for  (i = 0; i < SIZE; i++)
>>> +    {
>>> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(char),
>>> +			       vsbuffer_char);
>>> +      check_expected_byte (signed_char_expected[i],
>>> vsbuffer_char[i]);
>>> +    }
>>> +  
>>> +  for  (i = 0; i < SIZE; i++)
>>> +    {
>>> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int),
>>> +			       vsbuffer_short);
>>> +      check_expected_half (signed_short_expected[i],
>>> vsbuffer_short[i]);
>>> +    }
>>> +
>>> +  for  (i = 0; i < SIZE; i++)
>>> +    {
>>> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(int),
>>> +			       vsbuffer_int);
>>> +      check_expected_int (signed_int_expected[i],
>>> vsbuffer_int[i]);
>>> +    }
>>> +
>>> +  for  (i = 0; i < SIZE; i++)
>>> +    {
>>> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int),
>>> +      			       vsbuffer_long);
>>> +      check_expected_long (signed_long_long_expected[i],
>>> +			   vsbuffer_long[i]);
>>> +    }
>>> +
>>> +  for  (i = 0; i < SIZE; i++)
>>> +    {
>>> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long
>>> int),
>>> +			       vsbuffer_long_long);
>>> +      check_expected_long_long (signed_long_long_expected[i],
>>> +				vsbuffer_long_long[i]);
>>> +    }
>>> +
>>> +  print_raw_buffer (rawbuffer, SIZE);
>>> +  return 0;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler {\mstxvrbx\M} } } */
>>> +/* { dg-final { scan-assembler {\mstxvrhx\M} } } */
>>> +/* { dg-final { scan-assembler {\mstxvrwx\M} } } */
>>> +/* { dg-final { scan-assembler {\mstxvrdx\M} } } */
>>
>> Could you check the exact times instead?
> 
> OK, added check for the instruction counts.
>>
>> BR,
>> Kewen
>>
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-22  9:04 ` Kewen.Lin
  2023-05-22 19:50   ` Carl Love
  2023-05-22 19:50   ` [PATCH ver 2] " Carl Love
@ 2023-05-31 17:59   ` Peter Bergner
  2023-06-01 20:01     ` Carl Love
  2023-06-01 20:01     ` [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx Carl Love
  2 siblings, 2 replies; 12+ messages in thread
From: Peter Bergner @ 2023-05-31 17:59 UTC (permalink / raw)
  To: Kewen.Lin, Carl Love; +Cc: Segher Boessenkool, gcc-patches

On 5/22/23 4:04 AM, Kewen.Lin wrote:
> on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
>> @@ -3161,12 +3161,15 @@
>>    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
>>      TR_STXVRBX vsx_stxvrbx {stvec}
>>  
>> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
>> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>>      TR_STXVRHX vsx_stxvrhx {stvec}
>>  
>> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
>> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>>      TR_STXVRWX vsx_stxvrwx {stvec}
> 
> Good catching!

This hunk should be its own patch and commit, as it is independent of
the other change.  Especially since other built-ins also don't have
{,un}simgned long * as arguments, not just __builtin_altivec_tr_stxvr*x.




>> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long *);
>> +    TR_STXVRLX vsx_stxvrdx {stvec}
>> +
> 
> This is mapped to the one used for type long long, it's a hard mapping,
> IMHO it's wrong and not consistent with what the users expect, since on Power
> the size of type long int is 4 bytes at -m32 while 8 bytes at -m64, this
> implementation binding to 8 bytes can cause trouble in 32-bit.  I wonder if
> it's a good idea to add one overloaded version for type long int, for now
> openxl also emits error message for long int type pointer (see its doc [1]),
> users can use casting to make it to the acceptable pointer types (long long
> or int as its size).

I'm the person who noticed that we don't accept signed/unsigned long * as
an argument type and asked Carl to investigate.  I find it hard to believe
we accept all integer pointer types, except long *.  I agree that it shouldn't
always map to long long *, since as you say, that's wrong for -m32.
My hope was that we could somehow automagically handle the long * types
in the built-in machinery, mapping them to either the int * built-in or
the long long * built-in depending on -m32 or -m64.  Again, this limitation
is no limited to __builtin_altivec_tr_stx* built-ins, but others as well,
so I was kind of hoping for a general solution that would fix them all.
I'm not sure of that's possible though.

Peter



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-05-31 17:59   ` [PATCH] " Peter Bergner
@ 2023-06-01 20:01     ` Carl Love
  2023-06-02  3:22       ` Kewen.Lin
  2023-06-01 20:01     ` [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx Carl Love
  1 sibling, 1 reply; 12+ messages in thread
From: Carl Love @ 2023-06-01 20:01 UTC (permalink / raw)
  To: Peter Bergner, Kewen.Lin; +Cc: Segher Boessenkool, gcc-patches

On Wed, 2023-05-31 at 12:59 -0500, Peter Bergner wrote:
> On 5/22/23 4:04 AM, Kewen.Lin wrote:
> > on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
> > > @@ -3161,12 +3161,15 @@
> > >    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed
> > > char *);
> > >      TR_STXVRBX vsx_stxvrbx {stvec}
> > >  
> > > -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
> > > int *);
> > > +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
> > > short *);
> > >      TR_STXVRHX vsx_stxvrhx {stvec}
> > >  
> > > -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
> > > short *);
> > > +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
> > > int *);
> > >      TR_STXVRWX vsx_stxvrwx {stvec}
> > 
> > Good catching!
> 
> This hunk should be its own patch and commit, as it is independent of
> the other change.  Especially since other built-ins also don't have
> {,un}simgned long * as arguments, not just
> __builtin_altivec_tr_stxvr*x.

Yes, I was thinking the patch needs to be split into a bug fix and a
patch for the long * arguments.

I redid the patch to create the bug fix only.  The patch includes a
testcase that tests the __builtin_altivec_tr_stxvr* builtins.  I will
post the new patch.

The updated patch is now called:  " rs6000: Fix arguments for
__builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx"

> 
> 
> 
> > > +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed
> > > long *);
> > > +    TR_STXVRLX vsx_stxvrdx {stvec}
> > > +
> > 
> > This is mapped to the one used for type long long, it's a hard
> > mapping,
> > IMHO it's wrong and not consistent with what the users expect,
> > since on Power
> > the size of type long int is 4 bytes at -m32 while 8 bytes at -m64,
> > this
> > implementation binding to 8 bytes can cause trouble in 32-bit.  I
> > wonder if
> > it's a good idea to add one overloaded version for type long int,
> > for now
> > openxl also emits error message for long int type pointer (see its
> > doc [1]),
> > users can use casting to make it to the acceptable pointer types
> > (long long
> > or int as its size).
> 
> I'm the person who noticed that we don't accept signed/unsigned long
> * as
> an argument type and asked Carl to investigate.  I find it hard to
> believe
> we accept all integer pointer types, except long *.  I agree that it
> shouldn't
> always map to long long *, since as you say, that's wrong for -m32.
> My hope was that we could somehow automagically handle the long *
> types
> in the built-in machinery, mapping them to either the int * built-in
> or
> the long long * built-in depending on -m32 or -m64.  Again, this
> limitation
> is no limited to __builtin_altivec_tr_stx* built-ins, but others as
> well,
> so I was kind of hoping for a general solution that would fix them
> all.
> I'm not sure of that's possible though.

Per Peter's request, I added the overloaded version of the
__builtin_vec_xst_trunc builtin with the long * argument which Kewen
pushed back on.  So, that approach is not acceptable.  Not sure about
how to get the builtin infrastructure to automatically map long * to
int * or long long *?  If someone has some idea on how to do that, I
will gladly pursue it.  I will study the builtin support some more to
see if I can come up with any ideas as well.

             Carl


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx
  2023-05-31 17:59   ` [PATCH] " Peter Bergner
  2023-06-01 20:01     ` Carl Love
@ 2023-06-01 20:01     ` Carl Love
  2023-06-02  3:02       ` Kewen.Lin
  1 sibling, 1 reply; 12+ messages in thread
From: Carl Love @ 2023-06-01 20:01 UTC (permalink / raw)
  To: Peter Bergner, Kewen.Lin; +Cc: Segher Boessenkool, gcc-patches, cel

Kewen, Segher, Peter:

The following patch is a redo of the previous "rs6000: Fix
__builtin_vec_xst_trunc definition" patch.  

This patch fixes the argument in the two builtin definitions
__builtin_altivec_tr_stxvrwx and __builtin_altivec_tr_stxvrhx.  It also
adds with a testcase to validate the related builtins which have the
third argument of char *, short *, int * and long long *.

I have tested the patch on Power 10 with no regressions.

Please let me know if this patch is acceptable for mainline.

                      Carl 

------------------------------------------------------------
rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx

The third argument for __builtin_altivec_tr_stxvrhx should be short *
not int *.  Similarly, the third argument for __builtin_altivec_tr_stxvrwx
should be int * not short *.  This patch fixes the arguments in the two
builtins.

A runnable test case is added to test the __builtin_altivec_tr_stxvrbx,
__builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx and
__builtin_altivec_tr_stxvrdx builtins.

gcc/
	* config/rs6000/rs6000-builtins.def (__builtin_altivec_tr_stxvrhx,
	__builtin_altivec_tr_stxvrwx): Fix type of third argument.

gcc/testsuite/
	* gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c: New test
	for __builtin_altivec_tr_stxvrbx, __builtin_altivec_tr_stxvrhx,
	__builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrdx.
---
 gcc/config/rs6000/rs6000-builtins.def         |   4 +-
 .../builtin_altivec_tr_stxvr_runnable.c       | 107 ++++++++++++++++++
 2 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 638d0bc72ca..d7839f2e06b 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3161,10 +3161,10 @@
   void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
     TR_STXVRBX vsx_stxvrbx {stvec}
 
-  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
+  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
     TR_STXVRHX vsx_stxvrhx {stvec}
 
-  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
+  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
     TR_STXVRWX vsx_stxvrwx {stvec}
 
   void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
new file mode 100644
index 00000000000..46014d83535
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
@@ -0,0 +1,107 @@
+/* Test of __builtin_vec_xst_trunc  */
+
+/* { dg-do run { target power10_hw } } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+
+#include <altivec.h>
+#include <stdio.h>
+#include <inttypes.h>
+#include <string.h>
+#include <stdlib.h>
+
+#define DEBUG 0
+
+vector signed __int128 store_data =
+  {  (__int128) 0x8ACE000000000000 << 64 | (__int128) 0xfedcba9876543217ULL};
+
+union conv_t {
+  vector signed __int128 vsi128;
+  unsigned long long ull[2];
+} conv;
+
+void abort (void);
+
+
+int
+main () {
+  int i;
+  signed long sl;
+  signed char sc, expected_sc;
+  signed short ss, expected_ss;
+  signed int si, expected_si;
+  signed long long int sll, expected_sll;
+  signed char *psc;
+  signed short *pss;
+  signed int *psi;
+  signed long long int *psll;
+  
+#if DEBUG
+  val.vsi128 = store_data;
+   printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
+#endif
+
+  psc = &sc;
+  pss = &ss;
+  psi = &si;
+  psll = &sll;
+
+  sl = 1;
+  sc =0xA1;
+  expected_sc = 0xA1;
+  __builtin_altivec_tr_stxvrbx (store_data, sl, psc);
+
+  if (expected_sc != sc & 0xFF)
+#if DEBUG
+    printf(" ERROR: Signed char = 0x%x doesn't match expected value 0x%x\n",
+	   sc & 0xFF, expected_sc);
+#else
+    abort();
+#endif
+
+  sl = 1;
+  ss = 0x52;
+  expected_ss = 0x1752;
+  __builtin_altivec_tr_stxvrhx (store_data, sl, pss);
+
+  if (expected_ss != ss & 0xFFFF)
+#if DEBUG
+    printf(" ERROR: Signed short = 0x%x doesn't match expected value 0x%x\n",
+	   ss, expected_ss) & 0xFFFF;
+#else
+    abort();
+#endif
+
+  sl = 1;
+  si = 0x21;
+  expected_si = 0x54321721;
+   __builtin_altivec_tr_stxvrwx (store_data, sl, psi);
+
+   if (expected_si != si)
+#if DEBUG
+    printf(" ERROR: Signed int = 0x%x doesn't match expected value 0x%x\n",
+	   si, expected_si);
+#else
+    abort();
+#endif
+
+  sl = 1;
+  sll = 0x12FFULL;
+   expected_sll = 0xdcba9876543217FF;
+   __builtin_altivec_tr_stxvrdx (store_data, sl, psll);
+
+   if (expected_sll != sll)
+#if DEBUG
+    printf(" ERROR: Signed long long int = 0x%llx doesn't match expected value 0x%llx\n",
+	   sll, expected_sll);
+#else
+    abort();
+#endif
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {\mstxvrbx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvrhx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvrwx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvrdx\M} 1 } } */
-- 
2.37.2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx
  2023-06-01 20:01     ` [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx Carl Love
@ 2023-06-02  3:02       ` Kewen.Lin
  0 siblings, 0 replies; 12+ messages in thread
From: Kewen.Lin @ 2023-06-02  3:02 UTC (permalink / raw)
  To: Carl Love; +Cc: Segher Boessenkool, gcc-patches, Peter Bergner

Hi Carl,

on 2023/6/2 04:01, Carl Love wrote:
> Kewen, Segher, Peter:
> 
> The following patch is a redo of the previous "rs6000: Fix
> __builtin_vec_xst_trunc definition" patch.  
> 
> This patch fixes the argument in the two builtin definitions
> __builtin_altivec_tr_stxvrwx and __builtin_altivec_tr_stxvrhx.  It also
> adds with a testcase to validate the related builtins which have the
> third argument of char *, short *, int * and long long *.
> 
> I have tested the patch on Power 10 with no regressions.
> 
> Please let me know if this patch is acceptable for mainline.

Thanks for catching and fixing this.

OK for trunk with or without some nits below in test case fixed
(as it's just a test case :)).

> 
>                       Carl 
> 
> ------------------------------------------------------------
> rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx
> 
> The third argument for __builtin_altivec_tr_stxvrhx should be short *
> not int *.  Similarly, the third argument for __builtin_altivec_tr_stxvrwx
> should be int * not short *.  This patch fixes the arguments in the two
> builtins.
> 
> A runnable test case is added to test the __builtin_altivec_tr_stxvrbx,
> __builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx and
> __builtin_altivec_tr_stxvrdx builtins.
> 
> gcc/
> 	* config/rs6000/rs6000-builtins.def (__builtin_altivec_tr_stxvrhx,
> 	__builtin_altivec_tr_stxvrwx): Fix type of third argument.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c: New test
> 	for __builtin_altivec_tr_stxvrbx, __builtin_altivec_tr_stxvrhx,
> 	__builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrdx.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   4 +-
>  .../builtin_altivec_tr_stxvr_runnable.c       | 107 ++++++++++++++++++
>  2 files changed, 109 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..d7839f2e06b 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -3161,10 +3161,10 @@
>    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
>      TR_STXVRBX vsx_stxvrbx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>      TR_STXVRHX vsx_stxvrhx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>      TR_STXVRWX vsx_stxvrwx {stvec}
>  
>    void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
> diff --git a/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
> new file mode 100644
> index 00000000000..46014d83535
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
> @@ -0,0 +1,107 @@
> +/* Test of __builtin_vec_xst_trunc  */
> +
> +/* { dg-do run { target power10_hw } } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> +
> +#include <altivec.h>
> +#include <stdio.h>
> +#include <inttypes.h>
> +#include <string.h>
> +#include <stdlib.h>
> +
> +#define DEBUG 0
> +
> +vector signed __int128 store_data =
> +  {  (__int128) 0x8ACE000000000000 << 64 | (__int128) 0xfedcba9876543217ULL};
> +
> +union conv_t {
> +  vector signed __int128 vsi128;
> +  unsigned long long ull[2];
> +} conv;
> +
> +void abort (void);
> +
> +
> +int
> +main () {
> +  int i;
> +  signed long sl;
> +  signed char sc, expected_sc;
> +  signed short ss, expected_ss;
> +  signed int si, expected_si;
> +  signed long long int sll, expected_sll;
> +  signed char *psc;
> +  signed short *pss;
> +  signed int *psi;
> +  signed long long int *psll;
> +  
> +#if DEBUG
> +  val.vsi128 = store_data;
> +   printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);

odd indent.

> +#endif
> +
> +  psc = &sc;
> +  pss = &ss;
> +  psi = &si;
> +  psll = &sll;
> +
> +  sl = 1;
> +  sc =0xA1;

one more space after "=".

> +  expected_sc = 0xA1;
> +  __builtin_altivec_tr_stxvrbx (store_data, sl, psc);
> +
> +  if (expected_sc != sc & 0xFF)
> +#if DEBUG
> +    printf(" ERROR: Signed char = 0x%x doesn't match expected value 0x%x\n",
> +	   sc & 0xFF, expected_sc);
> +#else
> +    abort();
> +#endif
> +
> +  sl = 1;

redundant, and easy to cause misunderstanding that sl can get changed.

> +  ss = 0x52;
> +  expected_ss = 0x1752;
> +  __builtin_altivec_tr_stxvrhx (store_data, sl, pss);
> +
> +  if (expected_ss != ss & 0xFFFF)
> +#if DEBUG
> +    printf(" ERROR: Signed short = 0x%x doesn't match expected value 0x%x\n",
> +	   ss, expected_ss) & 0xFFFF;
> +#else
> +    abort();
> +#endif
> +
> +  sl = 1;

same as above.

> +  si = 0x21;
> +  expected_si = 0x54321721;
> +   __builtin_altivec_tr_stxvrwx (store_data, sl, psi);

odd indent.

> +
> +   if (expected_si != si)
> +#if DEBUG
> +    printf(" ERROR: Signed int = 0x%x doesn't match expected value 0x%x\n",
> +	   si, expected_si);
> +#else
> +    abort();
> +#endif
> +
> +  sl = 1;

same as above.

> +  sll = 0x12FFULL;
> +   expected_sll = 0xdcba9876543217FF;
> +   __builtin_altivec_tr_stxvrdx (store_data, sl, psll);

Ditto.

BR,
Kewen

> +
> +   if (expected_sll != sll)
> +#if DEBUG
> +    printf(" ERROR: Signed long long int = 0x%llx doesn't match expected value 0x%llx\n",
> +	   sll, expected_sll);
> +#else
> +    abort();
> +#endif
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mstxvrbx\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstxvrhx\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstxvrwx\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstxvrdx\M} 1 } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition
  2023-06-01 20:01     ` Carl Love
@ 2023-06-02  3:22       ` Kewen.Lin
  0 siblings, 0 replies; 12+ messages in thread
From: Kewen.Lin @ 2023-06-02  3:22 UTC (permalink / raw)
  To: Carl Love; +Cc: Segher Boessenkool, gcc-patches, Peter Bergner

on 2023/6/2 04:01, Carl Love wrote:
> On Wed, 2023-05-31 at 12:59 -0500, Peter Bergner wrote:
>> On 5/22/23 4:04 AM, Kewen.Lin wrote:
>>> on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
>>>> @@ -3161,12 +3161,15 @@
>>>>    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed
>>>> char *);
>>>>      TR_STXVRBX vsx_stxvrbx {stvec}
>>>>  
>>>> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
>>>> int *);
>>>> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
>>>> short *);
>>>>      TR_STXVRHX vsx_stxvrhx {stvec}
>>>>  
>>>> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
>>>> short *);
>>>> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
>>>> int *);
>>>>      TR_STXVRWX vsx_stxvrwx {stvec}
>>>
>>> Good catching!
>>
>> This hunk should be its own patch and commit, as it is independent of
>> the other change.  Especially since other built-ins also don't have
>> {,un}simgned long * as arguments, not just
>> __builtin_altivec_tr_stxvr*x.
> 
> Yes, I was thinking the patch needs to be split into a bug fix and a
> patch for the long * arguments.
> 
> I redid the patch to create the bug fix only.  The patch includes a
> testcase that tests the __builtin_altivec_tr_stxvr* builtins.  I will
> post the new patch.
> 
> The updated patch is now called:  " rs6000: Fix arguments for
> __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx"
> 
>>
>>
>>
>>>> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed
>>>> long *);
>>>> +    TR_STXVRLX vsx_stxvrdx {stvec}
>>>> +
>>>
>>> This is mapped to the one used for type long long, it's a hard
>>> mapping,
>>> IMHO it's wrong and not consistent with what the users expect,
>>> since on Power
>>> the size of type long int is 4 bytes at -m32 while 8 bytes at -m64,
>>> this
>>> implementation binding to 8 bytes can cause trouble in 32-bit.  I
>>> wonder if
>>> it's a good idea to add one overloaded version for type long int,
>>> for now
>>> openxl also emits error message for long int type pointer (see its
>>> doc [1]),
>>> users can use casting to make it to the acceptable pointer types
>>> (long long
>>> or int as its size).
>>
>> I'm the person who noticed that we don't accept signed/unsigned long
>> * as
>> an argument type and asked Carl to investigate.  I find it hard to
>> believe
>> we accept all integer pointer types, except long *.  I agree that it
>> shouldn't
>> always map to long long *, since as you say, that's wrong for -m32.
>> My hope was that we could somehow automagically handle the long *
>> types
>> in the built-in machinery, mapping them to either the int * built-in
>> or
>> the long long * built-in depending on -m32 or -m64.  Again, this
>> limitation
>> is no limited to __builtin_altivec_tr_stx* built-ins, but others as
>> well,
>> so I was kind of hoping for a general solution that would fix them
>> all.
>> I'm not sure of that's possible though.
> 
> Per Peter's request, I added the overloaded version of the
> __builtin_vec_xst_trunc builtin with the long * argument which Kewen
> pushed back on.  So, that approach is not acceptable.  Not sure about
> how to get the builtin infrastructure to automatically map long * to
> int * or long long *?  If someone has some idea on how to do that, I
> will gladly pursue it.  I will study the builtin support some more to
> see if I can come up with any ideas as well.
> 

Yeah, Peter gave a petty nice idea about a general solution, I think
it's doable.  One place is in funciton:

  rs6000_builtin_type_compatible (tree parmtype, tree argtype).

For POINTER_TYPE_P (parmtype) && POINTER_TYPE_P (argtype), now we just
check the TREE_TYPE of pointer type, you can do some check on TREE_TYPE
of parmtype with long/unsigned long, and do the corresponding conversion
to unsigned/signed {int, long long} further, then the given instance
can match it, it won't raise errors any more.

But note that not all built-in overloading go into this, since we have
some specific handlers (see resolve_vec_* like mul, cmpne...), you may
need some checks further like this.

Another place is in function altivec_resolve_overloaded_builtin, when
iterating the given fnargs to prepare args and types, we can do the 
check and conversion.  It's before that we start to route to different
handlers if they need to, it may be better at the first glance.

I noticed that some of built-ins already support {unsigned,} long *
overloaded, once this general solution works, I think we can revisit
and get rid of those ones to speed up the instance searching.

Besides, if we want to support it, we probably want to sync this with
openxl team to keep both compatible. :)

BR,
Kewen

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-06-02  3:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-10 18:06 [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition Carl Love
2023-05-18 21:28 ` Peter Bergner
2023-05-18 21:44   ` Carl Love
2023-05-22  9:04 ` Kewen.Lin
2023-05-22 19:50   ` Carl Love
2023-05-23  3:36     ` Kewen.Lin
2023-05-22 19:50   ` [PATCH ver 2] " Carl Love
2023-05-31 17:59   ` [PATCH] " Peter Bergner
2023-06-01 20:01     ` Carl Love
2023-06-02  3:22       ` Kewen.Lin
2023-06-01 20:01     ` [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx Carl Love
2023-06-02  3:02       ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).