* [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, variants
@ 2024-07-26 22:37 Carl Love
2024-07-29 10:21 ` Kewen.Lin
From: Carl Love @ 2024-07-26 22:37 UTC
To: GCC Patches, Kewen, Peter Bergner, segher, cel, David Edelsohn
GCC developers:
Version 2: updated rs6000-overload.def to stop adding additional
internal names and to change XXSLDWI_Q to XXSLDWI_1TI, per comments from
Kewen. Moved the new documentation statement for the PVIPR built-ins, per
comments from Kewen. Updated the dg-do run directive and added a comment
about -save-temps in the testcase, per feedback from Segher. Retested
the patch on Power 10 with no regressions.
The following patch adds the int128 variants to the existing overloaded
built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo, vec_srdb,
vec_srl, vec_sro. These variants were requested by Steve Munroe.
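For readers who want a quick mental model of the double-vector shifts,
here is a rough scalar sketch in plain C. It is not part of the patch and
is not how the built-ins are implemented. It treats each vector as a
single unsigned __int128 and concatenates the two inputs with the first
argument in the high half. The names u128, make_u128, model_* and
check_models are made up for this illustration, and masking the shift
count to the immediate's range is an assumption. The sketch agrees with
the expected values used in the new testcase; it does not cover vec_sll,
vec_srl, vec_slo or vec_sro.

```c
#include <assert.h>

typedef unsigned __int128 u128;

/* Build a 128-bit value from two 64-bit halves.  */
static u128
make_u128 (unsigned long long hi, unsigned long long lo)
{
  return ((u128) hi << 64) | lo;
}

/* Concatenate a:b (a is the high half), shift left by BITS (0..127)
   and keep the high 128 bits.  */
static u128
model_shift_left_double (u128 a, u128 b, unsigned int bits)
{
  if (bits == 0)
    return a;			/* A shift by 128 is undefined in C.  */
  return (a << bits) | (b >> (128 - bits));
}

/* Concatenate a:b (a is the high half), shift right by BITS (0..127)
   and keep the low 128 bits.  */
static u128
model_shift_right_double (u128 a, u128 b, unsigned int bits)
{
  if (bits == 0)
    return b;
  return (b >> bits) | (a << (128 - bits));
}

/* vec_sldb / vec_srdb: shift by 0..7 bits.  */
static u128
model_sldb (u128 a, u128 b, unsigned int n)
{
  return model_shift_left_double (a, b, n & 7);
}

static u128
model_srdb (u128 a, u128 b, unsigned int n)
{
  return model_shift_right_double (a, b, n & 7);
}

/* vec_sld: shift left by 0..15 bytes.  */
static u128
model_sld (u128 a, u128 b, unsigned int n)
{
  return model_shift_left_double (a, b, (n & 15) * 8);
}

/* vec_sldw: shift left by 0..3 words of 32 bits.  */
static u128
model_sldw (u128 a, u128 b, unsigned int n)
{
  return model_shift_left_double (a, b, (n & 3) * 32);
}

/* Compare the model against expected values taken from the testcase.  */
void
check_models (void)
{
  u128 a = make_u128 (0x123456789ABCDEF0ULL, 0x123456789ABCDEF0ULL);
  u128 b = make_u128 (0xFEDCBA9876543210ULL, 0xFEDCBA9876543210ULL);

  assert (model_sldb (a, b, 4)
	  == make_u128 (0x23456789ABCDEF01ULL, 0x23456789ABCDEF0FULL));
  assert (model_sld (a, b, 4)
	  == make_u128 (0x9ABCDEF012345678ULL, 0x9ABCDEF0FEDCBA98ULL));
  assert (model_sldw (a, b, 2)
	  == make_u128 (0x123456789ABCDEF0ULL, 0xFEDCBA9876543210ULL));
  assert (model_srdb ((u128) 0x12345678, (u128) 0xFEDCBA90, 4)
	  == make_u128 (0x8000000000000000ULL, 0x0FEDCBA9ULL));
}
```

Calling check_models () from a small main is enough to exercise the
sketch on any host whose compiler supports __int128.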
The patch has been tested on a Power 10 system with no regressions.
Please let me know if the patch is acceptable for mainline.
Carl
---------------------------------------------------------------
rs6000, Add new overloaded vector shift builtin int128 variants
Add the signed __int128 and unsigned __int128 argument types for the
overloaded built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
vec_srdb, vec_srl, vec_sro. For each of the new argument types add a
testcase and update the documentation for the built-in.
gcc/ChangeLog:
* config/rs6000/altivec.md (vs<SLDB_lr>db_<mode>): Change
define_insn iterator to VEC_IC.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vsldoi_v1ti,
__builtin_vsx_xxsldwi_v1ti, __builtin_altivec_vsldb_v1ti,
__builtin_altivec_vsrdb_v1ti): New builtin definitions.
* config/rs6000/rs6000-overload.def (vec_sld, vec_sldb, vec_sldw,
vec_sll, vec_slo, vec_srdb, vec_srl, vec_sro): New overloaded
definitions.
* doc/extend.texi (vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
vec_srdb, vec_srl, vec_sro): Add documentation for new overloaded
built-ins.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vec-shift-double-runnable-int128.c: New test file.
---
gcc/config/rs6000/altivec.md | 6 +-
gcc/config/rs6000/rs6000-builtins.def | 12 +
gcc/config/rs6000/rs6000-overload.def | 40 ++
gcc/doc/extend.texi | 43 +++
.../vec-shift-double-runnable-int128.c | 358 ++++++++++++++++++
5 files changed, 456 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 5af9bf920a2..2a18ee44526 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -878,9 +878,9 @@ (define_int_attr SLDB_lr [(UNSPEC_SLDB "l")
(define_int_iterator VSHIFT_DBL_LR [UNSPEC_SLDB UNSPEC_SRDB])
(define_insn "vs<SLDB_lr>db_<mode>"
- [(set (match_operand:VI2 0 "register_operand" "=v")
- (unspec:VI2 [(match_operand:VI2 1 "register_operand" "v")
- (match_operand:VI2 2 "register_operand" "v")
+ [(set (match_operand:VEC_IC 0 "register_operand" "=v")
+ (unspec:VEC_IC [(match_operand:VEC_IC 1 "register_operand" "v")
+ (match_operand:VEC_IC 2 "register_operand" "v")
(match_operand:QI 3 "const_0_to_12_operand" "n")]
VSHIFT_DBL_LR))]
"TARGET_POWER10"
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 77eb0f7e406..a2b2b729270 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -964,6 +964,9 @@
const vss __builtin_altivec_vsldoi_8hi (vss, vss, const int<4>);
VSLDOI_8HI altivec_vsldoi_v8hi {}
+ const vsq __builtin_altivec_vsldoi_v1ti (vsq, vsq, const int<4>);
+ VSLDOI_V1TI altivec_vsldoi_v1ti {}
+
const vss __builtin_altivec_vslh (vss, vus);
VSLH vashlv8hi3 {}
@@ -1831,6 +1834,9 @@
const vsll __builtin_vsx_xxsldwi_2di (vsll, vsll, const int<2>);
XXSLDWI_2DI vsx_xxsldwi_v2di {}
+ const vsq __builtin_vsx_xxsldwi_v1ti (vsq, vsq, const int<2>);
+ XXSLDWI_1TI vsx_xxsldwi_v1ti {}
+
const vf __builtin_vsx_xxsldwi_4sf (vf, vf, const int<2>);
XXSLDWI_4SF vsx_xxsldwi_v4sf {}
@@ -3299,6 +3305,9 @@
const vss __builtin_altivec_vsldb_v8hi (vss, vss, const int<3>);
VSLDB_V8HI vsldb_v8hi {}
+ const vsq __builtin_altivec_vsldb_v1ti (vsq, vsq, const int<3>);
+ VSLDB_V1TI vsldb_v1ti {}
+
const vsq __builtin_altivec_vslq (vsq, vuq);
VSLQ vashlv1ti3 {}
@@ -3317,6 +3326,9 @@
const vss __builtin_altivec_vsrdb_v8hi (vss, vss, const int<3>);
VSRDB_V8HI vsrdb_v8hi {}
+ const vsq __builtin_altivec_vsrdb_v1ti (vsq, vsq, const int<3>);
+ VSRDB_V1TI vsrdb_v1ti {}
+
const vsq __builtin_altivec_vsrq (vsq, vuq);
VSRQ vlshrv1ti3 {}
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index c4ecafc6f7e..96b0ecbd675 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3399,6 +3399,10 @@
VSLDOI_4SF
vd __builtin_vec_sld (vd, vd, const int);
VSLDOI_2DF
+ vsq __builtin_vec_sld (vsq, vsq, const int);
+ VSLDOI_V1TI VSLDOI_VSQ
+ vuq __builtin_vec_sld (vuq, vuq, const int);
+ VSLDOI_V1TI VSLDOI_VUQ
[VEC_SLDB, vec_sldb, __builtin_vec_sldb]
vsc __builtin_vec_sldb (vsc, vsc, const int);
@@ -3417,6 +3421,10 @@
VSLDB_V2DI VSLDB_VSLL
vull __builtin_vec_sldb (vull, vull, const int);
VSLDB_V2DI VSLDB_VULL
+ vsq __builtin_vec_sldb (vsq, vsq, const int);
+ VSLDB_V1TI VSLDB_VSQ
+ vuq __builtin_vec_sldb (vuq, vuq, const int);
+ VSLDB_V1TI VSLDB_VUQ
[VEC_SLDW, vec_sldw, __builtin_vec_sldw]
vsc __builtin_vec_sldw (vsc, vsc, const int);
@@ -3439,6 +3447,10 @@
XXSLDWI_4SF XXSLDWI_VF
vd __builtin_vec_sldw (vd, vd, const int);
XXSLDWI_2DF XXSLDWI_VD
+ vsq __builtin_vec_sldw (vsq, vsq, const int);
+ XXSLDWI_1TI XXSLDWI_VSQ
+ vuq __builtin_vec_sldw (vuq, vuq, const int);
+ XXSLDWI_1TI XXSLDWI_VUQ
[VEC_SLL, vec_sll, __builtin_vec_sll]
vsc __builtin_vec_sll (vsc, vuc);
@@ -3459,6 +3471,10 @@
VSL VSL_VSLL
vull __builtin_vec_sll (vull, vuc);
VSL VSL_VULL
+ vsq __builtin_vec_sll (vsq, vuc);
+ VSL VSL_VSQ
+ vuq __builtin_vec_sll (vuq, vuc);
+ VSL VSL_VUQ
; The following variants are deprecated.
vsc __builtin_vec_sll (vsc, vus);
VSL VSL_VSC_VUS
@@ -3554,6 +3570,14 @@
VSLO VSLO_VFS
vf __builtin_vec_slo (vf, vuc);
VSLO VSLO_VFU
+ vsq __builtin_vec_slo (vsq, vsc);
+ VSLO VSLDO_VSQS
+ vsq __builtin_vec_slo (vsq, vuc);
+ VSLO VSLDO_VSQU
+ vuq __builtin_vec_slo (vuq, vsc);
+ VSLO VSLDO_VUQS
+ vuq __builtin_vec_slo (vuq, vuc);
+ VSLO VSLDO_VUQU
[VEC_SLV, vec_slv, __builtin_vec_vslv]
vuc __builtin_vec_vslv (vuc, vuc);
@@ -3699,6 +3723,10 @@
VSRDB_V2DI VSRDB_VSLL
vull __builtin_vec_srdb (vull, vull, const int);
VSRDB_V2DI VSRDB_VULL
+ vsq __builtin_vec_srdb (vsq, vsq, const int);
+ VSRDB_V1TI VSRDB_VSQ
+ vuq __builtin_vec_srdb (vuq, vuq, const int);
+ VSRDB_V1TI VSRDB_VUQ
[VEC_SRL, vec_srl, __builtin_vec_srl]
vsc __builtin_vec_srl (vsc, vuc);
@@ -3719,6 +3747,10 @@
VSR VSR_VSLL
vull __builtin_vec_srl (vull, vuc);
VSR VSR_VULL
+ vsq __builtin_vec_srl (vsq, vuc);
+ VSR VSR_VSQ
+ vuq __builtin_vec_srl (vuq, vuc);
+ VSR VSR_VUQ
; The following variants are deprecated.
vsc __builtin_vec_srl (vsc, vus);
VSR VSR_VSC_VUS
@@ -3808,6 +3840,14 @@
VSRO VSRO_VFS
vf __builtin_vec_sro (vf, vuc);
VSRO VSRO_VFU
+ vsq __builtin_vec_sro (vsq, vsc);
+ VSRO VSRDO_VSQS
+ vsq __builtin_vec_sro (vsq, vuc);
+ VSRO VSRDO_VSQU
+ vuq __builtin_vec_sro (vuq, vsc);
+ VSRO VSRDO_VUQS
+ vuq __builtin_vec_sro (vuq, vuc);
+ VSRO VSRDO_VUQU
[VEC_SRV, vec_srv, __builtin_vec_vsrv]
vuc __builtin_vec_vsrv (vuc, vuc);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 0b572afca72..83ff168faf6 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -23504,6 +23504,10 @@ const unsigned int);
vector signed long long, const unsigned int);
@exdent vector unsigned long long vec_sldb (vector unsigned long long,
vector unsigned long long, const unsigned int);
+@exdent vector signed __int128 vec_sldb (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sldb (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
@end smallexample
Shift the combined input vectors left by the amount specified by the low-order
@@ -23531,12 +23535,51 @@ const unsigned int);
vector signed long long, const unsigned int);
@exdent vector unsigned long long vec_srdb (vector unsigned long long,
vector unsigned long long, const unsigned int);
+@exdent vector signed __int128 vec_srdb (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_srdb (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
@end smallexample
Shift the combined input vectors right by the amount specified by the low-order
three bits of the third argument, and return the remaining 128 bits. Code
using this built-in must be endian-aware.
+@smallexample
+@exdent vector signed __int128 vec_sld (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sld (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
+@exdent vector signed __int128 vec_sldw (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sldw (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
+@exdent vector signed __int128 vec_slo (vector signed __int128,
+vector signed char);
+@exdent vector signed __int128 vec_slo (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
+vector signed char);
+@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
+vector unsigned char);
+@exdent vector signed __int128 vec_sro (vector signed __int128,
+vector signed char);
+@exdent vector signed __int128 vec_sro (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
+vector signed char);
+@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
+vector unsigned char);
+@exdent vector signed __int128 vec_srl (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_srl (vector unsigned __int128,
+vector unsigned char);
+@end smallexample
+
+The above instances are extensions of the existing overloaded built-ins
+@code{vec_sld}, @code{vec_sldw}, @code{vec_slo}, @code{vec_sro}, @code{vec_srl}
+that are documented in the PVIPR.
+
@findex vec_srdb
Vector Splat
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
new file mode 100644
index 00000000000..65e8e94ec07
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
@@ -0,0 +1,358 @@
+/* { dg-do run { target power10_hw } } */
+/* { dg-do link { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
+
+/* Need -save-temps for dg-final scan-assembler-times at end of test. */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+
+void print_i128 (unsigned __int128 val)
+{
+ printf(" 0x%016llx%016llx",
+ (unsigned long long)(val >> 64),
+ (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
+}
+#endif
+
+extern void abort (void);
+
+#if DEBUG
+#define ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME) \
+ printf ("vec_%s (vector TYPE __int128, vector TYPE __int128) \n", #NAME); \
+ printf(" src_va_s128[0] = "); \
+ print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]); \
+ printf("\n"); \
+ printf(" src_vb_uc = 0x"); \
+ for (i = 0; i < 16; i++) \
+ printf("%02x", src_vb_uc[i]); \
+ printf("\n"); \
+ printf(" vresult[0] = "); \
+ print_i128 ((unsigned __int128) vresult[0]); \
+ printf("\n"); \
+ printf(" expected_vresult[0] = "); \
+ print_i128 ((unsigned __int128) expected_vresult[0]); \
+ printf("\n");
+
+#define ACTION_2ARG_SIGNED(NAME, TYPE_NAME) \
+ printf ("vec_%s (vector TYPE __int128, vector TYPE __int128) \n", #NAME); \
+ printf(" src_va_s128[0] = "); \
+ print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]); \
+ printf("\n"); \
+ printf(" src_vb_sc = 0x"); \
+ for (i = 0; i < 16; i++) \
+ printf("%02x", src_vb_sc[i]); \
+ printf("\n"); \
+ printf(" vresult[0] = "); \
+ print_i128 ((unsigned __int128) vresult[0]); \
+ printf("\n"); \
+ printf(" expected_vresult[0] = "); \
+ print_i128 ((unsigned __int128) expected_vresult[0]); \
+ printf("\n");
+
+#define ACTION_3ARG(NAME, TYPE_NAME, CONST) \
+ printf ("vec_%s (vector TYPE __int128, vector TYPE __int128, %s) \n", \
+ #NAME, #CONST); \
+ printf(" src_va_s128[0] = "); \
+ print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]); \
+ printf("\n"); \
+ printf(" src_vb_s128[0] = "); \
+ print_i128 ((unsigned __int128) src_vb_##TYPE_NAME[0]); \
+ printf("\n"); \
+ printf(" vresult[0] = "); \
+ print_i128 ((unsigned __int128) vresult[0]); \
+ printf("\n"); \
+ printf(" expected_vresult[0] = "); \
+ print_i128 ((unsigned __int128) expected_vresult[0]); \
+ printf("\n");
+
+#else
+#define ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME) \
+ abort();
+
+#define ACTION_2ARG_SIGNED(NAME, TYPE_NAME) \
+ abort();
+
+#define ACTION_2ARG(NAME, TYPE_NAME) \
+ abort();
+
+#define ACTION_3ARG(NAME, TYPE_NAME, CONST) \
+ abort();
+#endif
+
+/* Second argument of the builtin is vector unsigned char. */
+#define TEST_2ARG_UNSIGNED(NAME, TYPE, TYPE_NAME, EXP_RESULT_HI, EXP_RESULT_LO) \
+ { \
+ vector TYPE __int128 vresult; \
+ vector TYPE __int128 expected_vresult; \
+ int i; \
+ \
+ expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
+ expected_vresult = (expected_vresult << 64) | \
+ (vector TYPE __int128) { EXP_RESULT_LO }; \
+ vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_uc); \
+ \
+ if (!vec_all_eq (vresult, expected_vresult)) { \
+ ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME) \
+ } \
+ }
+
+/* Second argument of the builtin is vector signed char. */
+#define TEST_2ARG_SIGNED(NAME, TYPE, TYPE_NAME, EXP_RESULT_HI, EXP_RESULT_LO) \
+ { \
+ vector TYPE __int128 vresult; \
+ vector TYPE __int128 expected_vresult; \
+ int i; \
+ \
+ expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
+ expected_vresult = (expected_vresult << 64) | \
+ (vector TYPE __int128) { EXP_RESULT_LO }; \
+ vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_sc); \
+ \
+ if (!vec_all_eq (vresult, expected_vresult)) { \
+ ACTION_2ARG_SIGNED(NAME, TYPE_NAME) \
+ } \
+ }
+
+#define TEST_3ARG(NAME, TYPE, TYPE_NAME, CONST, EXP_RESULT_HI, EXP_RESULT_LO) \
+ { \
+ vector TYPE __int128 vresult; \
+ vector TYPE __int128 expected_vresult; \
+ \
+ expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
+ expected_vresult = (expected_vresult << 64) | \
+ (vector TYPE __int128) { EXP_RESULT_LO }; \
+ vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_##TYPE_NAME, CONST); \
+ \
+ if (!vec_all_eq (vresult, expected_vresult)) { \
+ ACTION_3ARG(NAME, TYPE_NAME, CONST) \
+ } \
+ }
+
+int
+main (int argc, char *argv [])
+{
+ vector signed __int128 vresult_s128;
+ vector signed __int128 expected_vresult_s128;
+ vector signed __int128 src_va_s128;
+ vector signed __int128 src_vb_s128;
+ vector unsigned __int128 vresult_u128;
+ vector unsigned __int128 expected_vresult_u128;
+ vector unsigned __int128 src_va_u128;
+ vector unsigned __int128 src_vb_u128;
+ vector signed char src_vb_sc;
+ vector unsigned char src_vb_uc;
+
+ /* 128-bit vector shift right tests, vec_srdb. */
+ src_va_s128 = (vector signed __int128) {0x12345678};
+ src_vb_s128 = (vector signed __int128) {0xFEDCBA90};
+ TEST_3ARG(srdb, signed, s128, 4, 0x8000000000000000, 0xFEDCBA9)
+
+ src_va_u128 = (vector unsigned __int128) { 0xFEDCBA98 };
+ src_vb_u128 = (vector unsigned __int128) { 0x76543210};
+ TEST_3ARG(srdb, unsigned, u128, 4, 0x8000000000000000, 0x07654321)
+
+ /* 128-bit vector shift left tests, vec_sldb. */
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+ src_vb_s128 = (src_vb_s128 << 64)
+ | (vector signed __int128) {0xFEDCBA9876543210};
+ TEST_3ARG(sldb, signed, s128, 4, 0x23456789ABCDEF01, 0x23456789ABCDEF0F)
+
+ src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_vb_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_vb_u128 = src_vb_u128 << 64
+ | (vector unsigned __int128) {0x123456789ABCDEF0};
+ TEST_3ARG(sldb, unsigned, u128, 4, 0xEDCBA9876543210F, 0xEDCBA98765432101)
+
+ /* Shift left by octet tests, vec_sld. Shift is by immediate value
+ times 8. */
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+ src_vb_s128 = (src_vb_s128 << 64)
+ | (vector signed __int128) {0xFEDCBA9876543210};
+ TEST_3ARG(sld, signed, s128, 4, 0x9abcdef012345678, 0x9abcdef0fedcba98)
+
+ src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_vb_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_vb_u128 = src_vb_u128 << 64
+ | (vector unsigned __int128) {0x123456789ABCDEF0};
+ TEST_3ARG(sld, unsigned, u128, 4, 0x76543210fedcba98, 0x7654321012345678)
+
+ /* Vector left shift bytes within the vector, vec_sll. */
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ src_vb_uc = (vector unsigned char) {0x01, 0x01, 0x01, 0x01,
+ 0x01, 0x01, 0x01, 0x01,
+ 0x01, 0x01, 0x01, 0x01,
+ 0x01, 0x01, 0x01, 0x01};
+ TEST_2ARG_UNSIGNED(sll, signed, s128, 0x2468acf13579bde0,
+ 0x2468acf13579bde0)
+
+ src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_vb_uc = (vector unsigned char) {0x02, 0x02, 0x02, 0x02,
+ 0x02, 0x02, 0x02, 0x02,
+ 0x02, 0x02, 0x02, 0x02,
+ 0x02, 0x02, 0x02, 0x02};
+ TEST_2ARG_UNSIGNED(sll, unsigned, u128, 0x48d159e26af37bc0,
+ 0x48d159e26af37bc0)
+
+ /* Vector right shift bytes within the vector, vec_srl. */
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ src_vb_uc = (vector unsigned char) {0x01, 0x01, 0x01, 0x01,
+ 0x01, 0x01, 0x01, 0x01,
+ 0x01, 0x01, 0x01, 0x01,
+ 0x01, 0x01, 0x01, 0x01};
+ TEST_2ARG_UNSIGNED(srl, signed, s128, 0x091a2b3c4d5e6f78,
+ 0x091a2b3c4d5e6f78)
+
+ src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_vb_uc = (vector unsigned char) {0x02, 0x02, 0x02, 0x02,
+ 0x02, 0x02, 0x02, 0x02,
+ 0x02, 0x02, 0x02, 0x02,
+ 0x02, 0x02, 0x02, 0x02};
+ TEST_2ARG_UNSIGNED(srl, unsigned, u128, 0x48d159e26af37bc,
+ 0x48d159e26af37bc)
+
+ /* Shift left by octet tests, vec_slo. Shift is by immediate value
+ bytes. Shift amount in bits 121:124. */
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ /* Note vb_sc is Endian specific, this is just LE. */
+ /* The left shift amount is 1 byte, i.e. 1 * 8 bits. */
+ src_vb_sc = (vector signed char) {0x1 << 3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0};
+
+ TEST_2ARG_SIGNED(slo, signed, s128, 0x3456789ABCDEF012,
+ 0x3456789ABCDEF000)
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ /* Note vb_sc is Endian specific, this is just LE. */
+ /* The left shift amount is 2 bytes, i.e. 2 * 8 bits. */
+ src_vb_uc = (vector unsigned char) {0x2 << 3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0};
+ TEST_2ARG_UNSIGNED(slo, signed, s128, 0x56789ABCDEF01234,
+ 0x56789ABCDEF00000)
+
+ src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0xFEDCBA9876543210};
+ /* The left shift amount is 3 bytes, i.e. 3 * 8 bits. */
+ src_vb_sc = (vector signed char) {0x03<<3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x00, 0x00, 0x00, 0x0};
+ TEST_2ARG_SIGNED(slo, unsigned, u128, 0x9876543210FEDCBA,
+ 0x9876543210000000)
+
+ src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0xFEDCBA9876543210};
+ /* The left shift amount is 4 bytes, i.e. 4 * 8 bits. */
+ src_vb_uc = (vector unsigned char) {0x04<<3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x00, 0x00, 0x00, 0x0};
+ TEST_2ARG_UNSIGNED(slo, unsigned, u128, 0x76543210FEDCBA98,
+ 0x7654321000000000)
+
+ /* Shift right by octet tests, vec_sro. Shift is by immediate value
+ times 8. Shift amount in bits 121:124. */
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ /* Note vb_sc is Endian specific, this is just LE. */
+ /* The right shift amount is 1 byte, i.e. 1 * 8 bits. */
+ src_vb_sc = (vector signed char) {0x1 << 3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0};
+ TEST_2ARG_SIGNED(sro, signed, s128, 0x00123456789ABCDE, 0xF0123456789ABCDE)
+
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ /* Note vb_sc is Endian specific, this is just LE. */
+ /* The right shift amount is 2 bytes, i.e. 2 * 8 bits. */
+ src_vb_uc = (vector unsigned char) {0x2 << 3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0};
+ TEST_2ARG_UNSIGNED(sro, signed, s128, 0x0000123456789ABC,
+ 0xDEF0123456789ABC)
+
+ src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0xFEDCBA9876543210};
+ /* The right shift amount is 3 bytes, i.e. 3 * 8 bits. */
+ src_vb_sc = (vector signed char) {0x03<<3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x00, 0x00, 0x00, 0x0};
+ TEST_2ARG_SIGNED(sro, unsigned, u128, 0x000000FEDCBA9876,
+ 0x543210FEDCBA9876)
+
+ src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_va_u128 = src_va_u128 << 64
+ | (vector unsigned __int128) {0xFEDCBA9876543210};
+ /* The right shift amount is 4 bytes, i.e. 4 * 8 bits. */
+ src_vb_uc = (vector unsigned char) {0x04<<3, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x0, 0x0, 0x0, 0x0,
+ 0x00, 0x00, 0x00, 0x0};
+ TEST_2ARG_UNSIGNED(sro, unsigned, u128, 0x00000000FEDCBA98,
+ 0x76543210FEDCBA98)
+
+ /* 128-bit vector shift left tests, vec_sldw. */
+ src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+ src_va_s128 = (src_va_s128 << 64)
+ | (vector signed __int128) {0x123456789ABCDEF0};
+ src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+ src_vb_s128 = (src_vb_s128 << 64)
+ | (vector signed __int128) {0xFEDCBA9876543210};
+ TEST_3ARG(sldw, signed, s128, 1, 0x9ABCDEF012345678, 0x9ABCDEF0FEDCBA98)
+
+ src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_va_u128 = (src_va_u128 << 64)
+ | (vector unsigned __int128) {0x123456789ABCDEF0};
+ src_vb_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+ src_vb_u128 = (src_vb_u128 << 64)
+ | (vector unsigned __int128) {0xFEDCBA9876543210};
+ TEST_3ARG(sldw, unsigned, u128, 2, 0x123456789ABCDEF0, 0xFEDCBA9876543210)
+
+
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times {\mvsrdbi\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsldbi\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsl\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsr\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvslo\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvsro\M} 4 } } */
--
2.45.2
* Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, variants
2024-07-26 22:37 [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, variants Carl Love
@ 2024-07-29 10:21 ` Kewen.Lin
2024-07-29 15:47 ` Peter Bergner
2024-07-31 20:49 ` Carl Love
From: Kewen.Lin @ 2024-07-29 10:21 UTC
To: Carl Love; +Cc: GCC Patches, Peter Bergner, segher, David Edelsohn
Hi Carl,
on 2024/7/27 06:37, Carl Love wrote:
> GCC developers:
>
> Version 2: updated rs6000-overload.def to stop adding additional internal names and to change XXSLDWI_Q to XXSLDWI_1TI, per comments from Kewen. Moved the new documentation statement for the PVIPR built-ins, per comments from Kewen. Updated the dg-do run directive and added a comment about -save-temps in the testcase, per feedback from Segher. Retested the patch on Power 10 with no regressions.
>
> The following patch adds the int128 variants to the existing overloaded built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo, vec_srdb, vec_srl, vec_sro. These variants were requested by Steve Munroe.
>
> The patch has been tested on a Power 10 system with no regressions.
>
> Please let me know if the patch is acceptable for mainline.
>
> Carl
>
>
> ---------------------------------------------------------------
> rs6000, Add new overloaded vector shift builtin int128 variants
>
> Add the signed __int128 and unsigned __int128 argument types for the
> overloaded built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
> vec_srdb, vec_srl, vec_sro. For each of the new argument types add a
> testcase and update the documentation for the built-in.
>
> gcc/ChangeLog:
> * config/rs6000/altivec.md (vs<SLDB_lr>db_<mode>): Change
> define_insn iterator to VEC_IC.
> * config/rs6000/rs6000-builtins.def (__builtin_altivec_vsldoi_v1ti,
> __builtin_vsx_xxsldwi_v1ti, __builtin_altivec_vsldb_v1ti,
> __builtin_altivec_vsrdb_v1ti): New builtin definitions.
> * config/rs6000/rs6000-overload.def (vec_sld, vec_sldb, vec_sldw,
> vec_sll, vec_slo, vec_srdb, vec_srl, vec_sro): New overloaded
> definitions.
> * doc/extend.texi (vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
Nit: s/ / /
> vec_srdb, vec_srl, vec_sro): Add documentation for new overloaded
> built-ins.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/powerpc/vec-shift-double-runnable-int128.c: New test file.
> ---
> gcc/config/rs6000/altivec.md | 6 +-
> gcc/config/rs6000/rs6000-builtins.def | 12 +
> gcc/config/rs6000/rs6000-overload.def | 40 ++
> gcc/doc/extend.texi | 43 +++
> .../vec-shift-double-runnable-int128.c | 358 ++++++++++++++++++
> 5 files changed, 456 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
>
snip...
>
> [VEC_SRV, vec_srv, __builtin_vec_vsrv]
> vuc __builtin_vec_vsrv (vuc, vuc);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 0b572afca72..83ff168faf6 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -23504,6 +23504,10 @@ const unsigned int);
> vector signed long long, const unsigned int);
> @exdent vector unsigned long long vec_sldb (vector unsigned long long,
> vector unsigned long long, const unsigned int);
> +@exdent vector signed __int128 vec_sldb (vector signed __int128,
> +vector signed __int128, const unsigned int);
> +@exdent vector unsigned __int128 vec_sldb (vector unsigned __int128,
> +vector unsigned __int128, const unsigned int);
> @end smallexample
>
> Shift the combined input vectors left by the amount specified by the low-order
> @@ -23531,12 +23535,51 @@ const unsigned int);
> vector signed long long, const unsigned int);
> @exdent vector unsigned long long vec_srdb (vector unsigned long long,
> vector unsigned long long, const unsigned int);
> +@exdent vector signed __int128 vec_srdb (vector signed __int128,
> +vector signed __int128, const unsigned int);
> +@exdent vector unsigned __int128 vec_srdb (vector unsigned __int128,
> +vector unsigned __int128, const unsigned int);
> @end smallexample
>
> Shift the combined input vectors right by the amount specified by the low-order
> three bits of the third argument, and return the remaining 128 bits. Code
> using this built-in must be endian-aware.
>
> +@smallexample
> +@exdent vector signed __int128 vec_sld (vector signed __int128,
> +vector signed __int128, const unsigned int);
> +@exdent vector unsigned __int128 vec_sld (vector unsigned __int128,
> +vector unsigned __int128, const unsigned int);
> +@exdent vector signed __int128 vec_sldw (vector signed __int128,
> +vector signed __int128, const unsigned int);
> +@exdent vector unsigned __int128 vec_sldw (vector unsigned __int128,
> +vector unsigned __int128, const unsigned int);
> +@exdent vector signed __int128 vec_slo (vector signed __int128,
> +vector signed char);
> +@exdent vector signed __int128 vec_slo (vector signed __int128,
> +vector unsigned char);
> +@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
> +vector signed char);
> +@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
> +vector unsigned char);
> +@exdent vector signed __int128 vec_sro (vector signed __int128,
> +vector signed char);
> +@exdent vector signed __int128 vec_sro (vector signed __int128,
> +vector unsigned char);
> +@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
> +vector signed char);
> +@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
> +vector unsigned char);
> +@exdent vector signed __int128 vec_srl (vector signed __int128,
> +vector unsigned char);
> +@exdent vector unsigned __int128 vec_srl (vector unsigned __int128,
> +vector unsigned char);
> +@end smallexample
> +
> +The above instances are extensions of the existing overloaded built-ins
> +@code{vec_sld}, @code{vec_sldw}, @code{vec_slo}, @code{vec_sro}, @code{vec_srl}
> +that are documented in the PVIPR.
> +
> @findex vec_srdb
Nit: The new @smallexample section above and its associated description should be
placed after this @findex vec_srdb (otherwise it breaks the connection between the
index and the content of vec_srdb). Personally, though, I would prefer it to be
placed at the end of this node, that is, after
"int vec_any_le (vector unsigned __int128, vector unsigned __int128);
@end smallexample
" as in your previous version. Most of the entries at the beginning have their own
headings but this @smallexample section does not, so it looks a bit odd where it
is now.
>
> Vector Splat
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
> new file mode 100644
> index 00000000000..65e8e94ec07
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
> @@ -0,0 +1,358 @@
> +/* { dg-do run { target power10_hw } } */
> +/* { dg-do link { target { ! power10_hw } } } */
> +/* { dg-require-effective-target power10_ok } */
As Peter pointed out in another thread, you need an int128 effective-target check
as well; otherwise the test will fail with power10 -m32.
Another nit: power10_hw should already guarantee power10_ok, so power10_ok
is only required for dg-do link.
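As an untested sketch only, the header with the extra check added could
look roughly like the lines below. It assumes the usual int128
effective-target keyword from the testsuite and leaves the existing
power10_ok line where it is, so it only addresses the first point:

```c
/* { dg-do run { target power10_hw } } */
/* { dg-do link { target { ! power10_hw } } } */
/* { dg-require-effective-target int128 } */
/* { dg-require-effective-target power10_ok } */
```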
BR,
Kewen
> +
> +/* Need -save-temps for dg-final scan-assembler-times at end of test. */
> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> +
> +#include <altivec.h>
> +
> +#define DEBUG 0
> +
> +#if DEBUG
> +#include <stdio.h>
> +
> +void print_i128 (unsigned __int128 val)
> +{
> + printf(" 0x%016llx%016llx",
> + (unsigned long long)(val >> 64),
> + (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
> +}
> +#endif
> +
> +extern void abort (void);
> +
> +#if DEBUG
> +#define ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME) \
> + printf ("vec_%s (vector TYPE __int128, vector TYPE __int128) \n", #NAME); \
> + printf(" src_va_s128[0] = "); \
> + print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]); \
> + printf("\n"); \
> + printf(" src_vb_uc = 0x"); \
> + for (i = 0; i < 16; i++) \
> + printf("%02x", src_vb_uc[i]); \
> + printf("\n"); \
> + printf(" vresult[0] = "); \
> + print_i128 ((unsigned __int128) vresult[0]); \
> + printf("\n"); \
> + printf(" expected_vresult[0] = "); \
> + print_i128 ((unsigned __int128) expected_vresult[0]); \
> + printf("\n");
> +
> +#define ACTION_2ARG_SIGNED(NAME, TYPE_NAME) \
> + printf ("vec_%s (vector TYPE __int128, vector TYPE __int128) \n", #NAME); \
> + printf(" src_va_s128[0] = "); \
> + print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]); \
> + printf("\n"); \
> + printf(" src_vb_sc = 0x"); \
> + for (i = 0; i < 16; i++) \
> + printf("%02x", src_vb_sc[i]); \
> + printf("\n"); \
> + printf(" vresult[0] = "); \
> + print_i128 ((unsigned __int128) vresult[0]); \
> + printf("\n"); \
> + printf(" expected_vresult[0] = "); \
> + print_i128 ((unsigned __int128) expected_vresult[0]); \
> + printf("\n");
> +
> +#define ACTION_3ARG(NAME, TYPE_NAME, CONST) \
> + printf ("vec_%s (vector TYPE __int128, vector TYPE __int128, %s) \n", \
> + #NAME, #CONST); \
> + printf(" src_va_s128[0] = "); \
> + print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]); \
> + printf("\n"); \
> + printf(" src_vb_s128[0] = "); \
> + print_i128 ((unsigned __int128) src_vb_##TYPE_NAME[0]); \
> + printf("\n"); \
> + printf(" vresult[0] = "); \
> + print_i128 ((unsigned __int128) vresult[0]); \
> + printf("\n"); \
> + printf(" expected_vresult[0] = "); \
> + print_i128 ((unsigned __int128) expected_vresult[0]); \
> + printf("\n");
> +
> +#else
> +#define ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME) \
> + abort();
> +
> +#define ACTION_2ARG_SIGNED(NAME, TYPE_NAME) \
> + abort();
> +
> +#define ACTION_2ARG(NAME, TYPE_NAME) \
> + abort();
> +
> +#define ACTION_3ARG(NAME, TYPE_NAME, CONST) \
> + abort();
> +#endif
> +
> +/* Second argument of the builtin is vector unsigned char. */
> +#define TEST_2ARG_UNSIGNED(NAME, TYPE, TYPE_NAME, EXP_RESULT_HI, EXP_RESULT_LO) \
> + { \
> + vector TYPE __int128 vresult; \
> + vector TYPE __int128 expected_vresult; \
> + int i; \
> + \
> + expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
> + expected_vresult = (expected_vresult << 64) | \
> + (vector TYPE __int128) { EXP_RESULT_LO }; \
> + vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_uc); \
> + \
> + if (!vec_all_eq (vresult, expected_vresult)) { \
> + ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME) \
> + } \
> + }
> +
> +/* Second argument of the builtin is vector signed char. */
> +#define TEST_2ARG_SIGNED(NAME, TYPE, TYPE_NAME, EXP_RESULT_HI, EXP_RESULT_LO) \
> + { \
> + vector TYPE __int128 vresult; \
> + vector TYPE __int128 expected_vresult; \
> + int i; \
> + \
> + expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
> + expected_vresult = (expected_vresult << 64) | \
> + (vector TYPE __int128) { EXP_RESULT_LO }; \
> + vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_sc); \
> + \
> + if (!vec_all_eq (vresult, expected_vresult)) { \
> + ACTION_2ARG_SIGNED(NAME, TYPE_NAME) \
> + } \
> + }
> +
> +#define TEST_3ARG(NAME, TYPE, TYPE_NAME, CONST, EXP_RESULT_HI, EXP_RESULT_LO) \
> + { \
> + vector TYPE __int128 vresult; \
> + vector TYPE __int128 expected_vresult; \
> + \
> + expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
> + expected_vresult = (expected_vresult << 64) | \
> + (vector TYPE __int128) { EXP_RESULT_LO }; \
> + vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_##TYPE_NAME, CONST); \
> + \
> + if (!vec_all_eq (vresult, expected_vresult)) { \
> + ACTION_3ARG(NAME, TYPE_NAME, CONST) \
> + } \
> + }
> +
> +int
> +main (int argc, char *argv [])
> +{
> + vector signed __int128 vresult_s128;
> + vector signed __int128 expected_vresult_s128;
> + vector signed __int128 src_va_s128;
> + vector signed __int128 src_vb_s128;
> + vector unsigned __int128 vresult_u128;
> + vector unsigned __int128 expected_vresult_u128;
> + vector unsigned __int128 src_va_u128;
> + vector unsigned __int128 src_vb_u128;
> + vector signed char src_vb_sc;
> + vector unsigned char src_vb_uc;
> +
> + /* 128-bit vector shift right tests, vec_srdb. */
> + src_va_s128 = (vector signed __int128) {0x12345678};
> + src_vb_s128 = (vector signed __int128) {0xFEDCBA90};
> + TEST_3ARG(srdb, signed, s128, 4, 0x8000000000000000, 0xFEDCBA9)
> +
> + src_va_u128 = (vector unsigned __int128) { 0xFEDCBA98 };
> + src_vb_u128 = (vector unsigned __int128) { 0x76543210};
> + TEST_3ARG(srdb, unsigned, u128, 4, 0x8000000000000000, 0x07654321)
> +
> + /* 128-bit vector shift left tests, vec_sldb. */
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
> + src_vb_s128 = (src_vb_s128 << 64)
> + | (vector signed __int128) {0xFEDCBA9876543210};
> + TEST_3ARG(sldb, signed, s128, 4, 0x23456789ABCDEF01, 0x23456789ABCDEF0F)
> +
> + src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_vb_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_vb_u128 = src_vb_u128 << 64
> + | (vector unsigned __int128) {0x123456789ABCDEF0};
> + TEST_3ARG(sldb, unsigned, u128, 4, 0xEDCBA9876543210F, 0xEDCBA98765432101)
> +
> + /* Shift left by octet tests, vec_sld. Shift is by immediate value
> + times 8. */
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
> + src_vb_s128 = (src_vb_s128 << 64)
> + | (vector signed __int128) {0xFEDCBA9876543210};
> + TEST_3ARG(sld, signed, s128, 4, 0x9abcdef012345678, 0x9abcdef0fedcba98)
> +
> + src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_vb_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_vb_u128 = src_vb_u128 << 64
> + | (vector unsigned __int128) {0x123456789ABCDEF0};
> + TEST_3ARG(sld, unsigned, u128, 4, 0x76543210fedcba98, 0x7654321012345678)
> +
> + /* Vector shift left by bits (0 to 7) across the whole vector, vec_sll. */
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + src_vb_uc = (vector unsigned char) {0x01, 0x01, 0x01, 0x01,
> + 0x01, 0x01, 0x01, 0x01,
> + 0x01, 0x01, 0x01, 0x01,
> + 0x01, 0x01, 0x01, 0x01};
> + TEST_2ARG_UNSIGNED(sll, signed, s128, 0x2468acf13579bde0,
> + 0x2468acf13579bde0)
> +
> + src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_vb_uc = (vector unsigned char) {0x02, 0x02, 0x02, 0x02,
> + 0x02, 0x02, 0x02, 0x02,
> + 0x02, 0x02, 0x02, 0x02,
> + 0x02, 0x02, 0x02, 0x02};
> + TEST_2ARG_UNSIGNED(sll, unsigned, u128, 0x48d159e26af37bc0,
> + 0x48d159e26af37bc0)
> +
> + /* Vector shift right by bits (0 to 7) across the whole vector, vec_srl. */
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + src_vb_uc = (vector unsigned char) {0x01, 0x01, 0x01, 0x01,
> + 0x01, 0x01, 0x01, 0x01,
> + 0x01, 0x01, 0x01, 0x01,
> + 0x01, 0x01, 0x01, 0x01};
> + TEST_2ARG_UNSIGNED(srl, signed, s128, 0x091a2b3c4d5e6f78,
> + 0x091a2b3c4d5e6f78)
> +
> + src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_vb_uc = (vector unsigned char) {0x02, 0x02, 0x02, 0x02,
> + 0x02, 0x02, 0x02, 0x02,
> + 0x02, 0x02, 0x02, 0x02,
> + 0x02, 0x02, 0x02, 0x02};
> + TEST_2ARG_UNSIGNED(srl, unsigned, u128, 0x48d159e26af37bc,
> + 0x48d159e26af37bc)
> +
> + /* Shift left by octet tests, vec_slo. Shift is by the number of bytes
> + given in bits 121:124 of the second argument. */
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + /* Note vb_sc is Endian specific, this is just LE. */
> + /* The left shift amount is 1 byte, i.e. 1 * 8 bits. */
> + src_vb_sc = (vector signed char) {0x1 << 3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0};
> +
> + TEST_2ARG_SIGNED(slo, signed, s128, 0x3456789ABCDEF012,
> + 0x3456789ABCDEF000)
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + /* Note vb_sc is Endian specific, this is just LE. */
> + /* The left shift amount is 2 bytes, i.e. 2 * 8 bits. */
> + src_vb_uc = (vector unsigned char) {0x2 << 3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0};
> + TEST_2ARG_UNSIGNED(slo, signed, s128, 0x56789ABCDEF01234,
> + 0x56789ABCDEF00000)
> +
> + src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0xFEDCBA9876543210};
> + /* The left shift amount is 3 bytes, i.e. 3 * 8 bits. */
> + src_vb_sc = (vector signed char) {0x03<<3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x00, 0x00, 0x00, 0x0};
> + TEST_2ARG_SIGNED(slo, unsigned, u128, 0x9876543210FEDCBA,
> + 0x9876543210000000)
> +
> + src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0xFEDCBA9876543210};
> + /* The left shift amount is 4 bytes, i.e. 4 * 8 bits. */
> + src_vb_uc = (vector unsigned char) {0x04<<3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x00, 0x00, 0x00, 0x0};
> + TEST_2ARG_UNSIGNED(slo, unsigned, u128, 0x76543210FEDCBA98,
> + 0x7654321000000000)
> +
> + /* Shift right by octet tests, vec_sro. Shift is by the number of bytes
> + given in bits 121:124 of the second argument. */
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + /* Note vb_sc is Endian specific, this is just LE. */
> + /* The right shift amount is 1 byte, i.e. 1 * 8 bits. */
> + src_vb_sc = (vector signed char) {0x1 << 3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0};
> + TEST_2ARG_SIGNED(sro, signed, s128, 0x00123456789ABCDE, 0xF0123456789ABCDE)
> +
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + /* Note vb_sc is Endian specific, this is just LE. */
> + /* The right shift amount is 2 bytes, i.e. 2 * 8 bits. */
> + src_vb_uc = (vector unsigned char) {0x2 << 3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0};
> + TEST_2ARG_UNSIGNED(sro, signed, s128, 0x0000123456789ABC,
> + 0xDEF0123456789ABC)
> +
> + src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0xFEDCBA9876543210};
> + /* The right shift amount is 3 bytes, i.e. 3 * 8 bits. */
> + src_vb_sc = (vector signed char) {0x03<<3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x00, 0x00, 0x00, 0x0};
> + TEST_2ARG_SIGNED(sro, unsigned, u128, 0x000000FEDCBA9876,
> + 0x543210FEDCBA9876)
> +
> + src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_va_u128 = src_va_u128 << 64
> + | (vector unsigned __int128) {0xFEDCBA9876543210};
> + /* The right shift amount is 4 bytes, i.e. 4 * 8 bits. */
> + src_vb_uc = (vector unsigned char) {0x04<<3, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x0, 0x0, 0x0, 0x0,
> + 0x00, 0x00, 0x00, 0x0};
> + TEST_2ARG_UNSIGNED(sro, unsigned, u128, 0x00000000FEDCBA98,
> + 0x76543210FEDCBA98)
> +
> + /* 128-bit vector shift left tests, vec_sldw. */
> + src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
> + src_va_s128 = (src_va_s128 << 64)
> + | (vector signed __int128) {0x123456789ABCDEF0};
> + src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
> + src_vb_s128 = (src_vb_s128 << 64)
> + | (vector signed __int128) {0xFEDCBA9876543210};
> + TEST_3ARG(sldw, signed, s128, 1, 0x9ABCDEF012345678, 0x9ABCDEF0FEDCBA98)
> +
> + src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_va_u128 = (src_va_u128 << 64)
> + | (vector unsigned __int128) {0x123456789ABCDEF0};
> + src_vb_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
> + src_vb_u128 = (src_vb_u128 << 64)
> + | (vector unsigned __int128) {0xFEDCBA9876543210};
> + TEST_3ARG(sldw, unsigned, u128, 2, 0x123456789ABCDEF0, 0xFEDCBA9876543210)
> +
> +
> + return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mvsrdbi\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvsldbi\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvsl\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvsr\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvslo\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mvsro\M} 4 } } */
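For anyone cross-checking the expected values in the test above without a Power10 at hand, the shifts can be modelled with plain 128-bit integer arithmetic. This is only a sketch under assumptions: it needs a compiler that provides unsigned __int128 (for example GCC on a 64-bit target), every name in it (u128, make_u128, model_sldb, model_srdb, model_slo, model_sro, check_models) is made up for illustration and is not part of the patch, and it models the arithmetic result only, not the built-ins' operand encoding or endian handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; none of these exist in the patch.  */
typedef unsigned __int128 u128;

/* Build a 128-bit value from two 64-bit halves, as the test macros do.  */
static u128 make_u128 (uint64_t hi, uint64_t lo)
{
  return ((u128) hi << 64) | lo;
}

/* Model of vec_sldb: upper 128 bits of (a:b) shifted left by sh bits.
   sh is assumed to be in the range 0..7.  */
static u128 model_sldb (u128 a, u128 b, unsigned sh)
{
  return sh == 0 ? a : (a << sh) | (b >> (128 - sh));
}

/* Model of vec_srdb: lower 128 bits of (a:b) shifted right by sh bits.
   sh is assumed to be in the range 0..7.  */
static u128 model_srdb (u128 a, u128 b, unsigned sh)
{
  return sh == 0 ? b : (a << (128 - sh)) | (b >> sh);
}

/* Models of vec_slo / vec_sro: shift by whole bytes, nbytes in 0..15.  */
static u128 model_slo (u128 a, unsigned nbytes)
{
  return a << (8 * nbytes);
}

static u128 model_sro (u128 a, unsigned nbytes)
{
  return a >> (8 * nbytes);
}

/* Compare the models against expected values taken from the test.  */
static void check_models (void)
{
  u128 a = make_u128 (0x123456789ABCDEF0ULL, 0x123456789ABCDEF0ULL);
  u128 b = make_u128 (0xFEDCBA9876543210ULL, 0xFEDCBA9876543210ULL);

  /* vec_sldb (signed case), shift of 4 bits.  */
  assert (model_sldb (a, b, 4)
	  == make_u128 (0x23456789ABCDEF01ULL, 0x23456789ABCDEF0FULL));
  /* vec_srdb (signed case), shift of 4 bits.  */
  assert (model_srdb (make_u128 (0, 0x12345678), make_u128 (0, 0xFEDCBA90), 4)
	  == make_u128 (0x8000000000000000ULL, 0x0FEDCBA9ULL));
  /* vec_slo and vec_sro, shift of 1 byte.  */
  assert (model_slo (a, 1)
	  == make_u128 (0x3456789ABCDEF012ULL, 0x3456789ABCDEF000ULL));
  assert (model_sro (a, 1)
	  == make_u128 (0x00123456789ABCDEULL, 0xF0123456789ABCDEULL));
}
```

Calling check_models () from a small main is enough to exercise it. The sh == 0 special case is there because shifting a 128-bit value by 128 bits is undefined in C.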
* Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients
2024-07-29 10:21 ` Kewen.Lin
@ 2024-07-29 15:47 ` Peter Bergner
2024-07-30 2:27 ` Kewen.Lin
2024-07-30 15:17 ` Carl Love
2024-07-31 20:49 ` Carl Love
1 sibling, 2 replies; 7+ messages in thread
From: Peter Bergner @ 2024-07-29 15:47 UTC (permalink / raw)
To: Kewen.Lin, Carl Love; +Cc: GCC Patches, segher, David Edelsohn
On 7/29/24 5:21 AM, Kewen.Lin wrote:
> on 2024/7/27 06:37, Carl Love wrote:
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
>> @@ -0,0 +1,358 @@
>> +/* { dg-do run { target power10_hw } } */
>> +/* { dg-do link { target { ! power10_hw } } } */
>> +/* { dg-require-effective-target power10_ok } */
>
> As Peter pointed out in another thread, you need int128 effective target check as well,
> otherwise it will fail with power10 -m32.
>
> Another nit: power10_hw should already guarantee power10_ok, so power10_ok
> is only required for dg-do link.
I really dislike those *_ok tests. The power10_ok test doesn't verify that
the options being used to compile the test case enable Power10. It only
verifies that the assembler you're using is Power10 enabled. I agree that the
power10_hw test includes the same (useless) assembler check that power10_ok
includes, so power10_ok isn't needed.
<rant>
Those *_ok tests really should be verifying that the compiler options that will
be used to compile the test case enable the features the test case is
attempting to use.
</rant>
Maybe the following will work?
+/* { dg-do run { target power10_hw } } */
+/* { dg-do link { target { ! power10_hw } } } */
+/* { dg-require-effective-target int128 } */
...
Carl, can you try testing the above change on ltcd97-lp7 and run the test
in both 32-bit and 64-bit modes?
Peter
* Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients
2024-07-29 15:47 ` Peter Bergner
@ 2024-07-30 2:27 ` Kewen.Lin
2024-07-30 15:17 ` Carl Love
1 sibling, 0 replies; 7+ messages in thread
From: Kewen.Lin @ 2024-07-30 2:27 UTC (permalink / raw)
To: Peter Bergner, Carl Love; +Cc: GCC Patches, segher, David Edelsohn
on 2024/7/29 23:47, Peter Bergner wrote:
> On 7/29/24 5:21 AM, Kewen.Lin wrote:
>> on 2024/7/27 06:37, Carl Love wrote:
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
>>> @@ -0,0 +1,358 @@
>>> +/* { dg-do run { target power10_hw } } */
>>> +/* { dg-do link { target { ! power10_hw } } } */
>>> +/* { dg-require-effective-target power10_ok } */
>>
>> As Peter pointed out in another thread, you need int128 effective target check as well,
>> otherwise it will fail with power10 -m32.
>>
>> Another nit: power10_hw should already guarantee power10_ok, so power10_ok
>> is only required for dg-do link.
>
> I really dislike those *_ok tests. The power10_ok test doesn't verify that
> the options being used to compile the test case enable Power10. It only
> verifies that the assembler you're using is Power10 enabled. I agree that the
> power10_hw test includes the same (useless) assembler check that power10_ok
> includes, so power10_ok isn't needed.
>
> <rant>
> Those *_ok tests really should be verifying that the compiler options that will
> be used to compile the test case enable the features the test case is
> attempting to use.
> </rant>
>
>
Yes!
> Maybe the following will work?
>
> +/* { dg-do run { target power10_hw } } */
> +/* { dg-do link { target { ! power10_hw } } } */
Maybe we can replace link with compile here, since what we care about most is the
compilation and the execution result. (IMHO if it's still "link", power10_ok is useful
to stop this being tested in an environment whose assembler does not support
power10).
BR,
Kewen
> +/* { dg-require-effective-target int128 } */
> ...
>
> Carl, can you try testing the above change on ltcd97-lp7 and run the test
> in both 32-bit and 64-bit modes?
>
> Peter
>
* Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients
2024-07-29 15:47 ` Peter Bergner
2024-07-30 2:27 ` Kewen.Lin
@ 2024-07-30 15:17 ` Carl Love
2024-07-30 23:16 ` Peter Bergner
1 sibling, 1 reply; 7+ messages in thread
From: Carl Love @ 2024-07-30 15:17 UTC (permalink / raw)
To: gcc-patches, Peter Bergner, Kewen, cel
Peter, Kewen:
Per Peter's request, I did the following testing on ltcd97-lp7 which is
a Power 10 running in BE mode.
On 7/29/24 8:47 AM, Peter Bergner wrote:
> Maybe the following will work?
>
> +/* { dg-do run { target power10_hw } } */
> +/* { dg-do link { target { ! power10_hw } } } */
> +/* { dg-require-effective-target int128 } */
> ...
>
> Carl, can you try testing the above change on ltcd97-lp7 and run the test
> in both 32-bit and 64-bit modes?
I tested with the above specification and -m64 and I get
# of expected passes 8
I tested the above specification with -m32
<snip>
/home/carll/GCC/gcc-steve/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c:390:346: warning: overflow in conversion from 'long long int' to 'int' changes value from '8526495043095935640' to '-19088744' [-Woverflow]
/home/carll/GCC/gcc-steve/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c:394:60: error: '__int128' is not supported on this target
<snip>
FAIL: gcc.target/powerpc/vec-shift-double-runnable-int128.c (test for excess errors)
gcc.target/powerpc/vec-shift-double-runnable-int128.c: \\mvsrdbi\\M found 0 times
FAIL: gcc.target/powerpc/vec-shift-double-runnable-int128.c scan-assembler-times \\mvsrdbi\\M 2
gcc.target/powerpc/vec-shift-double-runnable-int128.c: \\mvsldbi\\M found 0 times
FAIL: gcc.target/powerpc/vec-shift-double-runnable-int128.c scan-assembler-times \\mvsldbi\\M 2
gcc.target/powerpc/vec-shift-double-runnable-int128.c: \\mvsl\\M found 0 times
FAIL: gcc.target/powerpc/vec-shift-double-runnable-int128.c scan-assembler-times \\mvsl\\M 2
gcc.target/powerpc/vec-shift-double-runnable-int128.c: \\mvsr\\M found 0 times
FAIL: gcc.target/powerpc/vec-shift-double-runnable-int128.c scan-assembler-times \\mvsr\\M 2
gcc.target/powerpc/vec-shift-double-runnable-int128.c: \\mvslo\\M found 0 times
FAIL: gcc.target/powerpc/vec-shift-double-runnable-int128.c scan-assembler-times \\mvslo\\M 4
gcc.target/powerpc/vec-shift-double-runnable-int128.c: \\mvsro\\M found 0 times
FAIL: gcc.target/powerpc/vec-shift-double-runnable-int128.c scan-assembler-times \\mvsro\\M 4
# of unexpected failures 7
Basically, the header is not detecting the int128.
But if I put the int128 in the dg-do run line, like vsx-builtin-20d.c
/* { dg-do run { target { power10_hw } && { int128 } } } */
/* { dg-do link { target { ! power10_hw } } } */
/* { dg-require-effective-target vsx_hw } */
I get the following with -m32:
# of unsupported tests 1
Per the comments from Kewen:
On 7/29/24 7:27 PM, Kewen.Lin wrote:
>> Maybe the following will work?
>>
>> +/* { dg-do run { target power10_hw } } */
>> +/* { dg-do link { target { ! power10_hw } } } */
> Maybe we can replace link with compile here, since what we care about most is the
> compilation and the execution result. (IMHO if it's still "link", power10_ok is useful
> to stop this being tested in an environment whose assembler does not support
> power10).
>
> BR,
> Kewen
I tried, I hope I got it right, with -m32:
/* { dg-do run { target power10_hw } } */
/* { dg-do compile { target { ! power10_hw } } } */
/* { dg-require-effective-target int128 } */
This gives:
# of unsupported tests 1
The same header with -m64 I get:
# of expected passes 8
This header seems to give us what we want on Power10 BE with -m32 and
-m64 (tested on ltcd97-lp7).
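As an aside on the "'__int128' is not supported on this target" error seen with -m32 above: whether the type is available for a given set of options can also be observed from C itself. A minimal sketch, assuming GCC's predefined __SIZEOF_INT128__ macro; the helper name have_int128 is made up for illustration, and this is not the testsuite's own int128 check.

```c
/* Hypothetical helper, not part of the GCC testsuite.
   Returns 1 when the compiler provides a 16-byte __int128 for the
   options in use, and 0 otherwise (for example with -m32 on PowerPC).  */
static int have_int128 (void)
{
#ifdef __SIZEOF_INT128__
  /* GCC predefines this macro only when __int128 is available.  */
  return sizeof (__int128) == 16;
#else
  return 0;
#endif
}
```

Guarding the test body this way is not a substitute for the dg-require-effective-target line; it only shows why the same source compiles with -m64 and is rejected with -m32.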
Carl
* Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients
2024-07-30 15:17 ` Carl Love
@ 2024-07-30 23:16 ` Peter Bergner
0 siblings, 0 replies; 7+ messages in thread
From: Peter Bergner @ 2024-07-30 23:16 UTC (permalink / raw)
To: Carl Love, gcc-patches, Kewen
On 7/30/24 10:17 AM, Carl Love wrote:
> I tried, I hope I got it right, with -m32:
>
> /* { dg-do run { target power10_hw } } */
> /* { dg-do compile { target { ! power10_hw } } } */
> /* { dg-require-effective-target int128 } */
>
> This gives:
>
> # of unsupported tests 1
>
> The same header with -m64 I get:
>
> # of expected passes 8
>
> This header seems to give us what we want on Power10 BE with -m32 and -m64 (tested on ltcd97-lp7).
I agree with Kewen that dg-do compile makes more sense than link for
the non Power10 test, so if the above works, then great!
Peter
* Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients
2024-07-29 10:21 ` Kewen.Lin
2024-07-29 15:47 ` Peter Bergner
@ 2024-07-31 20:49 ` Carl Love
1 sibling, 0 replies; 7+ messages in thread
From: Carl Love @ 2024-07-31 20:49 UTC (permalink / raw)
To: Kewen.Lin, cel; +Cc: GCC Patches, Peter Bergner, segher, David Edelsohn
Kewen:
On 7/29/24 3:21 AM, Kewen.Lin wrote:
>> +@smallexample
>> +@exdent vector signed __int128 vec_sld (vector signed __int128,
>> +vector signed __int128, const unsigned int);
>> +@exdent vector unsigned __int128 vec_sld (vector unsigned __int128,
>> +vector unsigned __int128, const unsigned int);
>> +@exdent vector signed __int128 vec_sldw (vector signed __int128,
>> +vector signed __int128, const unsigned int);
>> +@exdent vector unsigned __int128 vec_sldw (vector unsigned __int128,
>> +vector unsigned __int128, const unsigned int);
>> +@exdent vector signed __int128 vec_slo (vector signed __int128,
>> +vector signed char);
>> +@exdent vector signed __int128 vec_slo (vector signed __int128,
>> +vector unsigned char);
>> +@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
>> +vector signed char);
>> +@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
>> +vector unsigned char);
>> +@exdent vector signed __int128 vec_sro (vector signed __int128,
>> +vector signed char);
>> +@exdent vector signed __int128 vec_sro (vector signed __int128,
>> +vector unsigned char);
>> +@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
>> +vector signed char);
>> +@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
>> +vector unsigned char);
>> +@exdent vector signed __int128 vec_srl (vector signed __int128,
>> +vector unsigned char);
>> +@exdent vector unsigned __int128 vec_srl (vector unsigned __int128,
>> +vector unsigned char);
>> +@end smallexample
>> +
>> +The above instances are extension of the existing overloaded built-ins
>> +@code{vec_sld}, @code{vec_sldw}, @code{vec_slo}, @code{vec_sro}, @code{vec_srl}
>> +that are documented in the PVIPR.
>> +
>> @findex vec_srdb
> Nit: The new @smallexample section above and its description should be
> placed after this @findex vec_srdb (otherwise it breaks the connection between the
> index entry and the content of vec_srdb),
Yes, my bad. I didn't notice I got the findex vec_srdb in the wrong place.
> but personally I would prefer it at
> the end of this node, that is, after
> "int vec_any_le (vector unsigned __int128, vector unsigned __int128);
> @end smallexample
> " as in your previous version: most of the entries at the beginning have
> headings but this @smallexample section does not, so it looks a bit odd.
OK, perhaps I didn't understand where you wanted it in the previous
email. I moved it. Hopefully I have it correct this time.
>> Vector Splat
>> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
>> new file mode 100644
>> index 00000000000..65e8e94ec07
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
>> @@ -0,0 +1,358 @@
>> +/* { dg-do run { target power10_hw } } */
>> +/* { dg-do link { target { ! power10_hw } } } */
>> +/* { dg-require-effective-target power10_ok } */
> As Peter pointed out in another thread, you need int128 effective target check as well,
> otherwise it will fail with power10 -m32.
>
> Another nit: power10_hw should already guarantee power10_ok, so power10_ok
> is only required for dg-do link.
Changed to:
+/* { dg-do run { target power10_hw } } */
+/* { dg-do compile { target { ! power10_hw } } } */
+/* { dg-require-effective-target int128 } */
per the discussion/feedback from Kewen and Peter.
Carl