* [PATCH] IFN: Fix vector extraction into promoted subreg.
From: juzhe.zhong @ 2023-08-16 1:31 UTC
To: gcc-patches; +Cc: richard.sandiford, rguenther
Hi Robin, Richard and Richi,
I am wondering whether we could simply switch the VEC_EXTRACT expander type to binary, like this:
DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
- vec_extract, vec_extract)
+ vec_extract, binary)
Would that fix the sign-extension issue? We could then also remove the explicit vec_extract expander in internal-fn.cc.
Thanks.
juzhe.zhong@rivai.ai
* Re: [PATCH] IFN: Fix vector extraction into promoted subreg.
From: Richard Sandiford @ 2023-08-16 6:45 UTC
To: juzhe.zhong; +Cc: gcc-patches, rguenther
"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
> Hi Robin, Richard and Richi,
>
> I am wondering whether we could simply switch the VEC_EXTRACT expander type to binary, like this:
>
> DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
> - vec_extract, vec_extract)
> + vec_extract, binary)
>
> Would that fix the sign-extension issue? We could then also remove the explicit vec_extract expander in internal-fn.cc.
I'm not sure how that would work. The vec_extract optab takes two
modes whereas binary optabs take one mode.
However:
| #define vec_extract_direct { 3, 3, false }
This looks wrong. The numbers are argument numbers (or -1 for a return
value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range.
| #define direct_vec_extract_optab_supported_p direct_optab_supported_p
I would expect this to be convert_optab_supported_p.
On the promoted subreg thing, I think expand_vec_extract_optab_fn
should use expand_fn_using_insn.
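To illustrate the argument-number convention described above, here is a minimal standalone sketch in C. It is not GCC code: `type_selector`, `pick_type` and the string type names are hypothetical stand-ins, and the only thing taken from the thread is the rule that -1 selects the return type while a non-negative number selects that argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the first two fields of a "*_direct" triple.  */
struct type_selector
{
  int type0;
  int type1;
};

/* Return the type selected by INDEX: -1 picks the return type,
   0..NARGS-1 picks the type of that argument, and anything else is
   out of range and yields NULL.  */
static const char *
pick_type (int index, const char *ret_type,
           const char *const *arg_types, int nargs)
{
  assert (nargs >= 0);
  if (index == -1)
    return ret_type;
  if (index >= 0 && index < nargs)
    return arg_types[index];
  return NULL;
}
```

With a two-argument call (vector, index), a pair such as `{ 0, -1 }` selects the vector type and the result type, which matches the two modes the vec_extract optab is keyed on, whereas `3` selects nothing at all in this model.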
Thanks,
Richard
* Re: [PATCH] IFN: Fix vector extraction into promoted subreg.
From: Robin Dapp @ 2023-08-16 9:37 UTC
To: juzhe.zhong, gcc-patches, rguenther, richard.sandiford; +Cc: rdapp.gcc
> However:
>
> | #define vec_extract_direct { 3, 3, false }
>
> This looks wrong. The numbers are argument numbers (or -1 for a return
> value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range.
>
> | #define direct_vec_extract_optab_supported_p direct_optab_supported_p
>
> I would expect this to be convert_optab_supported_p.
>
> On the promoted subreg thing, I think expand_vec_extract_optab_fn
> should use expand_fn_using_insn.
Thanks, that is indeed much easier. Attached is a new version that is currently
bootstrapping. Does it look better?
Regards
Robin
Subject: [PATCH v2] internal-fn: Fix vector extraction into promoted subreg.
This patch fixes the case where vec_extract gets passed a promoted
subreg (e.g. from a return value). This is achieved by using
expand_convert_optab_fn instead of a separate expander function.
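As a rough illustration of why the extension matters, the following standalone C sketch models a narrow unsigned value living in a 64-bit register. It assumes, as the scan-assembler patterns in the new tests suggest, that the extracted element arrives sign-extended (as `vmv.x.s` does when the element is narrower than the register) and that an unsigned return value must then be zero-extended. The function names are hypothetical and nothing here is GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Model of extracting an 8-bit element into a 64-bit register with
   sign extension.  The xor/subtract idiom sign-extends bit 7 without
   relying on implementation-defined narrowing conversions.  */
static uint64_t
extract_sign_extended_u8 (const uint8_t *vec, unsigned idx, unsigned nelts)
{
  assert (idx < nelts);
  uint64_t v = vec[idx];
  return (v ^ 0x80) - 0x80;
}

/* Zero-extend the low 8 bits, as "andi a0,a0,0xff" would.  */
static uint64_t
zero_extend_u8 (uint64_t reg)
{
  return reg & 0xff;
}

/* Zero-extend the low 16 bits, as "slli a0,a0,48; srli a0,a0,48" would.  */
static uint64_t
zero_extend_u16 (uint64_t reg)
{
  return (reg << 48) >> 48;
}
```

If the extension step is skipped for an element such as 0xff, the register holds all ones rather than 255, which is the kind of wrong value a caller relying on the promoted form would observe.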
gcc/ChangeLog:
* internal-fn.cc (vec_extract_direct): Change type argument
numbers.
(expand_vec_extract_optab_fn): Call expand_convert_optab_fn.
(direct_vec_extract_optab_supported_p): Use
convert_optab_supported_p.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c: New test.
---
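gcc/internal-fn.cc | 44 +-----
.../rvv/autovec/vls-vlmax/vec_extract-1u.c | 63 ++++++++
.../rvv/autovec/vls-vlmax/vec_extract-2u.c | 69 +++++++++
.../rvv/autovec/vls-vlmax/vec_extract-3u.c | 69 +++++++++
.../rvv/autovec/vls-vlmax/vec_extract-4u.c | 70 +++++++++
.../rvv/autovec/vls-vlmax/vec_extract-runu.c | 137 ++++++++++++++++++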
6 files changed, 413 insertions(+), 39 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 4f2b20a79e5..5cce36a789b 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -175,7 +175,7 @@ init_internal_fns ()
#define len_store_direct { 3, 3, false }
#define mask_len_store_direct { 4, 5, false }
#define vec_set_direct { 3, 3, false }
-#define vec_extract_direct { 3, 3, false }
+#define vec_extract_direct { 0, -1, false }
#define unary_direct { 0, 0, true }
#define unary_convert_direct { -1, 0, true }
#define binary_direct { 0, 0, true }
@@ -3127,43 +3127,6 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
gcc_unreachable ();
}
-/* Expand VEC_EXTRACT optab internal function. */
-
-static void
-expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
-{
- tree lhs = gimple_call_lhs (stmt);
- tree op0 = gimple_call_arg (stmt, 0);
- tree op1 = gimple_call_arg (stmt, 1);
-
- rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-
- machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
- machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
-
- rtx src = expand_normal (op0);
- rtx pos = expand_normal (op1);
-
- class expand_operand ops[3];
- enum insn_code icode = convert_optab_handler (optab, outermode,
- extract_mode);
-
- if (icode != CODE_FOR_nothing)
- {
- create_output_operand (&ops[0], target, extract_mode);
- create_input_operand (&ops[1], src, outermode);
- create_convert_operand_from (&ops[2], pos,
- TYPE_MODE (TREE_TYPE (op1)), true);
- if (maybe_expand_insn (icode, 3, ops))
- {
- if (!rtx_equal_p (target, ops[0].value))
- emit_move_insn (target, ops[0].value);
- return;
- }
- }
- gcc_unreachable ();
-}
-
static void
expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
{
@@ -3917,6 +3880,9 @@ expand_convert_optab_fn (internal_fn fn, gcall *stmt, convert_optab optab,
#define expand_unary_convert_optab_fn(FN, STMT, OPTAB) \
expand_convert_optab_fn (FN, STMT, OPTAB, 1)
+#define expand_vec_extract_optab_fn(FN, STMT, OPTAB) \
+ expand_convert_optab_fn (FN, STMT, OPTAB, 2)
+
/* RETURN_TYPE and ARGS are a return type and argument list that are
in principle compatible with FN (which satisfies direct_internal_fn_p).
Return the types that should be used to determine whether the
@@ -4019,7 +3985,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
#define direct_mask_len_fold_left_optab_supported_p direct_optab_supported_p
#define direct_check_ptrs_optab_supported_p direct_optab_supported_p
#define direct_vec_set_optab_supported_p direct_optab_supported_p
-#define direct_vec_extract_optab_supported_p direct_optab_supported_p
+#define direct_vec_extract_optab_supported_p convert_optab_supported_p
/* Return the optab used by internal function FN. */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c
new file mode 100644
index 00000000000..a35988ff55d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx2di __attribute__((vector_size (16)));
+typedef uint32_t vnx4si __attribute__((vector_size (16)));
+typedef uint16_t vnx8hi __attribute__((vector_size (16)));
+typedef uint8_t vnx16qi __attribute__((vector_size (16)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR1(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int8_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL1(T) \
+ T (uint64_t, vnx2di, 0) \
+ T (uint64_t, vnx2di, 1) \
+ T (uint32_t, vnx4si, 0) \
+ T (uint32_t, vnx4si, 1) \
+ T (uint32_t, vnx4si, 3) \
+ T (uint16_t, vnx8hi, 0) \
+ T (uint16_t, vnx8hi, 2) \
+ T (uint16_t, vnx8hi, 6) \
+ T (uint8_t, vnx16qi, 0) \
+ T (uint8_t, vnx16qi, 1) \
+ T (uint8_t, vnx16qi, 7) \
+ T (uint8_t, vnx16qi, 11) \
+ T (uint8_t, vnx16qi, 15) \
+
+#define TEST_ALL_VAR1(T) \
+ T (uint64_t, vnx2di) \
+ T (uint32_t, vnx4si) \
+ T (uint16_t, vnx8hi) \
+ T (uint8_t, vnx16qi) \
+
+TEST_ALL1 (VEC_EXTRACT)
+TEST_ALL_VAR1 (VEC_EXTRACT_VAR1)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*ta,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*ta,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*ta,\s*ma} 3 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 9 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 17 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 6 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 4 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c
new file mode 100644
index 00000000000..8c3c16a7047
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx4di __attribute__((vector_size (32)));
+typedef uint32_t vnx8si __attribute__((vector_size (32)));
+typedef uint16_t vnx16hi __attribute__((vector_size (32)));
+typedef uint8_t vnx32qi __attribute__((vector_size (32)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR2(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int16_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL2(T) \
+ T (uint64_t, vnx4di, 0) \
+ T (uint64_t, vnx4di, 1) \
+ T (uint64_t, vnx4di, 2) \
+ T (uint64_t, vnx4di, 3) \
+ T (uint32_t, vnx8si, 0) \
+ T (uint32_t, vnx8si, 1) \
+ T (uint32_t, vnx8si, 3) \
+ T (uint32_t, vnx8si, 4) \
+ T (uint32_t, vnx8si, 7) \
+ T (uint16_t, vnx16hi, 0) \
+ T (uint16_t, vnx16hi, 1) \
+ T (uint16_t, vnx16hi, 7) \
+ T (uint16_t, vnx16hi, 8) \
+ T (uint16_t, vnx16hi, 15) \
+ T (uint8_t, vnx32qi, 0) \
+ T (uint8_t, vnx32qi, 1) \
+ T (uint8_t, vnx32qi, 15) \
+ T (uint8_t, vnx32qi, 16) \
+ T (uint8_t, vnx32qi, 31) \
+
+#define TEST_ALL_VAR2(T) \
+ T (uint64_t, vnx4di) \
+ T (uint32_t, vnx8si) \
+ T (uint16_t, vnx16hi) \
+ T (uint8_t, vnx32qi) \
+
+TEST_ALL2 (VEC_EXTRACT)
+TEST_ALL_VAR2 (VEC_EXTRACT_VAR2)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*ta,\s*ma} 5 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 15 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 23 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 6 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c
new file mode 100644
index 00000000000..ab49f29c3f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx8di __attribute__((vector_size (64)));
+typedef uint32_t vnx16si __attribute__((vector_size (64)));
+typedef uint16_t vnx32hi __attribute__((vector_size (64)));
+typedef uint8_t vnx64qi __attribute__((vector_size (64)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR3(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int32_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL3(T) \
+ T (uint64_t, vnx8di, 0) \
+ T (uint64_t, vnx8di, 2) \
+ T (uint64_t, vnx8di, 4) \
+ T (uint64_t, vnx8di, 6) \
+ T (uint32_t, vnx16si, 0) \
+ T (uint32_t, vnx16si, 2) \
+ T (uint32_t, vnx16si, 6) \
+ T (uint32_t, vnx16si, 8) \
+ T (uint32_t, vnx16si, 14) \
+ T (uint16_t, vnx32hi, 0) \
+ T (uint16_t, vnx32hi, 2) \
+ T (uint16_t, vnx32hi, 14) \
+ T (uint16_t, vnx32hi, 16) \
+ T (uint16_t, vnx32hi, 30) \
+ T (uint8_t, vnx64qi, 0) \
+ T (uint8_t, vnx64qi, 2) \
+ T (uint8_t, vnx64qi, 30) \
+ T (uint8_t, vnx64qi, 32) \
+ T (uint8_t, vnx64qi, 63) \
+
+#define TEST_ALL_VAR3(T) \
+ T (uint64_t, vnx8di) \
+ T (uint32_t, vnx16si) \
+ T (uint16_t, vnx32hi) \
+ T (uint8_t, vnx64qi) \
+
+TEST_ALL3 (VEC_EXTRACT)
+TEST_ALL_VAR3 (VEC_EXTRACT_VAR3)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*ta,\s*ma} 5 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 13 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 23 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 6 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
new file mode 100644
index 00000000000..328d426e572
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx16di __attribute__((vector_size (128)));
+typedef uint32_t vnx32si __attribute__((vector_size (128)));
+typedef uint16_t vnx64hi __attribute__((vector_size (128)));
+typedef uint8_t vnx128qi __attribute__((vector_size (128)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR4(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int64_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL4(T) \
+ T (uint64_t, vnx16di, 0) \
+ T (uint64_t, vnx16di, 4) \
+ T (uint64_t, vnx16di, 8) \
+ T (uint64_t, vnx16di, 12) \
+ T (uint32_t, vnx32si, 0) \
+ T (uint32_t, vnx32si, 4) \
+ T (uint32_t, vnx32si, 12) \
+ T (uint32_t, vnx32si, 16) \
+ T (uint32_t, vnx32si, 28) \
+ T (uint16_t, vnx64hi, 0) \
+ T (uint16_t, vnx64hi, 4) \
+ T (uint16_t, vnx64hi, 28) \
+ T (uint16_t, vnx64hi, 32) \
+ T (uint16_t, vnx64hi, 60) \
+ T (uint8_t, vnx128qi, 0) \
+ T (uint8_t, vnx128qi, 4) \
+ T (uint8_t, vnx128qi, 30) \
+ T (uint8_t, vnx128qi, 60) \
+ T (uint8_t, vnx128qi, 64) \
+ T (uint8_t, vnx128qi, 127) \
+
+#define TEST_ALL_VAR4(T) \
+ T (uint64_t, vnx16di) \
+ T (uint32_t, vnx32si) \
+ T (uint16_t, vnx64hi) \
+ T (uint8_t, vnx128qi) \
+
+TEST_ALL4 (VEC_EXTRACT)
+TEST_ALL_VAR4 (VEC_EXTRACT_VAR4)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*ta,\s*ma} 7 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*ta,\s*ma} 5 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 11 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 9 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 24 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 7 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
new file mode 100644
index 00000000000..924e40c9dbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
@@ -0,0 +1,137 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -Wno-pedantic -Wno-psabi" } */
+
+#include <assert.h>
+#include <limits.h>
+
+#include "vec_extract-1u.c"
+#include "vec_extract-2u.c"
+#include "vec_extract-3u.c"
+#include "vec_extract-4u.c"
+
+#define CHECK(S, V, IDX) \
+ __attribute__ ((noipa, optimize ("0"))) void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = (S) (INT_MAX - i); \
+ S res = vec_extract_##V##_##IDX (v); \
+ assert (res == (S) (INT_MAX - IDX)); \
+ }
+
+#define CHECK_VAR(S, V) \
+ __attribute__ ((noipa, optimize ("0"))) void check_var_##V (int32_t idx) \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = (S) (INT_MAX - i); \
+ S res = vec_extract_var_##V (v, idx); \
+ assert (res == (S) (INT_MAX - idx)); \
+ }
+
+#define RUN(S, V, IDX) check_##V##_##IDX ();
+
+#define RUN_VAR(S, V) \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ check_var_##V (i);
+
+#define RUN_ALL(T) \
+ T (uint64_t, vnx2di, 0) \
+ T (uint64_t, vnx2di, 1) \
+ T (uint32_t, vnx4si, 0) \
+ T (uint32_t, vnx4si, 1) \
+ T (uint32_t, vnx4si, 3) \
+ T (uint16_t, vnx8hi, 0) \
+ T (uint16_t, vnx8hi, 2) \
+ T (uint16_t, vnx8hi, 6) \
+ T (uint8_t, vnx16qi, 0) \
+ T (uint8_t, vnx16qi, 1) \
+ T (uint8_t, vnx16qi, 7) \
+ T (uint8_t, vnx16qi, 11) \
+ T (uint8_t, vnx16qi, 15) \
+ T (uint64_t, vnx4di, 0) \
+ T (uint64_t, vnx4di, 1) \
+ T (uint64_t, vnx4di, 2) \
+ T (uint64_t, vnx4di, 3) \
+ T (uint32_t, vnx8si, 0) \
+ T (uint32_t, vnx8si, 1) \
+ T (uint32_t, vnx8si, 3) \
+ T (uint32_t, vnx8si, 4) \
+ T (uint32_t, vnx8si, 7) \
+ T (uint16_t, vnx16hi, 0) \
+ T (uint16_t, vnx16hi, 1) \
+ T (uint16_t, vnx16hi, 7) \
+ T (uint16_t, vnx16hi, 8) \
+ T (uint16_t, vnx16hi, 15) \
+ T (uint8_t, vnx32qi, 0) \
+ T (uint8_t, vnx32qi, 1) \
+ T (uint8_t, vnx32qi, 15) \
+ T (uint8_t, vnx32qi, 16) \
+ T (uint8_t, vnx32qi, 31) \
+ T (uint64_t, vnx8di, 0) \
+ T (uint64_t, vnx8di, 2) \
+ T (uint64_t, vnx8di, 4) \
+ T (uint64_t, vnx8di, 6) \
+ T (uint32_t, vnx16si, 0) \
+ T (uint32_t, vnx16si, 2) \
+ T (uint32_t, vnx16si, 6) \
+ T (uint32_t, vnx16si, 8) \
+ T (uint32_t, vnx16si, 14) \
+ T (uint16_t, vnx32hi, 0) \
+ T (uint16_t, vnx32hi, 2) \
+ T (uint16_t, vnx32hi, 14) \
+ T (uint16_t, vnx32hi, 16) \
+ T (uint16_t, vnx32hi, 30) \
+ T (uint8_t, vnx64qi, 0) \
+ T (uint8_t, vnx64qi, 2) \
+ T (uint8_t, vnx64qi, 30) \
+ T (uint8_t, vnx64qi, 32) \
+ T (uint8_t, vnx64qi, 63) \
+ T (uint64_t, vnx16di, 0) \
+ T (uint64_t, vnx16di, 4) \
+ T (uint64_t, vnx16di, 8) \
+ T (uint64_t, vnx16di, 12) \
+ T (uint32_t, vnx32si, 0) \
+ T (uint32_t, vnx32si, 4) \
+ T (uint32_t, vnx32si, 12) \
+ T (uint32_t, vnx32si, 16) \
+ T (uint32_t, vnx32si, 28) \
+ T (uint16_t, vnx64hi, 0) \
+ T (uint16_t, vnx64hi, 4) \
+ T (uint16_t, vnx64hi, 28) \
+ T (uint16_t, vnx64hi, 32) \
+ T (uint16_t, vnx64hi, 60) \
+ T (uint8_t, vnx128qi, 0) \
+ T (uint8_t, vnx128qi, 4) \
+ T (uint8_t, vnx128qi, 30) \
+ T (uint8_t, vnx128qi, 60) \
+ T (uint8_t, vnx128qi, 64) \
+ T (uint8_t, vnx128qi, 127)
+
+#define RUN_ALL_VAR(T) \
+ T (uint64_t, vnx2di) \
+ T (uint32_t, vnx4si) \
+ T (uint16_t, vnx8hi) \
+ T (uint8_t, vnx16qi) \
+ T (uint64_t, vnx4di) \
+ T (uint32_t, vnx8si) \
+ T (uint16_t, vnx16hi) \
+ T (uint8_t, vnx32qi) \
+ T (uint64_t, vnx8di) \
+ T (uint32_t, vnx16si) \
+ T (uint16_t, vnx32hi) \
+ T (uint8_t, vnx64qi) \
+ T (uint64_t, vnx16di) \
+ T (uint32_t, vnx32si) \
+ T (uint16_t, vnx64hi) \
+ T (uint8_t, vnx128qi)
+
+RUN_ALL (CHECK)
+RUN_ALL_VAR (CHECK_VAR)
+
+int
+main ()
+{
+ RUN_ALL (RUN);
+ RUN_ALL_VAR (RUN_VAR);
+}
--
2.41.0
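For readers without a RISC-V toolchain, the pattern the new tests exercise can be tried on any target with the GCC vector extension. The sketch below is a reduced, hypothetical variant of the run test (the names `v16qi`, `extract_var` and `make_descending` are made up here); it fills the vector with `(uint8_t) (INT_MAX - i)` as the run test does, which yields elements with the top bit set, presumably so that a wrong extension of the result becomes visible.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* 16-byte vector of unsigned 8-bit elements (GCC vector extension).  */
typedef uint8_t v16qi __attribute__ ((vector_size (16)));

/* Extract element IDX from V using a variable index.  */
static uint8_t
extract_var (v16qi v, int idx)
{
  assert (idx >= 0 && idx < 16);
  return v[idx];
}

/* Build a vector whose element I is the low byte of INT_MAX - I,
   i.e. 255, 254, ..., 240.  */
static v16qi
make_descending (void)
{
  v16qi v = { 0 };
  for (int i = 0; i < 16; i++)
    v[i] = (uint8_t) (INT_MAX - i);
  return v;
}
```

Calling `extract_var (make_descending (), i)` for each `i` and comparing against `(uint8_t) (INT_MAX - i)` mirrors what `CHECK_VAR` does in the run test, minus the `noipa` and `optimize ("0")` attributes.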
* Re: [PATCH] IFN: Fix vector extraction into promoted subreg.
From: Richard Sandiford @ 2023-08-16 10:05 UTC
To: Robin Dapp; +Cc: juzhe.zhong, gcc-patches, rguenther
Robin Dapp <rdapp.gcc@gmail.com> writes:
>> However:
>>
>> | #define vec_extract_direct { 3, 3, false }
>>
>> This looks wrong. The numbers are argument numbers (or -1 for a return
>> value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range.
>>
>> | #define direct_vec_extract_optab_supported_p direct_optab_supported_p
>>
>> I would expect this to be convert_optab_supported_p.
>>
>> On the promoted subreg thing, I think expand_vec_extract_optab_fn
>> should use expand_fn_using_insn.
>
> Thanks, that is indeed much easier. Attached is a new version that is currently
> bootstrapping. Does it look better?
LGTM, thanks. OK if testing passes.
Richard
> +/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*ta,\s*ma} 6 } } */
> +/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*ta,\s*ma} 5 } } */
> +
> +/* { dg-final { scan-assembler-times {\tvslidedown.vi} 13 } } */
> +/* { dg-final { scan-assembler-times {\tvslidedown.vx} 6 } } */
> +
> +/* { dg-final { scan-assembler-times {\tvmv.x.s} 23 } } */
> +
> +/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 6 } } */
> +/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
> +/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
> new file mode 100644
> index 00000000000..328d426e572
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
> @@ -0,0 +1,70 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
> +
> +#include <stdint-gcc.h>
> +
> +typedef uint64_t vnx16di __attribute__((vector_size (128)));
> +typedef uint32_t vnx32si __attribute__((vector_size (128)));
> +typedef uint16_t vnx64hi __attribute__((vector_size (128)));
> +typedef uint8_t vnx128qi __attribute__((vector_size (128)));
> +
> +#define VEC_EXTRACT(S,V,IDX) \
> + S \
> + __attribute__((noipa)) \
> + vec_extract_##V##_##IDX (V v) \
> + { \
> + return v[IDX]; \
> + }
> +
> +#define VEC_EXTRACT_VAR4(S,V) \
> + S \
> + __attribute__((noipa)) \
> + vec_extract_var_##V (V v, int64_t idx) \
> + { \
> + return v[idx]; \
> + }
> +
> +#define TEST_ALL4(T) \
> + T (uint64_t, vnx16di, 0) \
> + T (uint64_t, vnx16di, 4) \
> + T (uint64_t, vnx16di, 8) \
> + T (uint64_t, vnx16di, 12) \
> + T (uint32_t, vnx32si, 0) \
> + T (uint32_t, vnx32si, 4) \
> + T (uint32_t, vnx32si, 12) \
> + T (uint32_t, vnx32si, 16) \
> + T (uint32_t, vnx32si, 28) \
> + T (uint16_t, vnx64hi, 0) \
> + T (uint16_t, vnx64hi, 4) \
> + T (uint16_t, vnx64hi, 28) \
> + T (uint16_t, vnx64hi, 32) \
> + T (uint16_t, vnx64hi, 60) \
> + T (uint8_t, vnx128qi, 0) \
> + T (uint8_t, vnx128qi, 4) \
> + T (uint8_t, vnx128qi, 30) \
> + T (uint8_t, vnx128qi, 60) \
> + T (uint8_t, vnx128qi, 64) \
> + T (uint8_t, vnx128qi, 127) \
> +
> +#define TEST_ALL_VAR4(T) \
> + T (uint64_t, vnx16di) \
> + T (uint32_t, vnx32si) \
> + T (uint16_t, vnx64hi) \
> + T (uint8_t, vnx128qi) \
> +
> +TEST_ALL4 (VEC_EXTRACT)
> +TEST_ALL_VAR4 (VEC_EXTRACT_VAR4)
> +
> +/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*ta,\s*ma} 7 } } */
> +/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*ta,\s*ma} 6 } } */
> +/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*ta,\s*ma} 6 } } */
> +/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*ta,\s*ma} 5 } } */
> +
> +/* { dg-final { scan-assembler-times {\tvslidedown.vi} 11 } } */
> +/* { dg-final { scan-assembler-times {\tvslidedown.vx} 9 } } */
> +
> +/* { dg-final { scan-assembler-times {\tvmv.x.s} 24 } } */
> +
> +/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 7 } } */
> +/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
> +/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
> new file mode 100644
> index 00000000000..924e40c9dbb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
> @@ -0,0 +1,137 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "-std=c99 -Wno-pedantic -Wno-psabi" } */
> +
> +#include <assert.h>
> +#include <limits.h>
> +
> +#include "vec_extract-1u.c"
> +#include "vec_extract-2u.c"
> +#include "vec_extract-3u.c"
> +#include "vec_extract-4u.c"
> +
> +#define CHECK(S, V, IDX) \
> + __attribute__ ((noipa, optimize ("0"))) void check_##V##_##IDX () \
> + { \
> + V v; \
> + for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
> + v[i] = (S) (INT_MAX - i); \
> + S res = vec_extract_##V##_##IDX (v); \
> + assert (res == (S) (INT_MAX - IDX)); \
> + }
> +
> +#define CHECK_VAR(S, V) \
> + __attribute__ ((noipa, optimize ("0"))) void check_var_##V (int32_t idx) \
> + { \
> + V v; \
> + for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
> + v[i] = (S) (INT_MAX - i); \
> + S res = vec_extract_var_##V (v, idx); \
> + assert (res == (S) (INT_MAX - idx)); \
> + }
> +
> +#define RUN(S, V, IDX) check_##V##_##IDX ();
> +
> +#define RUN_VAR(S, V) \
> + for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
> + check_var_##V (i);
> +
> +#define RUN_ALL(T) \
> + T (uint64_t, vnx2di, 0) \
> + T (uint64_t, vnx2di, 1) \
> + T (uint32_t, vnx4si, 0) \
> + T (uint32_t, vnx4si, 1) \
> + T (uint32_t, vnx4si, 3) \
> + T (uint16_t, vnx8hi, 0) \
> + T (uint16_t, vnx8hi, 2) \
> + T (uint16_t, vnx8hi, 6) \
> + T (uint8_t, vnx16qi, 0) \
> + T (uint8_t, vnx16qi, 1) \
> + T (uint8_t, vnx16qi, 7) \
> + T (uint8_t, vnx16qi, 11) \
> + T (uint8_t, vnx16qi, 15) \
> + T (uint64_t, vnx4di, 0) \
> + T (uint64_t, vnx4di, 1) \
> + T (uint64_t, vnx4di, 2) \
> + T (uint64_t, vnx4di, 3) \
> + T (uint32_t, vnx8si, 0) \
> + T (uint32_t, vnx8si, 1) \
> + T (uint32_t, vnx8si, 3) \
> + T (uint32_t, vnx8si, 4) \
> + T (uint32_t, vnx8si, 7) \
> + T (uint16_t, vnx16hi, 0) \
> + T (uint16_t, vnx16hi, 1) \
> + T (uint16_t, vnx16hi, 7) \
> + T (uint16_t, vnx16hi, 8) \
> + T (uint16_t, vnx16hi, 15) \
> + T (uint8_t, vnx32qi, 0) \
> + T (uint8_t, vnx32qi, 1) \
> + T (uint8_t, vnx32qi, 15) \
> + T (uint8_t, vnx32qi, 16) \
> + T (uint8_t, vnx32qi, 31) \
> + T (uint64_t, vnx8di, 0) \
> + T (uint64_t, vnx8di, 2) \
> + T (uint64_t, vnx8di, 4) \
> + T (uint64_t, vnx8di, 6) \
> + T (uint32_t, vnx16si, 0) \
> + T (uint32_t, vnx16si, 2) \
> + T (uint32_t, vnx16si, 6) \
> + T (uint32_t, vnx16si, 8) \
> + T (uint32_t, vnx16si, 14) \
> + T (uint16_t, vnx32hi, 0) \
> + T (uint16_t, vnx32hi, 2) \
> + T (uint16_t, vnx32hi, 14) \
> + T (uint16_t, vnx32hi, 16) \
> + T (uint16_t, vnx32hi, 30) \
> + T (uint8_t, vnx64qi, 0) \
> + T (uint8_t, vnx64qi, 2) \
> + T (uint8_t, vnx64qi, 30) \
> + T (uint8_t, vnx64qi, 32) \
> + T (uint8_t, vnx64qi, 63) \
> + T (uint64_t, vnx16di, 0) \
> + T (uint64_t, vnx16di, 4) \
> + T (uint64_t, vnx16di, 8) \
> + T (uint64_t, vnx16di, 12) \
> + T (uint32_t, vnx32si, 0) \
> + T (uint32_t, vnx32si, 4) \
> + T (uint32_t, vnx32si, 12) \
> + T (uint32_t, vnx32si, 16) \
> + T (uint32_t, vnx32si, 28) \
> + T (uint16_t, vnx64hi, 0) \
> + T (uint16_t, vnx64hi, 4) \
> + T (uint16_t, vnx64hi, 28) \
> + T (uint16_t, vnx64hi, 32) \
> + T (uint16_t, vnx64hi, 60) \
> + T (uint8_t, vnx128qi, 0) \
> + T (uint8_t, vnx128qi, 4) \
> + T (uint8_t, vnx128qi, 30) \
> + T (uint8_t, vnx128qi, 60) \
> + T (uint8_t, vnx128qi, 64) \
> + T (uint8_t, vnx128qi, 127)
> +
> +#define RUN_ALL_VAR(T) \
> + T (uint64_t, vnx2di) \
> + T (uint32_t, vnx4si) \
> + T (uint16_t, vnx8hi) \
> + T (uint8_t, vnx16qi) \
> + T (uint64_t, vnx4di) \
> + T (uint32_t, vnx8si) \
> + T (uint16_t, vnx16hi) \
> + T (uint8_t, vnx32qi) \
> + T (uint64_t, vnx8di) \
> + T (uint32_t, vnx16si) \
> + T (uint16_t, vnx32hi) \
> + T (uint8_t, vnx64qi) \
> + T (uint64_t, vnx16di) \
> + T (uint32_t, vnx32si) \
> + T (uint16_t, vnx64hi) \
> + T (uint8_t, vnx128qi)
> +
> +RUN_ALL (CHECK)
> +RUN_ALL_VAR (CHECK_VAR)
> +
> +int
> +main ()
> +{
> + RUN_ALL (RUN);
> + RUN_ALL_VAR (RUN_VAR);
> +}
* [PATCH] IFN: Fix vector extraction into promoted subreg.
@ 2023-08-15 14:02 Robin Dapp
0 siblings, 0 replies; 5+ messages in thread
From: Robin Dapp @ 2023-08-15 14:02 UTC (permalink / raw)
To: gcc-patches; +Cc: rdapp.gcc
Hi,

this patch fixes the case where vec_extract is passed a promoted
subreg as its destination (e.g. for a return value).  When such a
subreg is the destination of a vector extraction, we now extract into
a separate pseudo register and perform the necessary promotion
afterwards.

Before this patch, a result that the backend had sign-extended was
erroneously not zero-extended when the promoted subreg required it,
e.g. when used as an unsigned return value.  I added the missing test
cases for unsigned vec_extract on RISC-V that check the proper
behavior.
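
To make the extension mismatch described above concrete, here is a
minimal standalone C sketch.  It is not GCC code: it only models a 64-bit
register with plain integers, and every function name in it is
hypothetical.  It assumes the backend leaves the extracted element
sign-extended in the full register, and shows the zero-extension that an
unsigned promoted value would additionally need (the `andi` and
`slli`/`srli` sequences that the new tests scan for).

```c
#include <stdint.h>

/* Hypothetical model, not GCC code: contents of a 64-bit register after
   an extraction that sign-extends an 8-bit element.  */
static uint64_t
extract_sign_extended_8 (uint8_t elt)
{
  int v = (elt ^ 0x80) - 0x80;	/* Portable sign extension of 8 bits.  */
  return (uint64_t) (int64_t) v;
}

/* The same model for a 16-bit element.  */
static uint64_t
extract_sign_extended_16 (uint16_t elt)
{
  int v = (elt ^ 0x8000) - 0x8000;
  return (uint64_t) (int64_t) v;
}

/* Zero-extension needed by an unsigned promoted 8-bit value; this
   corresponds to "andi a0,a0,0xff".  */
static uint64_t
promote_unsigned_8 (uint64_t reg)
{
  return reg & 0xff;
}

/* Zero-extension needed by an unsigned promoted 16-bit value; this
   corresponds to "slli a0,a0,48" followed by "srli a0,a0,48".  */
static uint64_t
promote_unsigned_16 (uint64_t reg)
{
  return (reg << 48) >> 48;
}
```

In this model, extract_sign_extended_8 (0xF0) yields 0xfffffffffffffff0,
which a caller expecting a zero-extended uint8_t would misread;
promote_unsigned_8 restores 0xf0.  Elements with a clear top bit, such as
0x10, are unaffected, which is presumably why the problem only shows up
for large unsigned values.
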
Testsuite and bootstrap done on x86, aarch64 and power10.
Regards
Robin
gcc/ChangeLog:
* internal-fn.cc (expand_vec_extract_optab_fn): Handle
SUBREG_PROMOTED_VAR_P.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c: New test.
---
gcc/internal-fn.cc | 25 +++-
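 .../rvv/autovec/vls-vlmax/vec_extract-1u.c | 63 ++++++++
 .../rvv/autovec/vls-vlmax/vec_extract-2u.c | 69 +++++++++
 .../rvv/autovec/vls-vlmax/vec_extract-3u.c | 69 +++++++++
 .../rvv/autovec/vls-vlmax/vec_extract-4u.c | 70 +++++++++
 .../rvv/autovec/vls-vlmax/vec_extract-runu.c | 137 ++++++++++++++++++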
6 files changed, 430 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 4f2b20a79e5..b1b12cc8369 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3150,14 +3150,33 @@ expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
if (icode != CODE_FOR_nothing)
{
- create_output_operand (&ops[0], target, extract_mode);
+ /* Some backends like riscv sign-extend the extraction result to a full
+ Pmode register. If we are passed a promoted subreg as target make
+ sure not to use it as target directly. Instead, use a new pseudo
+ and perform the necessary extension afterwards. */
+ rtx dest = target;
+ if (target && SUBREG_P (target) && SUBREG_PROMOTED_VAR_P (target))
+ dest = gen_reg_rtx (extract_mode);
+
+ create_output_operand (&ops[0], dest, extract_mode);
+
create_input_operand (&ops[1], src, outermode);
create_convert_operand_from (&ops[2], pos,
TYPE_MODE (TREE_TYPE (op1)), true);
if (maybe_expand_insn (icode, 3, ops))
{
- if (!rtx_equal_p (target, ops[0].value))
- emit_move_insn (target, ops[0].value);
+ if (!rtx_equal_p (dest, target))
+ {
+ if (SUBREG_P (target) && SUBREG_PROMOTED_VAR_P (target))
+ {
+ /* Have convert_move perform the subreg promotion. */
+ rtx tmp = convert_to_mode (extract_mode, ops[0].value, 0);
+ convert_move (SUBREG_REG (target), tmp,
+ SUBREG_PROMOTED_SIGN (target));
+ }
+ else
+ emit_move_insn (target, dest);
+ }
return;
}
}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c
new file mode 100644
index 00000000000..a35988ff55d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx2di __attribute__((vector_size (16)));
+typedef uint32_t vnx4si __attribute__((vector_size (16)));
+typedef uint16_t vnx8hi __attribute__((vector_size (16)));
+typedef uint8_t vnx16qi __attribute__((vector_size (16)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR1(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int8_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL1(T) \
+ T (uint64_t, vnx2di, 0) \
+ T (uint64_t, vnx2di, 1) \
+ T (uint32_t, vnx4si, 0) \
+ T (uint32_t, vnx4si, 1) \
+ T (uint32_t, vnx4si, 3) \
+ T (uint16_t, vnx8hi, 0) \
+ T (uint16_t, vnx8hi, 2) \
+ T (uint16_t, vnx8hi, 6) \
+ T (uint8_t, vnx16qi, 0) \
+ T (uint8_t, vnx16qi, 1) \
+ T (uint8_t, vnx16qi, 7) \
+ T (uint8_t, vnx16qi, 11) \
+ T (uint8_t, vnx16qi, 15) \
+
+#define TEST_ALL_VAR1(T) \
+ T (uint64_t, vnx2di) \
+ T (uint32_t, vnx4si) \
+ T (uint16_t, vnx8hi) \
+ T (uint8_t, vnx16qi) \
+
+TEST_ALL1 (VEC_EXTRACT)
+TEST_ALL_VAR1 (VEC_EXTRACT_VAR1)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*ta,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*ta,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*ta,\s*ma} 3 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 9 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 17 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 6 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 4 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c
new file mode 100644
index 00000000000..8c3c16a7047
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx4di __attribute__((vector_size (32)));
+typedef uint32_t vnx8si __attribute__((vector_size (32)));
+typedef uint16_t vnx16hi __attribute__((vector_size (32)));
+typedef uint8_t vnx32qi __attribute__((vector_size (32)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR2(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int16_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL2(T) \
+ T (uint64_t, vnx4di, 0) \
+ T (uint64_t, vnx4di, 1) \
+ T (uint64_t, vnx4di, 2) \
+ T (uint64_t, vnx4di, 3) \
+ T (uint32_t, vnx8si, 0) \
+ T (uint32_t, vnx8si, 1) \
+ T (uint32_t, vnx8si, 3) \
+ T (uint32_t, vnx8si, 4) \
+ T (uint32_t, vnx8si, 7) \
+ T (uint16_t, vnx16hi, 0) \
+ T (uint16_t, vnx16hi, 1) \
+ T (uint16_t, vnx16hi, 7) \
+ T (uint16_t, vnx16hi, 8) \
+ T (uint16_t, vnx16hi, 15) \
+ T (uint8_t, vnx32qi, 0) \
+ T (uint8_t, vnx32qi, 1) \
+ T (uint8_t, vnx32qi, 15) \
+ T (uint8_t, vnx32qi, 16) \
+ T (uint8_t, vnx32qi, 31) \
+
+#define TEST_ALL_VAR2(T) \
+ T (uint64_t, vnx4di) \
+ T (uint32_t, vnx8si) \
+ T (uint16_t, vnx16hi) \
+ T (uint8_t, vnx32qi) \
+
+TEST_ALL2 (VEC_EXTRACT)
+TEST_ALL_VAR2 (VEC_EXTRACT_VAR2)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m2,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m2,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m2,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m2,\s*ta,\s*ma} 5 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 15 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 23 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 6 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c
new file mode 100644
index 00000000000..ab49f29c3f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx8di __attribute__((vector_size (64)));
+typedef uint32_t vnx16si __attribute__((vector_size (64)));
+typedef uint16_t vnx32hi __attribute__((vector_size (64)));
+typedef uint8_t vnx64qi __attribute__((vector_size (64)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR3(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int32_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL3(T) \
+ T (uint64_t, vnx8di, 0) \
+ T (uint64_t, vnx8di, 2) \
+ T (uint64_t, vnx8di, 4) \
+ T (uint64_t, vnx8di, 6) \
+ T (uint32_t, vnx16si, 0) \
+ T (uint32_t, vnx16si, 2) \
+ T (uint32_t, vnx16si, 6) \
+ T (uint32_t, vnx16si, 8) \
+ T (uint32_t, vnx16si, 14) \
+ T (uint16_t, vnx32hi, 0) \
+ T (uint16_t, vnx32hi, 2) \
+ T (uint16_t, vnx32hi, 14) \
+ T (uint16_t, vnx32hi, 16) \
+ T (uint16_t, vnx32hi, 30) \
+ T (uint8_t, vnx64qi, 0) \
+ T (uint8_t, vnx64qi, 2) \
+ T (uint8_t, vnx64qi, 30) \
+ T (uint8_t, vnx64qi, 32) \
+ T (uint8_t, vnx64qi, 63) \
+
+#define TEST_ALL_VAR3(T) \
+ T (uint64_t, vnx8di) \
+ T (uint32_t, vnx16si) \
+ T (uint16_t, vnx32hi) \
+ T (uint8_t, vnx64qi) \
+
+TEST_ALL3 (VEC_EXTRACT)
+TEST_ALL_VAR3 (VEC_EXTRACT_VAR3)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m4,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m4,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m4,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m4,\s*ta,\s*ma} 5 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 13 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 6 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 23 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 6 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
new file mode 100644
index 00000000000..328d426e572
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c
@@ -0,0 +1,70 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d -Wno-pedantic -Wno-psabi" } */
+
+#include <stdint-gcc.h>
+
+typedef uint64_t vnx16di __attribute__((vector_size (128)));
+typedef uint32_t vnx32si __attribute__((vector_size (128)));
+typedef uint16_t vnx64hi __attribute__((vector_size (128)));
+typedef uint8_t vnx128qi __attribute__((vector_size (128)));
+
+#define VEC_EXTRACT(S,V,IDX) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_##V##_##IDX (V v) \
+ { \
+ return v[IDX]; \
+ }
+
+#define VEC_EXTRACT_VAR4(S,V) \
+ S \
+ __attribute__((noipa)) \
+ vec_extract_var_##V (V v, int64_t idx) \
+ { \
+ return v[idx]; \
+ }
+
+#define TEST_ALL4(T) \
+ T (uint64_t, vnx16di, 0) \
+ T (uint64_t, vnx16di, 4) \
+ T (uint64_t, vnx16di, 8) \
+ T (uint64_t, vnx16di, 12) \
+ T (uint32_t, vnx32si, 0) \
+ T (uint32_t, vnx32si, 4) \
+ T (uint32_t, vnx32si, 12) \
+ T (uint32_t, vnx32si, 16) \
+ T (uint32_t, vnx32si, 28) \
+ T (uint16_t, vnx64hi, 0) \
+ T (uint16_t, vnx64hi, 4) \
+ T (uint16_t, vnx64hi, 28) \
+ T (uint16_t, vnx64hi, 32) \
+ T (uint16_t, vnx64hi, 60) \
+ T (uint8_t, vnx128qi, 0) \
+ T (uint8_t, vnx128qi, 4) \
+ T (uint8_t, vnx128qi, 30) \
+ T (uint8_t, vnx128qi, 60) \
+ T (uint8_t, vnx128qi, 64) \
+ T (uint8_t, vnx128qi, 127) \
+
+#define TEST_ALL_VAR4(T) \
+ T (uint64_t, vnx16di) \
+ T (uint32_t, vnx32si) \
+ T (uint16_t, vnx64hi) \
+ T (uint8_t, vnx128qi) \
+
+TEST_ALL4 (VEC_EXTRACT)
+TEST_ALL_VAR4 (VEC_EXTRACT_VAR4)
+
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m8,\s*ta,\s*ma} 7 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m8,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m8,\s*ta,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times {vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m8,\s*ta,\s*ma} 5 } } */
+
+/* { dg-final { scan-assembler-times {\tvslidedown.vi} 11 } } */
+/* { dg-final { scan-assembler-times {\tvslidedown.vx} 9 } } */
+
+/* { dg-final { scan-assembler-times {\tvmv.x.s} 24 } } */
+
+/* { dg-final { scan-assembler-times {\tandi\ta0,a0,0xff} 7 } } */
+/* { dg-final { scan-assembler-times {\tslli\ta0,a0,48} 6 } } */
+/* { dg-final { scan-assembler-times {\tsrli\ta0,a0,48} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
new file mode 100644
index 00000000000..924e40c9dbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c
@@ -0,0 +1,137 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -Wno-pedantic -Wno-psabi" } */
+
+#include <assert.h>
+#include <limits.h>
+
+#include "vec_extract-1u.c"
+#include "vec_extract-2u.c"
+#include "vec_extract-3u.c"
+#include "vec_extract-4u.c"
+
+#define CHECK(S, V, IDX) \
+ __attribute__ ((noipa, optimize ("0"))) void check_##V##_##IDX () \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = (S) (INT_MAX - i); \
+ S res = vec_extract_##V##_##IDX (v); \
+ assert (res == (S) (INT_MAX - IDX)); \
+ }
+
+#define CHECK_VAR(S, V) \
+ __attribute__ ((noipa, optimize ("0"))) void check_var_##V (int32_t idx) \
+ { \
+ V v; \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ v[i] = (S) (INT_MAX - i); \
+ S res = vec_extract_var_##V (v, idx); \
+ assert (res == (S) (INT_MAX - idx)); \
+ }
+
+#define RUN(S, V, IDX) check_##V##_##IDX ();
+
+#define RUN_VAR(S, V) \
+ for (int i = 0; i < sizeof (V) / sizeof (S); i++) \
+ check_var_##V (i);
+
+#define RUN_ALL(T) \
+ T (uint64_t, vnx2di, 0) \
+ T (uint64_t, vnx2di, 1) \
+ T (uint32_t, vnx4si, 0) \
+ T (uint32_t, vnx4si, 1) \
+ T (uint32_t, vnx4si, 3) \
+ T (uint16_t, vnx8hi, 0) \
+ T (uint16_t, vnx8hi, 2) \
+ T (uint16_t, vnx8hi, 6) \
+ T (uint8_t, vnx16qi, 0) \
+ T (uint8_t, vnx16qi, 1) \
+ T (uint8_t, vnx16qi, 7) \
+ T (uint8_t, vnx16qi, 11) \
+ T (uint8_t, vnx16qi, 15) \
+ T (uint64_t, vnx4di, 0) \
+ T (uint64_t, vnx4di, 1) \
+ T (uint64_t, vnx4di, 2) \
+ T (uint64_t, vnx4di, 3) \
+ T (uint32_t, vnx8si, 0) \
+ T (uint32_t, vnx8si, 1) \
+ T (uint32_t, vnx8si, 3) \
+ T (uint32_t, vnx8si, 4) \
+ T (uint32_t, vnx8si, 7) \
+ T (uint16_t, vnx16hi, 0) \
+ T (uint16_t, vnx16hi, 1) \
+ T (uint16_t, vnx16hi, 7) \
+ T (uint16_t, vnx16hi, 8) \
+ T (uint16_t, vnx16hi, 15) \
+ T (uint8_t, vnx32qi, 0) \
+ T (uint8_t, vnx32qi, 1) \
+ T (uint8_t, vnx32qi, 15) \
+ T (uint8_t, vnx32qi, 16) \
+ T (uint8_t, vnx32qi, 31) \
+ T (uint64_t, vnx8di, 0) \
+ T (uint64_t, vnx8di, 2) \
+ T (uint64_t, vnx8di, 4) \
+ T (uint64_t, vnx8di, 6) \
+ T (uint32_t, vnx16si, 0) \
+ T (uint32_t, vnx16si, 2) \
+ T (uint32_t, vnx16si, 6) \
+ T (uint32_t, vnx16si, 8) \
+ T (uint32_t, vnx16si, 14) \
+ T (uint16_t, vnx32hi, 0) \
+ T (uint16_t, vnx32hi, 2) \
+ T (uint16_t, vnx32hi, 14) \
+ T (uint16_t, vnx32hi, 16) \
+ T (uint16_t, vnx32hi, 30) \
+ T (uint8_t, vnx64qi, 0) \
+ T (uint8_t, vnx64qi, 2) \
+ T (uint8_t, vnx64qi, 30) \
+ T (uint8_t, vnx64qi, 32) \
+ T (uint8_t, vnx64qi, 63) \
+ T (uint64_t, vnx16di, 0) \
+ T (uint64_t, vnx16di, 4) \
+ T (uint64_t, vnx16di, 8) \
+ T (uint64_t, vnx16di, 12) \
+ T (uint32_t, vnx32si, 0) \
+ T (uint32_t, vnx32si, 4) \
+ T (uint32_t, vnx32si, 12) \
+ T (uint32_t, vnx32si, 16) \
+ T (uint32_t, vnx32si, 28) \
+ T (uint16_t, vnx64hi, 0) \
+ T (uint16_t, vnx64hi, 4) \
+ T (uint16_t, vnx64hi, 28) \
+ T (uint16_t, vnx64hi, 32) \
+ T (uint16_t, vnx64hi, 60) \
+ T (uint8_t, vnx128qi, 0) \
+ T (uint8_t, vnx128qi, 4) \
+ T (uint8_t, vnx128qi, 30) \
+ T (uint8_t, vnx128qi, 60) \
+ T (uint8_t, vnx128qi, 64) \
+ T (uint8_t, vnx128qi, 127)
+
+#define RUN_ALL_VAR(T) \
+ T (uint64_t, vnx2di) \
+ T (uint32_t, vnx4si) \
+ T (uint16_t, vnx8hi) \
+ T (uint8_t, vnx16qi) \
+ T (uint64_t, vnx4di) \
+ T (uint32_t, vnx8si) \
+ T (uint16_t, vnx16hi) \
+ T (uint8_t, vnx32qi) \
+ T (uint64_t, vnx8di) \
+ T (uint32_t, vnx16si) \
+ T (uint16_t, vnx32hi) \
+ T (uint8_t, vnx64qi) \
+ T (uint64_t, vnx16di) \
+ T (uint32_t, vnx32si) \
+ T (uint16_t, vnx64hi) \
+ T (uint8_t, vnx128qi)
+
+RUN_ALL (CHECK)
+RUN_ALL_VAR (CHECK_VAR)
+
+int
+main ()
+{
+ RUN_ALL (RUN);
+ RUN_ALL_VAR (RUN_VAR);
+}
--
2.41.0
end of thread, newest message: 2023-08-16 10:05 UTC
Thread overview: 5+ messages
2023-08-16 1:31 [PATCH] IFN: Fix vector extraction into promoted subreg juzhe.zhong
2023-08-16 6:45 ` Richard Sandiford
2023-08-16 9:37 ` Robin Dapp
2023-08-16 10:05 ` Richard Sandiford
-- strict thread matches above, loose matches on Subject: below --
2023-08-15 14:02 Robin Dapp