* [PATCH V2] RISC-V: Enable len_mask{load, store} and remove len_{load, store}
@ 2023-06-25 8:39 Juzhe-Zhong
2023-06-25 12:52 ` Jeff Law
From: Juzhe-Zhong @ 2023-06-25 8:39 UTC
To: gcc-patches
Cc: kito.cheng, kito.cheng, palmer, palmer, jeffreyalaw, rdapp.gcc,
Juzhe-Zhong
This patch enables len_mask_{load,store} to support control flow in RVV auto-vectorization.
Consider the following case:
void
f (int32_t *__restrict a,
int32_t *__restrict b,
int32_t *__restrict cond,
int n)
{
for (int i = 0; i < n; i++)
if (cond[i])
a[i] = b[i];
}
Before this patch:
<source>:9:21: missed: couldn't vectorize loop
<source>:9:21: missed: not vectorized: control flow in loop.
After this patch:
f:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e32,m1,ta,ma
vle32.v v0,0(a2)
vsetvli a6,zero,e32,m1,ta,ma
slli a4,a5,2
vmsne.vi v0,v0,0
sub a3,a3,a5
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v1,0(a1),v0.t
vse32.v v1,0(a0),v0.t
add a2,a2,a4
add a1,a1,a4
add a0,a0,a4
bne a3,zero,.L3
.L5:
ret
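The loop shape in the assembly above can be modelled in plain C. This is only an illustrative sketch, not compiler output: MODEL_VLMAX, f_scalar and f_model are hypothetical names, and the model assumes that vsetvli always grants min (n, VLMAX) elements, which real hardware is not required to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware VLMAX (elements per vector
   register); real hardware reports this through vsetvli.  */
#define MODEL_VLMAX 4

/* Scalar reference: the original conditional-copy loop.  */
static void
f_scalar (int32_t *a, const int32_t *b, const int32_t *cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      a[i] = b[i];
}

/* Model of the vectorized loop: each trip handles vl elements, and only
   lanes whose mask bit is set are loaded and stored.  */
static void
f_model (int32_t *a, const int32_t *b, const int32_t *cond, int n)
{
  while (n > 0)
    {
      /* vsetvli a5,a3,...: assumed here to yield min (n, VLMAX).  */
      int vl = n < MODEL_VLMAX ? n : MODEL_VLMAX;
      int mask[MODEL_VLMAX];

      /* vle32.v v0,0(a2) followed by vmsne.vi v0,v0,0.  */
      for (int lane = 0; lane < vl; lane++)
        mask[lane] = cond[lane] != 0;

      /* vle32.v v1,0(a1),v0.t and vse32.v v1,0(a0),v0.t.  */
      for (int lane = 0; lane < vl; lane++)
        if (mask[lane])
          a[lane] = b[lane];

      /* Pointer bumps (add a0/a1/a2) and the remaining count (sub a3).  */
      a += vl;
      b += vl;
      cond += vl;
      n -= vl;
    }
}
```

Inactive lanes are never written, so the model leaves a[i] untouched wherever cond[i] is zero, exactly as the scalar loop does.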
gcc/ChangeLog:
* config/riscv/autovec.md (len_load_<mode>): Remove.
(len_maskload<mode><vm>): New pattern.
(len_store_<mode>): Remove.
(len_maskstore<mode><vm>): New pattern.
* config/riscv/predicates.md (autovec_length_operand): New predicate.
* config/riscv/riscv-protos.h (enum insn_type): Add RVV_UNOP_M.
(expand_load_store): New function.
* config/riscv/riscv-v.cc (emit_vlmax_masked_insn): Ditto.
(emit_nonvlmax_masked_insn): Ditto.
(expand_load_store): Ditto.
* config/riscv/riscv-vector-builtins.cc (function_expander::use_contiguous_store_insn): Add avl_type operand into pred_store.
* config/riscv/vector.md: Ditto.
---
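The choice that expand_load_store makes between the VLMAX and non-VLMAX forms can be sketched as a small standalone model. All names below (model_avl_type, model_len, model_pick_form, model_len_in_register) are hypothetical and are not GCC definitions; the sketch also assumes a fixed number of units per mode, whereas the real code compares a poly_int against GET_MODE_NUNITS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; they mirror the two avl_type values used in the
   patch but are not GCC's own definitions.  */
enum model_avl_type { MODEL_VLMAX_FORM, MODEL_NONVLMAX_FORM };

struct model_len
{
  bool is_const;   /* Is the length a compile-time constant?  */
  int64_t value;   /* Meaningful only when is_const is true.  */
};

/* Pick the form: a constant length equal to the number of units in the
   mode means the whole vector is accessed, so the VLMAX form applies.  */
static enum model_avl_type
model_pick_form (struct model_len len, int64_t nunits)
{
  if (len.is_const && len.value == nunits)
    return MODEL_VLMAX_FORM;
  return MODEL_NONVLMAX_FORM;
}

/* In the non-VLMAX form, a length that is not a 5-bit unsigned immediate
   (constraint K, range 0..31) ends up in a register.  */
static bool
model_len_in_register (struct model_len len)
{
  return !(len.is_const && len.value >= 0 && len.value <= 31);
}
```

For example, with 8 units per mode, a constant length of 8 selects the VLMAX form, a constant length of 5 stays an immediate in the non-VLMAX form, and a run-time length always lives in a register.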
gcc/config/riscv/autovec.md | 22 ++-
gcc/config/riscv/predicates.md | 7 +
gcc/config/riscv/riscv-protos.h | 2 +
gcc/config/riscv/riscv-v.cc | 78 +++++++++
gcc/config/riscv/riscv-vector-builtins.cc | 1 +
gcc/config/riscv/vector.md | 10 +-
.../rvv/autovec/partial/single_rgroup-2.c | 8 +
.../rvv/autovec/partial/single_rgroup-2.h | 44 ++++++
.../rvv/autovec/partial/single_rgroup-3.c | 8 +
.../rvv/autovec/partial/single_rgroup-3.h | 149 ++++++++++++++++++
.../rvv/autovec/partial/single_rgroup_run-2.c | 10 ++
.../rvv/autovec/partial/single_rgroup_run-3.c | 22 +++
12 files changed, 346 insertions(+), 15 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 731ffe8ff89..5de43a8d647 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -22,29 +22,27 @@
;; == Loads/Stores
;; =========================================================================
-;; len_load/len_store is a sub-optimal pattern for RVV auto-vectorization support.
-;; We will replace them when len_maskload/len_maskstore is supported in loop vectorizer.
-(define_expand "len_load_<mode>"
+(define_expand "len_maskload<mode><vm>"
[(match_operand:V 0 "register_operand")
(match_operand:V 1 "memory_operand")
- (match_operand 2 "vector_length_operand")
- (match_operand 3 "const_0_operand")]
+ (match_operand 2 "autovec_length_operand")
+ (match_operand:<VM> 3 "vector_mask_operand")
+ (match_operand 4 "const_0_operand")]
"TARGET_VECTOR"
{
- riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
- riscv_vector::RVV_UNOP, operands, operands[2]);
+ riscv_vector::expand_load_store (operands, true);
DONE;
})
-(define_expand "len_store_<mode>"
+(define_expand "len_maskstore<mode><vm>"
[(match_operand:V 0 "memory_operand")
(match_operand:V 1 "register_operand")
- (match_operand 2 "vector_length_operand")
- (match_operand 3 "const_0_operand")]
+ (match_operand 2 "autovec_length_operand")
+ (match_operand:<VM> 3 "vector_mask_operand")
+ (match_operand 4 "const_0_operand")]
"TARGET_VECTOR"
{
- riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
- riscv_vector::RVV_UNOP, operands, operands[2]);
+ riscv_vector::expand_load_store (operands, false);
DONE;
})
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 04ca6ceabc7..eb975eaf994 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -276,6 +276,13 @@
(ior (match_operand 0 "pmode_register_operand")
(match_operand 0 "const_csr_operand")))
+(define_special_predicate "autovec_length_operand"
+ (ior (match_operand 0 "pmode_register_operand")
+ (ior (match_operand 0 "const_csr_operand")
+ (match_test "rtx_equal_p (op, gen_int_mode
+ (GET_MODE_NUNITS (GET_MODE (op)),
+ Pmode))"))))
+
(define_predicate "reg_or_mem_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "memory_operand")))
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 6d607dc61d1..f686edab3d1 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -143,6 +143,7 @@ enum insn_type
RVV_CMP_OP = 4,
RVV_CMP_MU_OP = RVV_CMP_OP + 2, /* +2 means mask and maskoff operand. */
RVV_UNOP_MU = RVV_UNOP + 2, /* Likewise. */
+ RVV_UNOP_M = RVV_UNOP + 2, /* Likewise. */
RVV_TERNOP = 5,
RVV_WIDEN_TERNOP = 4,
RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md. */
@@ -254,6 +255,7 @@ void expand_vec_init (rtx, rtx);
void expand_vcond (rtx *);
void expand_vec_perm (rtx, rtx, rtx, rtx);
void expand_select_vl (rtx *);
+void expand_load_store (rtx *, bool);
/* Rounding mode bitfield for fixed point VXRM. */
enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 52b9c202ec4..5518394be1e 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -864,6 +864,43 @@ emit_vlmax_cmp_mu_insn (unsigned icode, rtx *ops)
e.emit_insn ((enum insn_code) icode, ops);
}
+/* This function emits a masked instruction. */
+static void
+emit_vlmax_masked_insn (unsigned icode, int op_num, rtx *ops)
+{
+ machine_mode dest_mode = GET_MODE (ops[0]);
+ machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+ insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+ /*HAS_DEST_P*/ true,
+ /*FULLY_UNMASKED_P*/ false,
+ /*USE_REAL_MERGE_P*/ true,
+ /*HAS_AVL_P*/ true,
+ /*VLMAX_P*/ true, dest_mode,
+ mask_mode);
+ e.set_policy (TAIL_ANY);
+ e.set_policy (MASK_ANY);
+ e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a masked instruction. */
+static void
+emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
+{
+ machine_mode dest_mode = GET_MODE (ops[0]);
+ machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+ insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+ /*HAS_DEST_P*/ true,
+ /*FULLY_UNMASKED_P*/ false,
+ /*USE_REAL_MERGE_P*/ true,
+ /*HAS_AVL_P*/ true,
+ /*VLMAX_P*/ false, dest_mode,
+ mask_mode);
+ e.set_policy (TAIL_ANY);
+ e.set_policy (MASK_ANY);
+ e.set_vl (avl);
+ e.emit_insn ((enum insn_code) icode, ops);
+}
+
/* This function emits a masked instruction. */
void
emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -2746,4 +2783,45 @@ expand_select_vl (rtx *ops)
emit_insn (gen_no_side_effects_vsetvl_rtx (rvv_mode, ops[0], ops[1]));
}
+/* Expand LEN_MASK_{LOAD,STORE}. */
+void
+expand_load_store (rtx *ops, bool is_load)
+{
+ poly_int64 value;
+ rtx len = ops[2];
+ rtx mask = ops[3];
+ machine_mode mode = GET_MODE (ops[0]);
+
+ if (poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode)))
+ {
+ /* If the length operand is equal to VF, it is VLMAX load/store. */
+ if (is_load)
+ {
+ rtx m_ops[] = {ops[0], mask, RVV_VUNDEF (mode), ops[1]};
+ emit_vlmax_masked_insn (code_for_pred_mov (mode), RVV_UNOP_M, m_ops);
+ }
+ else
+ {
+ len = gen_reg_rtx (Pmode);
+ emit_vlmax_vsetvl (mode, len);
+ emit_insn (gen_pred_store (mode, ops[0], mask, ops[1], len,
+ get_avl_type_rtx (VLMAX)));
+ }
+ }
+ else
+ {
+ if (!satisfies_constraint_K (len))
+ len = force_reg (Pmode, len);
+ if (is_load)
+ {
+ rtx m_ops[] = {ops[0], mask, RVV_VUNDEF (mode), ops[1]};
+ emit_nonvlmax_masked_insn (code_for_pred_mov (mode), RVV_UNOP_M,
+ m_ops, len);
+ }
+ else
+ emit_insn (gen_pred_store (mode, ops[0], mask, ops[1], len,
+ get_avl_type_rtx (NONVLMAX)));
+ }
+}
+
} // namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 9e6dae98a6d..466e36d50b7 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3636,6 +3636,7 @@ function_expander::use_contiguous_store_insn (insn_code icode)
for (int argno = arg_offset; argno < call_expr_nargs (exp); argno++)
add_input_operand (argno);
+ add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
return generate_insn (icode);
}
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 741c30e3f2d..53ec94fe9c8 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1068,6 +1068,7 @@
(unspec:<VM>
[(match_operand:<VM> 1 "vector_mask_operand" "vmWc1")
(match_operand 3 "vector_length_operand" " rK")
+ (match_operand 4 "const_int_operand" " i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
(match_operand:V 2 "register_operand" " vr")
@@ -1076,7 +1077,7 @@
"vse<sew>.v\t%2,%0%p1"
[(set_attr "type" "vste")
(set_attr "mode" "<MODE>")
- (set (attr "avl_type") (symbol_ref "riscv_vector::NONVLMAX"))
+ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))
(set_attr "vl_op_idx" "3")])
;; vlm.v/vsm.v/vmclr.m/vmset.m.
@@ -1118,6 +1119,7 @@
(unspec:VB
[(match_operand:VB 1 "vector_all_trues_mask_operand" "Wc1")
(match_operand 3 "vector_length_operand" " rK")
+ (match_operand 4 "const_int_operand" " i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
(match_operand:VB 2 "register_operand" " vr")
@@ -1126,7 +1128,7 @@
"vsm.v\t%2,%0"
[(set_attr "type" "vstm")
(set_attr "mode" "<MODE>")
- (set (attr "avl_type") (symbol_ref "riscv_vector::NONVLMAX"))
+ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))
(set_attr "vl_op_idx" "3")])
(define_insn "@pred_merge<mode>"
@@ -1438,6 +1440,7 @@
(unspec:<VM>
[(match_operand:<VM> 1 "vector_mask_operand" "vmWc1")
(match_operand 4 "vector_length_operand" " rK")
+ (match_operand 5 "const_int_operand" " i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
(unspec:V
@@ -1447,7 +1450,8 @@
"TARGET_VECTOR"
"vsse<sew>.v\t%3,%0,%z2%p1"
[(set_attr "type" "vsts")
- (set_attr "mode" "<MODE>")])
+ (set_attr "mode" "<MODE>")
+ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))])
;; -------------------------------------------------------------------------------
;; ---- Predicated indexed loads/stores
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c
new file mode 100644
index 00000000000..24490dc6bc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfhmin -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -fdump-tree-vect-details" } */
+
+#include "single_rgroup-2.h"
+
+TEST_ALL (test_1)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h
new file mode 100644
index 00000000000..a94f3eb0f06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h
@@ -0,0 +1,44 @@
+#include <assert.h>
+#include <stdint-gcc.h>
+
+#define N 777
+
+#define test_1(TYPE) \
+ TYPE a_##TYPE[N] = {0}; \
+ TYPE b_##TYPE[N] = {0}; \
+ void __attribute__ ((noinline, noclone)) \
+ test_1_##TYPE (int *__restrict cond) \
+ { \
+ unsigned int i = 0; \
+ for (i = 0; i < 8; i++) \
+ if (cond[i]) \
+ b_##TYPE[i] = a_##TYPE[i]; \
+ }
+
+#define run_1(TYPE) \
+ int cond_##TYPE[N] = {0}; \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 33 + 1 + 109; \
+ for (unsigned int i = 0; i < N; i++) \
+ cond_##TYPE[i] = i & 1; \
+ test_1_##TYPE (cond_##TYPE); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond_##TYPE[i] && i < 8) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define TEST_ALL(T) \
+ T (int8_t) \
+ T (uint8_t) \
+ T (int16_t) \
+ T (uint16_t) \
+ T (int32_t) \
+ T (uint32_t) \
+ T (int64_t) \
+ T (uint64_t) \
+ T (_Float16) \
+ T (float) \
+ T (double)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c
new file mode 100644
index 00000000000..9cbae13de06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfhmin -mabi=ilp32d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include "single_rgroup-3.h"
+
+TEST_ALL (test_1)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h
new file mode 100644
index 00000000000..e60e0b1ae33
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h
@@ -0,0 +1,149 @@
+#include <assert.h>
+#include <stdint-gcc.h>
+
+#define N 777
+
+int cond[N] = {0};
+#define test_1(TYPE) \
+ TYPE a_##TYPE[N]; \
+ TYPE b_##TYPE[N]; \
+ void __attribute__ ((noinline, noclone)) test_1_##TYPE (unsigned int n) \
+ { \
+ unsigned int i = 0; \
+ for (i = 0; i < n; i++) \
+ if (cond[i]) \
+ b_##TYPE[i] = a_##TYPE[i]; \
+ }
+
+#define run_1(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 33 + 1 + 109; \
+ test_1_##TYPE (5); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 5) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_2(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 57 + 1 + 999; \
+ test_1_##TYPE (17); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 17) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_3(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 77 + 1 + 3; \
+ test_1_##TYPE (32); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 32) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_4(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 45 + 1 + 11; \
+ test_1_##TYPE (128); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 128) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_5(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 199 + 1 + 79; \
+ test_1_##TYPE (177); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 177) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_6(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 377 + 1 + 73; \
+ test_1_##TYPE (255); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 255) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_7(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 98 + 1 + 66; \
+ test_1_##TYPE (333); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 333) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_8(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 7 + 1 * 7; \
+ test_1_##TYPE (512); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 512) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_9(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 + 1 + 88; \
+ test_1_##TYPE (637); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 637) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define run_10(TYPE) \
+ for (unsigned int i = 0; i < N; i++) \
+ a_##TYPE[i] = i * 2 * 331 + 1 + 547; \
+ test_1_##TYPE (777); \
+ for (unsigned int i = 0; i < N; i++) \
+ { \
+ if (cond[i] && i < 777) \
+ assert (b_##TYPE[i] == a_##TYPE[i]); \
+ else \
+ assert (b_##TYPE[i] == 0); \
+ }
+
+#define TEST_ALL(T) \
+ T (int8_t) \
+ T (uint8_t) \
+ T (int16_t) \
+ T (uint16_t) \
+ T (int32_t) \
+ T (uint32_t) \
+ T (int64_t) \
+ T (uint64_t) \
+ T (_Float16) \
+ T (float) \
+ T (double)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c
new file mode 100644
index 00000000000..8767efe2382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c
@@ -0,0 +1,10 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=fixed-vlmax" } */
+
+#include "single_rgroup-2.c"
+
+int main (void)
+{
+ TEST_ALL (run_1)
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c
new file mode 100644
index 00000000000..9ff6e928697
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c
@@ -0,0 +1,22 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable" } */
+
+#include "single_rgroup-3.c"
+
+int
+main (void)
+{
+ for (int i = 0; i < N; i++)
+ cond[i] = i & 1;
+ TEST_ALL (run_1)
+ TEST_ALL (run_2)
+ TEST_ALL (run_3)
+ TEST_ALL (run_4)
+ TEST_ALL (run_5)
+ TEST_ALL (run_6)
+ TEST_ALL (run_7)
+ TEST_ALL (run_8)
+ TEST_ALL (run_9)
+ TEST_ALL (run_10)
+ return 0;
+}
--
2.36.1
* Re: [PATCH V2] RISC-V: Enable len_mask{load, store} and remove len_{load, store}
2023-06-25 8:39 [PATCH V2] RISC-V: Enable len_mask{load, store} and remove len_{load, store} Juzhe-Zhong
@ 2023-06-25 12:52 ` Jeff Law
2023-06-25 13:52 ` Li, Pan2
From: Jeff Law @ 2023-06-25 12:52 UTC
To: Juzhe-Zhong, gcc-patches
Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc
On 6/25/23 02:39, Juzhe-Zhong wrote:
> This patch enables len_mask_{load,store} to support control flow in RVV auto-vectorization.
>
> Consider the following case:
> void
> f (int32_t *__restrict a,
> int32_t *__restrict b,
> int32_t *__restrict cond,
> int n)
> {
> for (int i = 0; i < n; i++)
> if (cond[i])
> a[i] = b[i];
> }
>
> Before this patch:
> <source>:9:21: missed: couldn't vectorize loop
> <source>:9:21: missed: not vectorized: control flow in loop.
>
> After this patch:
> f:
> ble a3,zero,.L5
> .L3:
> vsetvli a5,a3,e32,m1,ta,ma
> vle32.v v0,0(a2)
> vsetvli a6,zero,e32,m1,ta,ma
> slli a4,a5,2
> vmsne.vi v0,v0,0
> sub a3,a3,a5
> vsetvli zero,a5,e32,m1,ta,ma
> vle32.v v1,0(a1),v0.t
> vse32.v v1,0(a0),v0.t
> add a2,a2,a4
> add a1,a1,a4
> add a0,a0,a4
> bne a3,zero,.L3
> .L5:
> ret
>
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (len_load_<mode>): Remove.
> (len_maskload<mode><vm>): New pattern.
> (len_store_<mode>): Remove.
> (len_maskstore<mode><vm>): New pattern.
> * config/riscv/predicates.md (autovec_length_operand): New predicate.
> * config/riscv/riscv-protos.h (enum insn_type): New enum.
> (expand_load_store): New function.
> * config/riscv/riscv-v.cc (emit_vlmax_masked_insn): Ditto.
> (emit_nonvlmax_masked_insn): Ditto.
> (expand_load_store): Ditto.
> * config/riscv/riscv-vector-builtins.cc (function_expander::use_contiguous_store_insn): Add avl_type operand into pred_store.
> * config/riscv/vector.md: Ditto.
OK
jeff
* RE: [PATCH V2] RISC-V: Enable len_mask{load, store} and remove len_{load, store}
2023-06-25 12:52 ` Jeff Law
@ 2023-06-25 13:52 ` Li, Pan2
From: Li, Pan2 @ 2023-06-25 13:52 UTC
To: Jeff Law, Juzhe-Zhong, gcc-patches
Cc: kito.cheng, kito.cheng, palmer, palmer, rdapp.gcc
Committed, thanks Jeff.
Pan