public inbox for gcc-cvs@sourceware.org
* [gcc(refs/vendors/riscv/heads/gcc-13-with-riscv-opts)] RISC-V: Enable len_mask{load, store} and remove len_{load, store}
@ 2023-06-26 20:57 Jeff Law
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Law @ 2023-06-26 20:57 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:a0af5aebed1056706c01de66ae4997827b17c350

commit a0af5aebed1056706c01de66ae4997827b17c350
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Sun Jun 25 16:39:52 2023 +0800

    RISC-V: Enable len_mask{load, store} and remove len_{load, store}
    
    This patch enables len_mask_{load,store} to support control flow in RVV auto-vectorization.
    
    Consider the following case:
    void
    f (int32_t *__restrict a,
       int32_t *__restrict b,
       int32_t *__restrict cond,
       int n)
    {
      for (int i = 0; i < n; i++)
        if (cond[i])
          a[i] = b[i];
    }
    
    Before this patch:
    <source>:9:21: missed: couldn't vectorize loop
    <source>:9:21: missed: not vectorized: control flow in loop.
    
    After this patch:
    f:
            ble     a3,zero,.L5
    .L3:
            vsetvli a5,a3,e32,m1,ta,ma
            vle32.v v0,0(a2)
            vsetvli a6,zero,e32,m1,ta,ma
            slli    a4,a5,2
            vmsne.vi        v0,v0,0
            sub     a3,a3,a5
            vsetvli zero,a5,e32,m1,ta,ma
            vle32.v v1,0(a1),v0.t
            vse32.v v1,0(a0),v0.t
            add     a2,a2,a4
            add     a1,a1,a4
            add     a0,a0,a4
            bne     a3,zero,.L3
    .L5:
            ret
    
    gcc/ChangeLog:
    
            * config/riscv/autovec.md (len_load_<mode>): Remove.
            (len_maskload<mode><vm>): New pattern.
            (len_store_<mode>): Remove.
            (len_maskstore<mode><vm>): New pattern.
            * config/riscv/predicates.md (autovec_length_operand): New predicate.
            * config/riscv/riscv-protos.h (enum insn_type): Add RVV_UNOP_M.
            (expand_load_store): New function.
            * config/riscv/riscv-v.cc (emit_vlmax_masked_insn): Ditto.
            (emit_nonvlmax_masked_insn): Ditto.
            (expand_load_store): Ditto.
            * config/riscv/riscv-vector-builtins.cc
            (function_expander::use_contiguous_store_insn): Add avl_type operand
            into pred_store.
            * config/riscv/vector.md: Ditto.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c: New test.
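
For readers who want the semantics of the vectorized loop in the commit message without reading RVV assembly, here is a minimal scalar sketch in C. It is an illustration, not code from the patch. `f_model` and `VLMAX` are hypothetical names, `VLMAX` is fixed at 4 elements purely for the example, and the sketch assumes `vsetvli` returns `min(n, VLMAX)`, which the RVV specification permits but does not require in every case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of 32-bit elements per vector register.  */
enum { VLMAX = 4 };

/* Scalar model of the strip-mined loop shown in the commit message.
   Each iteration handles `vl` elements; only lanes whose mask bit is
   set are loaded from `b` and stored to `a`.  */
static void
f_model (int32_t *restrict a, const int32_t *restrict b,
         const int32_t *restrict cond, int n)
{
  while (n > 0)
    {
      /* vsetvli a5,a3,e32,m1,ta,ma (assumed to yield min (n, VLMAX)).  */
      int vl = n < VLMAX ? n : VLMAX;
      uint8_t mask[VLMAX];
      int32_t v1[VLMAX];

      /* vle32.v v0,0(a2) followed by vmsne.vi v0,v0,0.  */
      for (int i = 0; i < vl; i++)
        mask[i] = cond[i] != 0;

      /* vle32.v v1,0(a1),v0.t: length- and mask-controlled load.
         Inactive lanes of v1 are left unset and never read.  */
      for (int i = 0; i < vl; i++)
        if (mask[i])
          v1[i] = b[i];

      /* vse32.v v1,0(a0),v0.t: length- and mask-controlled store.  */
      for (int i = 0; i < vl; i++)
        if (mask[i])
          a[i] = v1[i];

      a += vl;
      b += vl;
      cond += vl;
      n -= vl;
    }
}
```

The length bounds each strip and the mask selects lanes within it, which is why one pattern taking both operands covers the conditional loop.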

Diff:
---
 gcc/config/riscv/autovec.md                        |  22 ++-
 gcc/config/riscv/predicates.md                     |   7 +
 gcc/config/riscv/riscv-protos.h                    |   2 +
 gcc/config/riscv/riscv-v.cc                        |  78 +++++++++++
 gcc/config/riscv/riscv-vector-builtins.cc          |   1 +
 gcc/config/riscv/vector.md                         |  10 +-
 .../riscv/rvv/autovec/partial/single_rgroup-2.c    |   8 ++
 .../riscv/rvv/autovec/partial/single_rgroup-2.h    |  44 ++++++
 .../riscv/rvv/autovec/partial/single_rgroup-3.c    |   8 ++
 .../riscv/rvv/autovec/partial/single_rgroup-3.h    | 149 +++++++++++++++++++++
 .../rvv/autovec/partial/single_rgroup_run-2.c      |  10 ++
 .../rvv/autovec/partial/single_rgroup_run-3.c      |  22 +++
 12 files changed, 346 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 731ffe8ff89..5de43a8d647 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -22,29 +22,27 @@
 ;; == Loads/Stores
 ;; =========================================================================
 
-;; len_load/len_store is a sub-optimal pattern for RVV auto-vectorization support.
-;; We will replace them when len_maskload/len_maskstore is supported in loop vectorizer.
-(define_expand "len_load_<mode>"
+(define_expand "len_maskload<mode><vm>"
   [(match_operand:V 0 "register_operand")
    (match_operand:V 1 "memory_operand")
-   (match_operand 2 "vector_length_operand")
-   (match_operand 3 "const_0_operand")]
+   (match_operand 2 "autovec_length_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "const_0_operand")]
   "TARGET_VECTOR"
 {
-  riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
-  				    riscv_vector::RVV_UNOP, operands, operands[2]);
+  riscv_vector::expand_load_store (operands, true);
   DONE;
 })
 
-(define_expand "len_store_<mode>"
+(define_expand "len_maskstore<mode><vm>"
   [(match_operand:V 0 "memory_operand")
    (match_operand:V 1 "register_operand")
-   (match_operand 2 "vector_length_operand")
-   (match_operand 3 "const_0_operand")]
+   (match_operand 2 "autovec_length_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "const_0_operand")]
   "TARGET_VECTOR"
 {
-  riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
-  				    riscv_vector::RVV_UNOP, operands, operands[2]);
+  riscv_vector::expand_load_store (operands, false);
   DONE;
 })
 
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 04ca6ceabc7..eb975eaf994 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -276,6 +276,13 @@
   (ior (match_operand 0 "pmode_register_operand")
        (match_operand 0 "const_csr_operand")))
 
+(define_special_predicate "autovec_length_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+       (ior (match_operand 0 "const_csr_operand")
+            (match_test "rtx_equal_p (op, gen_int_mode
+                         (GET_MODE_NUNITS (GET_MODE (op)),
+                                           Pmode))"))))
+
 (define_predicate "reg_or_mem_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "memory_operand")))
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 6d607dc61d1..f686edab3d1 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -143,6 +143,7 @@ enum insn_type
   RVV_CMP_OP = 4,
   RVV_CMP_MU_OP = RVV_CMP_OP + 2, /* +2 means mask and maskoff operand.  */
   RVV_UNOP_MU = RVV_UNOP + 2,	  /* Likewise.  */
+  RVV_UNOP_M = RVV_UNOP + 2,	  /* Likewise.  */
   RVV_TERNOP = 5,
   RVV_WIDEN_TERNOP = 4,
   RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md.  */
@@ -254,6 +255,7 @@ void expand_vec_init (rtx, rtx);
 void expand_vcond (rtx *);
 void expand_vec_perm (rtx, rtx, rtx, rtx);
 void expand_select_vl (rtx *);
+void expand_load_store (rtx *, bool);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 52b9c202ec4..5518394be1e 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -864,6 +864,43 @@ emit_vlmax_cmp_mu_insn (unsigned icode, rtx *ops)
   e.emit_insn ((enum insn_code) icode, ops);
 }
 
+/* This function emits a masked instruction.  */
+static void
+emit_vlmax_masked_insn (unsigned icode, int op_num, rtx *ops)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ true,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ true, dest_mode,
+					  mask_mode);
+  e.set_policy (TAIL_ANY);
+  e.set_policy (MASK_ANY);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a masked instruction.  */
+static void
+emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ true,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ false, dest_mode,
+					  mask_mode);
+  e.set_policy (TAIL_ANY);
+  e.set_policy (MASK_ANY);
+  e.set_vl (avl);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
 /* This function emits a masked instruction.  */
 void
 emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -2746,4 +2783,45 @@ expand_select_vl (rtx *ops)
   emit_insn (gen_no_side_effects_vsetvl_rtx (rvv_mode, ops[0], ops[1]));
 }
 
+/* Expand LEN_MASK_{LOAD,STORE}.  */
+void
+expand_load_store (rtx *ops, bool is_load)
+{
+  poly_int64 value;
+  rtx len = ops[2];
+  rtx mask = ops[3];
+  machine_mode mode = GET_MODE (ops[0]);
+
+  if (poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode)))
+    {
+      /* If the length operand is equal to VF, it is VLMAX load/store.  */
+      if (is_load)
+	{
+	  rtx m_ops[] = {ops[0], mask, RVV_VUNDEF (mode), ops[1]};
+	  emit_vlmax_masked_insn (code_for_pred_mov (mode), RVV_UNOP_M, m_ops);
+	}
+      else
+	{
+	  len = gen_reg_rtx (Pmode);
+	  emit_vlmax_vsetvl (mode, len);
+	  emit_insn (gen_pred_store (mode, ops[0], mask, ops[1], len,
+				     get_avl_type_rtx (VLMAX)));
+	}
+    }
+  else
+    {
+      if (!satisfies_constraint_K (len))
+	len = force_reg (Pmode, len);
+      if (is_load)
+	{
+	  rtx m_ops[] = {ops[0], mask, RVV_VUNDEF (mode), ops[1]};
+	  emit_nonvlmax_masked_insn (code_for_pred_mov (mode), RVV_UNOP_M,
+				     m_ops, len);
+	}
+      else
+	emit_insn (gen_pred_store (mode, ops[0], mask, ops[1], len,
+				   get_avl_type_rtx (NONVLMAX)));
+    }
+}
+
 } // namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 9e6dae98a6d..466e36d50b7 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3636,6 +3636,7 @@ function_expander::use_contiguous_store_insn (insn_code icode)
   for (int argno = arg_offset; argno < call_expr_nargs (exp); argno++)
     add_input_operand (argno);
 
+  add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
   return generate_insn (icode);
 }
 
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 858abdc684c..674e602dec6 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1063,6 +1063,7 @@
 	  (unspec:<VM>
 	    [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1")
 	     (match_operand 3 "vector_length_operand"    "   rK")
+	     (match_operand 4 "const_int_operand"        "    i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (match_operand:V 2 "register_operand"         "    vr")
@@ -1071,7 +1072,7 @@
   "vse<sew>.v\t%2,%0%p1"
   [(set_attr "type" "vste")
    (set_attr "mode" "<MODE>")
-   (set (attr "avl_type") (symbol_ref "riscv_vector::NONVLMAX"))
+   (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))
    (set_attr "vl_op_idx" "3")])
 
 ;; vlm.v/vsm.v/vmclr.m/vmset.m.
@@ -1113,6 +1114,7 @@
 	  (unspec:VB
 	    [(match_operand:VB 1 "vector_all_trues_mask_operand" "Wc1")
 	     (match_operand 3 "vector_length_operand"            " rK")
+	     (match_operand 4 "const_int_operand"                "  i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (match_operand:VB 2 "register_operand"                 " vr")
@@ -1121,7 +1123,7 @@
   "vsm.v\t%2,%0"
   [(set_attr "type" "vstm")
    (set_attr "mode" "<MODE>")
-   (set (attr "avl_type") (symbol_ref "riscv_vector::NONVLMAX"))
+   (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))
    (set_attr "vl_op_idx" "3")])
 
 (define_insn "@pred_merge<mode>"
@@ -1433,6 +1435,7 @@
 	  (unspec:<VM>
 	    [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1")
 	     (match_operand 4 "vector_length_operand"    "   rK")
+	     (match_operand 5 "const_int_operand"        "    i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (unspec:V
@@ -1442,7 +1445,8 @@
   "TARGET_VECTOR"
   "vsse<sew>.v\t%3,%0,%z2%p1"
   [(set_attr "type" "vsts")
-   (set_attr "mode" "<MODE>")])
+   (set_attr "mode" "<MODE>")
+   (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))])
 
 ;; -------------------------------------------------------------------------------
 ;; ---- Predicated indexed loads/stores
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c
new file mode 100644
index 00000000000..24490dc6bc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfhmin -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -fdump-tree-vect-details" } */
+
+#include "single_rgroup-2.h"
+
+TEST_ALL (test_1)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h
new file mode 100644
index 00000000000..a94f3eb0f06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h
@@ -0,0 +1,44 @@
+#include <assert.h>
+#include <stdint-gcc.h>
+
+#define N 777
+
+#define test_1(TYPE)                                                           \
+  TYPE a_##TYPE[N] = {0};                                                      \
+  TYPE b_##TYPE[N] = {0};                                                      \
+  void __attribute__ ((noinline, noclone))                                     \
+  test_1_##TYPE (int *__restrict cond)                                         \
+  {                                                                            \
+    unsigned int i = 0;                                                        \
+    for (i = 0; i < 8; i++)                                                    \
+      if (cond[i])                                                             \
+	b_##TYPE[i] = a_##TYPE[i];                                             \
+  }
+
+#define run_1(TYPE)                                                            \
+  int cond_##TYPE[N] = {0};                                                    \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 33 + 1 + 109;                                        \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    cond_##TYPE[i] = i & 1;                                                    \
+  test_1_##TYPE (cond_##TYPE);                                                 \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond_##TYPE[i] && i < 8)                                             \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t)                                                                   \
+  T (uint8_t)                                                                  \
+  T (int16_t)                                                                  \
+  T (uint16_t)                                                                 \
+  T (int32_t)                                                                  \
+  T (uint32_t)                                                                 \
+  T (int64_t)                                                                  \
+  T (uint64_t)                                                                 \
+  T (_Float16)                                                                 \
+  T (float)                                                                    \
+  T (double)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c
new file mode 100644
index 00000000000..9cbae13de06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfhmin -mabi=ilp32d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include "single_rgroup-3.h"
+
+TEST_ALL (test_1)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h
new file mode 100644
index 00000000000..e60e0b1ae33
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h
@@ -0,0 +1,149 @@
+#include <assert.h>
+#include <stdint-gcc.h>
+
+#define N 777
+
+int cond[N] = {0};
+#define test_1(TYPE)                                                           \
+  TYPE a_##TYPE[N];                                                            \
+  TYPE b_##TYPE[N];                                                            \
+  void __attribute__ ((noinline, noclone)) test_1_##TYPE (unsigned int n)      \
+  {                                                                            \
+    unsigned int i = 0;                                                        \
+    for (i = 0; i < n; i++)                                                    \
+      if (cond[i])                                                             \
+	b_##TYPE[i] = a_##TYPE[i];                                             \
+  }
+
+#define run_1(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 33 + 1 + 109;                                        \
+  test_1_##TYPE (5);                                                           \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 5)                                                    \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_2(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 57 + 1 + 999;                                        \
+  test_1_##TYPE (17);                                                          \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 17)                                                   \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_3(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 77 + 1 + 3;                                          \
+  test_1_##TYPE (32);                                                          \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 32)                                                   \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_4(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 45 + 1 + 11;                                         \
+  test_1_##TYPE (128);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 128)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_5(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 199 + 1 + 79;                                        \
+  test_1_##TYPE (177);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 177)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_6(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 377 + 1 + 73;                                        \
+  test_1_##TYPE (255);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 255)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_7(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 98 + 1 + 66;                                         \
+  test_1_##TYPE (333);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 333)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_8(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 7 + 1 * 7;                                           \
+  test_1_##TYPE (512);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 512)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_9(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 + 1 + 88;                                              \
+  test_1_##TYPE (637);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 637)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_10(TYPE)                                                           \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 331 + 1 + 547;                                       \
+  test_1_##TYPE (777);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 777)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t)                                                                   \
+  T (uint8_t)                                                                  \
+  T (int16_t)                                                                  \
+  T (uint16_t)                                                                 \
+  T (int32_t)                                                                  \
+  T (uint32_t)                                                                 \
+  T (int64_t)                                                                  \
+  T (uint64_t)                                                                 \
+  T (_Float16)                                                                 \
+  T (float)                                                                    \
+  T (double)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c
new file mode 100644
index 00000000000..8767efe2382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c
@@ -0,0 +1,10 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=fixed-vlmax" } */
+
+#include "single_rgroup-2.c"
+
+int main (void)
+{
+  TEST_ALL (run_1)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c
new file mode 100644
index 00000000000..9ff6e928697
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c
@@ -0,0 +1,22 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable" } */
+
+#include "single_rgroup-3.c"
+
+int
+main (void)
+{
+  for (int i = 0; i < N; i++)
+    cond[i] = i & 1;
+  TEST_ALL (run_1)
+  TEST_ALL (run_2)
+  TEST_ALL (run_3)
+  TEST_ALL (run_4)
+  TEST_ALL (run_5)
+  TEST_ALL (run_6)
+  TEST_ALL (run_7)
+  TEST_ALL (run_8)
+  TEST_ALL (run_9)
+  TEST_ALL (run_10)
+  return 0;
+}
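
The way expand_load_store in the riscv-v.cc hunk above chooses between its branches can be summarized with a small C sketch. This is a toy model rather than GCC code. `len_form` and `classify_len` are hypothetical names, `nunits` stands in for GET_MODE_NUNITS (mode) and is treated as a plain integer (the real value can be a poly_int), and the 0..31 range is an assumption about what constraint K accepts in the RISC-V backend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three shapes the length operand can take.  */
enum len_form
{
  LEN_FORM_VLMAX,   /* Constant equal to the mode's element count.  */
  LEN_FORM_IMM5,    /* Constant assumed to satisfy constraint K (0..31).  */
  LEN_FORM_REGISTER /* Anything else: the value is kept in a register.  */
};

/* Toy model of the dispatch in expand_load_store.  A length equal to
   the number of units selects the VLMAX path; otherwise a small
   constant stays an immediate and everything else is forced into a
   register before the non-VLMAX pattern is emitted.  */
static enum len_form
classify_len (bool is_constant, int64_t value, int64_t nunits)
{
  if (is_constant && value == nunits)
    return LEN_FORM_VLMAX;
  if (is_constant && value >= 0 && value <= 31)
    return LEN_FORM_IMM5;
  return LEN_FORM_REGISTER;
}
```

For example, with `nunits` equal to 8, a constant length of 8 maps to the VLMAX form, a constant 5 stays an immediate, and a constant 40 or a runtime value ends up in a register.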


* [gcc(refs/vendors/riscv/heads/gcc-13-with-riscv-opts)] RISC-V: Enable len_mask{load, store} and remove len_{load, store}
@ 2023-07-14  2:49 Jeff Law
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Law @ 2023-07-14  2:49 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:6bfd64d62698d3c4473f77bcbfa67e45ef8d61d0

commit 6bfd64d62698d3c4473f77bcbfa67e45ef8d61d0
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Sun Jun 25 16:39:52 2023 +0800

    RISC-V: Enable len_mask{load, store} and remove len_{load, store}
    
    This patch enables len_mask_{load,store} to support control flow in RVV auto-vectorization.
    
    Consider the following case:
    void
    f (int32_t *__restrict a,
       int32_t *__restrict b,
       int32_t *__restrict cond,
       int n)
    {
      for (int i = 0; i < n; i++)
        if (cond[i])
          a[i] = b[i];
    }
    
    Before this patch:
    <source>:9:21: missed: couldn't vectorize loop
    <source>:9:21: missed: not vectorized: control flow in loop.
    
    After this patch:
    f:
            ble     a3,zero,.L5
    .L3:
            vsetvli a5,a3,e32,m1,ta,ma
            vle32.v v0,0(a2)
            vsetvli a6,zero,e32,m1,ta,ma
            slli    a4,a5,2
            vmsne.vi        v0,v0,0
            sub     a3,a3,a5
            vsetvli zero,a5,e32,m1,ta,ma
            vle32.v v1,0(a1),v0.t
            vse32.v v1,0(a0),v0.t
            add     a2,a2,a4
            add     a1,a1,a4
            add     a0,a0,a4
            bne     a3,zero,.L3
    .L5:
            ret
    
    gcc/ChangeLog:
    
            * config/riscv/autovec.md (len_load_<mode>): Remove.
            (len_maskload<mode><vm>): New pattern.
            (len_store_<mode>): Remove.
            (len_maskstore<mode><vm>): New pattern.
            * config/riscv/predicates.md (autovec_length_operand): New predicate.
            * config/riscv/riscv-protos.h (enum insn_type): Add RVV_UNOP_M.
            (expand_load_store): New function.
            * config/riscv/riscv-v.cc (emit_vlmax_masked_insn): Ditto.
            (emit_nonvlmax_masked_insn): Ditto.
            (expand_load_store): Ditto.
            * config/riscv/riscv-vector-builtins.cc
            (function_expander::use_contiguous_store_insn): Add avl_type operand
            into pred_store.
            * config/riscv/vector.md: Ditto.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c: New test.
            * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c: New test.

Diff:
---
 gcc/config/riscv/autovec.md                        |  22 ++-
 gcc/config/riscv/predicates.md                     |   7 +
 gcc/config/riscv/riscv-protos.h                    |   2 +
 gcc/config/riscv/riscv-v.cc                        |  78 +++++++++++
 gcc/config/riscv/riscv-vector-builtins.cc          |   1 +
 gcc/config/riscv/vector.md                         |  10 +-
 .../riscv/rvv/autovec/partial/single_rgroup-2.c    |   8 ++
 .../riscv/rvv/autovec/partial/single_rgroup-2.h    |  44 ++++++
 .../riscv/rvv/autovec/partial/single_rgroup-3.c    |   8 ++
 .../riscv/rvv/autovec/partial/single_rgroup-3.h    | 149 +++++++++++++++++++++
 .../rvv/autovec/partial/single_rgroup_run-2.c      |  10 ++
 .../rvv/autovec/partial/single_rgroup_run-3.c      |  22 +++
 12 files changed, 346 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 731ffe8ff89..5de43a8d647 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -22,29 +22,27 @@
 ;; == Loads/Stores
 ;; =========================================================================
 
-;; len_load/len_store is a sub-optimal pattern for RVV auto-vectorization support.
-;; We will replace them when len_maskload/len_maskstore is supported in loop vectorizer.
-(define_expand "len_load_<mode>"
+(define_expand "len_maskload<mode><vm>"
   [(match_operand:V 0 "register_operand")
    (match_operand:V 1 "memory_operand")
-   (match_operand 2 "vector_length_operand")
-   (match_operand 3 "const_0_operand")]
+   (match_operand 2 "autovec_length_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "const_0_operand")]
   "TARGET_VECTOR"
 {
-  riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
-  				    riscv_vector::RVV_UNOP, operands, operands[2]);
+  riscv_vector::expand_load_store (operands, true);
   DONE;
 })
 
-(define_expand "len_store_<mode>"
+(define_expand "len_maskstore<mode><vm>"
   [(match_operand:V 0 "memory_operand")
    (match_operand:V 1 "register_operand")
-   (match_operand 2 "vector_length_operand")
-   (match_operand 3 "const_0_operand")]
+   (match_operand 2 "autovec_length_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "const_0_operand")]
   "TARGET_VECTOR"
 {
-  riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
-  				    riscv_vector::RVV_UNOP, operands, operands[2]);
+  riscv_vector::expand_load_store (operands, false);
   DONE;
 })
 
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 04ca6ceabc7..eb975eaf994 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -276,6 +276,13 @@
   (ior (match_operand 0 "pmode_register_operand")
        (match_operand 0 "const_csr_operand")))
 
+(define_special_predicate "autovec_length_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+       (ior (match_operand 0 "const_csr_operand")
+            (match_test "rtx_equal_p (op, gen_int_mode
+                         (GET_MODE_NUNITS (GET_MODE (op)),
+                                           Pmode))"))))
+
 (define_predicate "reg_or_mem_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "memory_operand")))
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 6d607dc61d1..f686edab3d1 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -143,6 +143,7 @@ enum insn_type
   RVV_CMP_OP = 4,
   RVV_CMP_MU_OP = RVV_CMP_OP + 2, /* +2 means mask and maskoff operand.  */
   RVV_UNOP_MU = RVV_UNOP + 2,	  /* Likewise.  */
+  RVV_UNOP_M = RVV_UNOP + 2,	  /* Likewise.  */
   RVV_TERNOP = 5,
   RVV_WIDEN_TERNOP = 4,
   RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md.  */
@@ -254,6 +255,7 @@ void expand_vec_init (rtx, rtx);
 void expand_vcond (rtx *);
 void expand_vec_perm (rtx, rtx, rtx, rtx);
 void expand_select_vl (rtx *);
+void expand_load_store (rtx *, bool);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 52b9c202ec4..5518394be1e 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -864,6 +864,43 @@ emit_vlmax_cmp_mu_insn (unsigned icode, rtx *ops)
   e.emit_insn ((enum insn_code) icode, ops);
 }
 
+/* This function emits a masked instruction.  */
+static void
+emit_vlmax_masked_insn (unsigned icode, int op_num, rtx *ops)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ true,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ true, dest_mode,
+					  mask_mode);
+  e.set_policy (TAIL_ANY);
+  e.set_policy (MASK_ANY);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a masked instruction.  */
+static void
+emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ true,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ false, dest_mode,
+					  mask_mode);
+  e.set_policy (TAIL_ANY);
+  e.set_policy (MASK_ANY);
+  e.set_vl (avl);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
 /* This function emits a masked instruction.  */
 void
 emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -2746,4 +2783,45 @@ expand_select_vl (rtx *ops)
   emit_insn (gen_no_side_effects_vsetvl_rtx (rvv_mode, ops[0], ops[1]));
 }
 
+/* Expand LEN_MASK_{LOAD,STORE}.  */
+void
+expand_load_store (rtx *ops, bool is_load)
+{
+  poly_int64 value;
+  rtx len = ops[2];
+  rtx mask = ops[3];
+  machine_mode mode = GET_MODE (ops[0]);
+
+  if (poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode)))
+    {
+      /* If the length operand is equal to VF, it is VLMAX load/store.  */
+      if (is_load)
+	{
+	  rtx m_ops[] = {ops[0], mask, RVV_VUNDEF (mode), ops[1]};
+	  emit_vlmax_masked_insn (code_for_pred_mov (mode), RVV_UNOP_M, m_ops);
+	}
+      else
+	{
+	  len = gen_reg_rtx (Pmode);
+	  emit_vlmax_vsetvl (mode, len);
+	  emit_insn (gen_pred_store (mode, ops[0], mask, ops[1], len,
+				     get_avl_type_rtx (VLMAX)));
+	}
+    }
+  else
+    {
+      if (!satisfies_constraint_K (len))
+	len = force_reg (Pmode, len);
+      if (is_load)
+	{
+	  rtx m_ops[] = {ops[0], mask, RVV_VUNDEF (mode), ops[1]};
+	  emit_nonvlmax_masked_insn (code_for_pred_mov (mode), RVV_UNOP_M,
+				     m_ops, len);
+	}
+      else
+	emit_insn (gen_pred_store (mode, ops[0], mask, ops[1], len,
+				   get_avl_type_rtx (NONVLMAX)));
+    }
+}
+
 } // namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 9e6dae98a6d..466e36d50b7 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3636,6 +3636,7 @@ function_expander::use_contiguous_store_insn (insn_code icode)
   for (int argno = arg_offset; argno < call_expr_nargs (exp); argno++)
     add_input_operand (argno);
 
+  add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
   return generate_insn (icode);
 }
 
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 858abdc684c..674e602dec6 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1063,6 +1063,7 @@
 	  (unspec:<VM>
 	    [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1")
 	     (match_operand 3 "vector_length_operand"    "   rK")
+	     (match_operand 4 "const_int_operand"        "    i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (match_operand:V 2 "register_operand"         "    vr")
@@ -1071,7 +1072,7 @@
   "vse<sew>.v\t%2,%0%p1"
   [(set_attr "type" "vste")
    (set_attr "mode" "<MODE>")
-   (set (attr "avl_type") (symbol_ref "riscv_vector::NONVLMAX"))
+   (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))
    (set_attr "vl_op_idx" "3")])
 
 ;; vlm.v/vsm.v/vmclr.m/vmset.m.
@@ -1113,6 +1114,7 @@
 	  (unspec:VB
 	    [(match_operand:VB 1 "vector_all_trues_mask_operand" "Wc1")
 	     (match_operand 3 "vector_length_operand"            " rK")
+	     (match_operand 4 "const_int_operand"                "  i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (match_operand:VB 2 "register_operand"                 " vr")
@@ -1121,7 +1123,7 @@
   "vsm.v\t%2,%0"
   [(set_attr "type" "vstm")
    (set_attr "mode" "<MODE>")
-   (set (attr "avl_type") (symbol_ref "riscv_vector::NONVLMAX"))
+   (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))
    (set_attr "vl_op_idx" "3")])
 
 (define_insn "@pred_merge<mode>"
@@ -1433,6 +1435,7 @@
 	  (unspec:<VM>
 	    [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1")
 	     (match_operand 4 "vector_length_operand"    "   rK")
+	     (match_operand 5 "const_int_operand"        "    i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (unspec:V
@@ -1442,7 +1445,8 @@
   "TARGET_VECTOR"
   "vsse<sew>.v\t%3,%0,%z2%p1"
   [(set_attr "type" "vsts")
-   (set_attr "mode" "<MODE>")])
+   (set_attr "mode" "<MODE>")
+   (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))])
 
 ;; -------------------------------------------------------------------------------
 ;; ---- Predicated indexed loads/stores
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c
new file mode 100644
index 00000000000..24490dc6bc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfhmin -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -fdump-tree-vect-details" } */
+
+#include "single_rgroup-2.h"
+
+TEST_ALL (test_1)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h
new file mode 100644
index 00000000000..a94f3eb0f06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h
@@ -0,0 +1,44 @@
+#include <assert.h>
+#include <stdint-gcc.h>
+
+#define N 777
+
+#define test_1(TYPE)                                                           \
+  TYPE a_##TYPE[N] = {0};                                                      \
+  TYPE b_##TYPE[N] = {0};                                                      \
+  void __attribute__ ((noinline, noclone))                                     \
+  test_1_##TYPE (int *__restrict cond)                                         \
+  {                                                                            \
+    unsigned int i = 0;                                                        \
+    for (i = 0; i < 8; i++)                                                    \
+      if (cond[i])                                                             \
+	b_##TYPE[i] = a_##TYPE[i];                                             \
+  }
+
+#define run_1(TYPE)                                                            \
+  int cond_##TYPE[N] = {0};                                                    \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 33 + 1 + 109;                                        \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    cond_##TYPE[i] = i & 1;                                                    \
+  test_1_##TYPE (cond_##TYPE);                                                 \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond_##TYPE[i] && i < 8)                                             \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t)                                                                   \
+  T (uint8_t)                                                                  \
+  T (int16_t)                                                                  \
+  T (uint16_t)                                                                 \
+  T (int32_t)                                                                  \
+  T (uint32_t)                                                                 \
+  T (int64_t)                                                                  \
+  T (uint64_t)                                                                 \
+  T (_Float16)                                                                 \
+  T (float)                                                                    \
+  T (double)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c
new file mode 100644
index 00000000000..9cbae13de06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfhmin -mabi=ilp32d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include "single_rgroup-3.h"
+
+TEST_ALL (test_1)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h
new file mode 100644
index 00000000000..e60e0b1ae33
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h
@@ -0,0 +1,149 @@
+#include <assert.h>
+#include <stdint-gcc.h>
+
+#define N 777
+
+int cond[N] = {0};
+#define test_1(TYPE)                                                           \
+  TYPE a_##TYPE[N];                                                            \
+  TYPE b_##TYPE[N];                                                            \
+  void __attribute__ ((noinline, noclone)) test_1_##TYPE (unsigned int n)      \
+  {                                                                            \
+    unsigned int i = 0;                                                        \
+    for (i = 0; i < n; i++)                                                    \
+      if (cond[i])                                                             \
+	b_##TYPE[i] = a_##TYPE[i];                                             \
+  }
+
+#define run_1(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 33 + 1 + 109;                                        \
+  test_1_##TYPE (5);                                                           \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 5)                                                    \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_2(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 57 + 1 + 999;                                        \
+  test_1_##TYPE (17);                                                          \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 17)                                                   \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_3(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 77 + 1 + 3;                                          \
+  test_1_##TYPE (32);                                                          \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 32)                                                   \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_4(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 45 + 1 + 11;                                         \
+  test_1_##TYPE (128);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 128)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_5(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 199 + 1 + 79;                                        \
+  test_1_##TYPE (177);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 177)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_6(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 377 + 1 + 73;                                        \
+  test_1_##TYPE (255);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 255)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_7(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 98 + 1 + 66;                                         \
+  test_1_##TYPE (333);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 333)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_8(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 7 + 1 * 7;                                           \
+  test_1_##TYPE (512);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 512)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_9(TYPE)                                                            \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 + 1 + 88;                                              \
+  test_1_##TYPE (637);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 637)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define run_10(TYPE)                                                           \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    a_##TYPE[i] = i * 2 * 331 + 1 + 547;                                       \
+  test_1_##TYPE (777);                                                         \
+  for (unsigned int i = 0; i < N; i++)                                         \
+    {                                                                          \
+      if (cond[i] && i < 777)                                                  \
+	assert (b_##TYPE[i] == a_##TYPE[i]);                                   \
+      else                                                                     \
+	assert (b_##TYPE[i] == 0);                                             \
+    }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t)                                                                   \
+  T (uint8_t)                                                                  \
+  T (int16_t)                                                                  \
+  T (uint16_t)                                                                 \
+  T (int32_t)                                                                  \
+  T (uint32_t)                                                                 \
+  T (int64_t)                                                                  \
+  T (uint64_t)                                                                 \
+  T (_Float16)                                                                 \
+  T (float)                                                                    \
+  T (double)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c
new file mode 100644
index 00000000000..8767efe2382
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c
@@ -0,0 +1,10 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=fixed-vlmax" } */
+
+#include "single_rgroup-2.c"
+
+int main (void)
+{
+  TEST_ALL (run_1)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c
new file mode 100644
index 00000000000..9ff6e928697
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c
@@ -0,0 +1,22 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable" } */
+
+#include "single_rgroup-3.c"
+
+int
+main (void)
+{
+  for (int i = 0; i < N; i++)
+    cond[i] = i & 1;
+  TEST_ALL (run_1)
+  TEST_ALL (run_2)
+  TEST_ALL (run_3)
+  TEST_ALL (run_4)
+  TEST_ALL (run_5)
+  TEST_ALL (run_6)
+  TEST_ALL (run_7)
+  TEST_ALL (run_8)
+  TEST_ALL (run_9)
+  TEST_ALL (run_10)
+  return 0;
+}

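The length handling in expand_load_store above can be summarised with a small host-side sketch in C. It is illustrative only: the enum and function names are hypothetical and are not GCC identifiers, the lane count is treated as a plain integer rather than a poly_int, and the [0, 31] range assumes constraint K is the 5-bit unsigned immediate used for CSR-style operands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three ways the length can reach the
   instruction; they are not GCC identifiers.  */
enum model_avl_kind
{
  MODEL_AVL_VLMAX,      /* length equals the number of lanes: VLMAX form */
  MODEL_AVL_IMMEDIATE,  /* constant in [0, 31]: usable as an immediate */
  MODEL_AVL_REGISTER    /* anything else: forced into a register */
};

/* Simplified model of the decision made on the length operand.
   LEN is only meaningful when LEN_IS_CONSTANT is true.  */
static enum model_avl_kind
model_classify_length (bool len_is_constant, int64_t len, int64_t nunits)
{
  if (len_is_constant && len == nunits)
    return MODEL_AVL_VLMAX;
  if (len_is_constant && len >= 0 && len <= 31)
    return MODEL_AVL_IMMEDIATE;
  return MODEL_AVL_REGISTER;
}
```

For example, model_classify_length (true, 8, 8) selects the VLMAX form, while a length held in a register (len_is_constant false) always takes the non-VLMAX register path.
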