public inbox for gcc-patches@gcc.gnu.org
* [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
@ 2023-08-22  5:41 Lehua Ding
From: Lehua Ding @ 2023-08-22  5:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: juzhe.zhong, kito.cheng, rdapp.gcc, palmer, jeffreyalaw

Hi,

This patch adds conditional unary neg/abs/not autovec patterns to the RISC-V backend.
Consider this C code:

void
test_3 (float *__restrict a, float *__restrict b, int *__restrict pred, int n)
{
  for (int i = 0; i < n; i += 1)
    {
      a[i] = pred[i] ? __builtin_fabsf (b[i]) : a[i];
    }
}

Before this patch:
        ...
        vsetvli a7,zero,e32,m1,ta,ma
        vfabs.v v2,v2
        vmerge.vvm      v1,v1,v2,v0
        ...

After this patch:
        ...
        vsetvli	a7,zero,e32,m1,ta,mu
        vfabs.v	v1,v2,v0.t
        ...
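
To make the before/after difference concrete, here is a minimal scalar
sketch in C of what the two sequences compute per element. The function
names model_abs_then_merge and model_masked_abs are hypothetical and are
not part of the patch. The sketch assumes that the `mu` (mask-undisturbed)
policy leaves inactive destination elements unchanged, which is why the
separate vmerge.vvm is no longer needed.

```c
#include <stddef.h>

/* Model of the "before" sequence: an unmasked vfabs.v over every element,
   followed by a vmerge.vvm that selects per element using the mask.  */
void
model_abs_then_merge (float *a, const float *b, const int *pred, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      float tmp = __builtin_fabsf (b[i]); /* vfabs.v: all elements.  */
      a[i] = pred[i] ? tmp : a[i];        /* vmerge.vvm: select by mask.  */
    }
}

/* Model of the "after" sequence: a masked vfabs.v (v0.t).  Elements whose
   mask bit is clear keep their old destination value (assumed `mu`).  */
void
model_masked_abs (float *a, const float *b, const int *pred, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (pred[i])
      a[i] = __builtin_fabsf (b[i]);
}
```

Both functions leave a[] in the same state for any input, which is the
equivalence the fused form relies on.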

For the int neg/abs/not and FP neg patterns, defining the corresponding
cond_xxx patterns is enough.
For the FP abs pattern, we need to change the definitions of the `abs<mode>2`
and `@vcond_mask_<mode><vm>` patterns from define_expand to
define_insn_and_split so that the combine pass can fuse them into a new
pattern, `*cond_abs<mode>`.
Because this also changes the neg pattern (neg and abs share
`<optab><mode>2`), a vlmax copysign + neg fusion pattern needs to be added.
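
As an illustration of the last point, a loop of roughly the following shape
is the kind of source that would be expected to produce a neg applied to a
copysign result. This is a sketch only: the function name neg_copysign is
hypothetical, and whether the vectorizer and combine actually match
`*copysign<mode>_neg` for it depends on the options used and has not been
checked here.

```c
#include <stddef.h>

/* r[i] gets the magnitude of a[i] with the opposite of b[i]'s sign,
   i.e. the negation of copysign (a[i], b[i]).  */
void
neg_copysign (float *__restrict r, const float *__restrict a,
	      const float *__restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = -__builtin_copysignf (a[i], b[i]);
}
```

Without a fusion pattern, this would presumably need a copysign followed by
a separate negate; the new pattern hands both to pred_ncopysign instead.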

The fusion process looks like the following:

(insn 30 29 31 4 (set (reg:RVVM1SF 152 [ vect_iftmp.15 ])
        (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))) "float.c":15:56 discrim 1 12799 {absrvvm1sf2}
     (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
        (nil)))

(insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
        (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
            (reg:RVVM1SF 152 [ vect_iftmp.15 ])
            (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 12707 {vcond_mask_rvvm1sfrvvmf32bi}
     (expr_list:REG_DEAD (reg:RVVM1SF 152 [ vect_iftmp.15 ])
        (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
            (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
                (nil)))))
==>

(insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
        (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
            (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))
            (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 13444 {*cond_absrvvm1sf}
     (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
        (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
            (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
                (nil)))))

Best,
Lehua

gcc/ChangeLog:

	* config/riscv/autovec-opt.md (*cond_abs<mode>): New combine pattern.
	(*copysign<mode>_neg): Ditto.
	* config/riscv/autovec.md (@vcond_mask_<mode><vm>): Adjust.
	(<optab><mode>2): Ditto.
	(cond_<optab><mode>): New.
	(cond_len_<optab><mode>): Ditto.
	* config/riscv/riscv-protos.h (enum insn_type): New.
	(expand_cond_len_unop): New helper func.
	* config/riscv/riscv-v.cc (shuffle_merge_patterns): Adjust.
	(expand_cond_len_unop): New helper func.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c: New test.

---
 gcc/config/riscv/autovec-opt.md               | 39 ++++++++
 gcc/config/riscv/autovec.md                   | 97 +++++++++++++++++--
 gcc/config/riscv/riscv-protos.h               |  7 +-
 gcc/config/riscv/riscv-v.cc                   | 56 ++++++++++-
 .../riscv/rvv/autovec/cond/cond_unary_1.c     | 43 ++++++++
 .../riscv/rvv/autovec/cond/cond_unary_1_run.c | 27 ++++++
 .../riscv/rvv/autovec/cond/cond_unary_2.c     | 47 +++++++++
 .../riscv/rvv/autovec/cond/cond_unary_2_run.c | 28 ++++++
 .../riscv/rvv/autovec/cond/cond_unary_3.c     | 43 ++++++++
 .../riscv/rvv/autovec/cond/cond_unary_3_run.c | 27 ++++++
 .../riscv/rvv/autovec/cond/cond_unary_4.c     | 43 ++++++++
 .../riscv/rvv/autovec/cond/cond_unary_4_run.c | 27 ++++++
 .../riscv/rvv/autovec/cond/cond_unary_5.c     | 37 +++++++
 .../riscv/rvv/autovec/cond/cond_unary_5_run.c | 26 +++++
 .../riscv/rvv/autovec/cond/cond_unary_6.c     | 41 ++++++++
 .../riscv/rvv/autovec/cond/cond_unary_6_run.c | 27 ++++++
 .../riscv/rvv/autovec/cond/cond_unary_7.c     | 37 +++++++
 .../riscv/rvv/autovec/cond/cond_unary_7_run.c | 26 +++++
 .../riscv/rvv/autovec/cond/cond_unary_8.c     | 37 +++++++
 .../riscv/rvv/autovec/cond/cond_unary_8_run.c | 28 ++++++
 20 files changed, 730 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 99b609a99d9..8247eb87ddb 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -684,3 +684,42 @@
   }
   [(set_attr "type" "vfwmuladd")
    (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; Combine <op> and vcond_mask generated by midend into cond_len_<op>
+;; Currently supported operations:
+;;   abs(FP)
+(define_insn_and_split "*cond_abs<mode>"
+  [(set (match_operand:VF 0 "register_operand")
+        (if_then_else:VF
+          (match_operand:<VM> 3 "register_operand")
+          (abs:VF (match_operand:VF 1 "nonmemory_operand"))
+          (match_operand:VF 2 "register_operand")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  emit_insn (gen_cond_len_abs<mode> (operands[0], operands[3], operands[1],
+				     operands[2],
+				     gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode),
+				     const0_rtx));
+  DONE;
+})
+
+;; Combine vlmax neg and UNSPEC_VCOPYSIGN
+(define_insn_and_split "*copysign<mode>_neg"
+  [(set (match_operand:VF 0 "register_operand")
+        (neg:VF
+          (unspec:VF [
+            (match_operand:VF 1 "register_operand")
+            (match_operand:VF 2 "register_operand")
+          ] UNSPEC_VCOPYSIGN)))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  riscv_vector::emit_vlmax_insn (code_for_pred_ncopysign (<MODE>mode),
+                                 riscv_vector::RVV_BINOP, operands);
+  DONE;
+})
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index acca4c22b90..e1addc07036 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -552,12 +552,16 @@
 ;; - vfmerge.vf
 ;; -------------------------------------------------------------------------
 
-(define_expand "@vcond_mask_<mode><vm>"
-  [(match_operand:V 0 "register_operand")
-   (match_operand:<VM> 3 "register_operand")
-   (match_operand:V 1 "nonmemory_operand")
-   (match_operand:V 2 "register_operand")]
-  "TARGET_VECTOR"
+(define_insn_and_split "@vcond_mask_<mode><vm>"
+  [(set (match_operand:V 0 "register_operand")
+        (if_then_else:V
+          (match_operand:<VM> 3 "register_operand")
+          (match_operand:V 1 "nonmemory_operand")
+          (match_operand:V 2 "register_operand")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
   {
     /* The order of vcond_mask is opposite to pred_merge.  */
     std::swap (operands[1], operands[2]);
@@ -979,11 +983,14 @@
 ;; Includes:
 ;; - vfneg.v/vfabs.v
 ;; -------------------------------------------------------------------------------
-(define_expand "<optab><mode>2"
+(define_insn_and_split "<optab><mode>2"
   [(set (match_operand:VF 0 "register_operand")
     (any_float_unop_nofrm:VF
      (match_operand:VF 1 "register_operand")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
 {
   insn_code icode = code_for_pred (<CODE>, <MODE>mode);
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
@@ -1499,6 +1506,80 @@
   DONE;
 })
 
+;; -------------------------------------------------------------------------
+;; ---- [INT] Conditional unary operations
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vneg/vnot
+;; -------------------------------------------------------------------------
+
+(define_expand "cond_<optab><mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:<VM> 1 "vector_mask_operand")
+   (any_int_unop:VI
+     (match_operand:VI 2 "register_operand"))
+   (match_operand:VI 3 "register_operand")]
+  "TARGET_VECTOR"
+{
+  /* Normalize into cond_len_* operations.  */
+  emit_insn (gen_cond_len_<optab><mode> (operands[0], operands[1], operands[2],
+					 operands[3],
+					 gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode),
+					 const0_rtx));
+  DONE;
+})
+
+(define_expand "cond_len_<optab><mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:<VM> 1 "vector_mask_operand")
+   (any_int_unop:VI
+     (match_operand:VI 2 "register_operand"))
+   (match_operand:VI 3 "register_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_cond_len_unop (<CODE>, operands);
+  DONE;
+})
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] Conditional unary operations
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vfneg/vfabs
+;; -------------------------------------------------------------------------
+
+(define_expand "cond_<optab><mode>"
+  [(match_operand:VF 0 "register_operand")
+   (match_operand:<VM> 1 "vector_mask_operand")
+   (any_float_unop_nofrm:VF
+     (match_operand:VF 2 "register_operand"))
+   (match_operand:VF 3 "register_operand")]
+  "TARGET_VECTOR"
+{
+  /* Normalize into cond_len_* operations.  */
+  emit_insn (gen_cond_len_<optab><mode> (operands[0], operands[1], operands[2],
+					 operands[3],
+					 gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode),
+					 const0_rtx));
+  DONE;
+})
+
+(define_expand "cond_len_<optab><mode>"
+  [(match_operand:VF 0 "register_operand")
+   (match_operand:<VM> 1 "vector_mask_operand")
+   (any_float_unop_nofrm:VF
+     (match_operand:VF 2 "register_operand"))
+   (match_operand:VF 3 "register_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_cond_len_unop (<CODE>, operands);
+  DONE;
+})
+
 ;; -------------------------------------------------------------------------
 ;; ---- [INT] Conditional binary operations
 ;; -------------------------------------------------------------------------
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 472c00dc439..2c4405c9860 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -184,6 +184,10 @@ enum insn_type
 {
   RVV_MISC_OP = 1,
   RVV_UNOP = 2,
+  RVV_UNOP_M = RVV_UNOP + 2,
+  RVV_UNOP_MU = RVV_UNOP + 2,
+  RVV_UNOP_TU = RVV_UNOP + 2,
+  RVV_UNOP_TUMU = RVV_UNOP + 2,
   RVV_BINOP = 3,
   RVV_BINOP_MU = RVV_BINOP + 2,
   RVV_BINOP_TU = RVV_BINOP + 2,
@@ -191,8 +195,6 @@ enum insn_type
   RVV_MERGE_OP = 4,
   RVV_CMP_OP = 4,
   RVV_CMP_MU_OP = RVV_CMP_OP + 2, /* +2 means mask and maskoff operand.  */
-  RVV_UNOP_MU = RVV_UNOP + 2,	  /* Likewise.  */
-  RVV_UNOP_M = RVV_UNOP + 2,	  /* Likewise.  */
   RVV_TERNOP = 5,
   RVV_TERNOP_MU = RVV_TERNOP + 1,
   RVV_TERNOP_TU = RVV_TERNOP + 1,
@@ -294,6 +296,7 @@ bool neg_simm5_p (rtx);
 bool has_vi_variant_p (rtx_code, rtx);
 void expand_vec_cmp (rtx, rtx_code, rtx, rtx);
 bool expand_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool);
+void expand_cond_len_unop (rtx_code, rtx *);
 void expand_cond_len_binop (rtx_code, rtx *);
 void expand_reduction (rtx_code, rtx *, rtx,
 		       reduction_type = reduction_type::UNORDERED);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b01028c6201..14eda581d00 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -2992,7 +2992,9 @@ shuffle_merge_patterns (struct expand_vec_perm_d *d)
   emit_vlmax_cmp_insn (icode, ops);
 
   /* TARGET = MASK ? OP0 : OP1.  */
-  emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op0, d->op1, mask));
+  /* Swap op0 and op1 since the order is opposite to pred_merge.  */
+  rtx ops2[] = {d->target, d->op1, d->op0, mask};
+  emit_vlmax_merge_insn (code_for_pred_merge (vmode), riscv_vector::RVV_MERGE_OP, ops2);
   return true;
 }
 
@@ -3384,10 +3386,58 @@ needs_fp_rounding (rtx_code code, machine_mode mode)
 {
   if (!FLOAT_MODE_P (mode))
     return false;
-  return code != SMIN && code != SMAX;
+  return code != SMIN && code != SMAX && code != NEG && code != ABS;
 }
 
-/* Expand COND_LEN_*.  */
+/* Expand unary ops COND_LEN_*.  */
+void
+expand_cond_len_unop (rtx_code code, rtx *ops)
+{
+  rtx dest = ops[0];
+  rtx mask = ops[1];
+  rtx src = ops[2];
+  rtx merge = ops[3];
+  rtx len = ops[4];
+  machine_mode mode = GET_MODE (dest);
+  machine_mode mask_mode = GET_MODE (mask);
+
+  poly_int64 value;
+  bool is_dummy_mask = rtx_equal_p (mask, CONSTM1_RTX (mask_mode));
+  bool is_vlmax_len
+    = poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode));
+  rtx cond_ops[] = {dest, mask, merge, src};
+  insn_code icode = code_for_pred (code, mode);
+
+  if (is_dummy_mask)
+    {
+      /* Use TU, MASK ANY policy.  */
+      if (needs_fp_rounding (code, mode))
+	emit_nonvlmax_fp_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
+      else
+	emit_nonvlmax_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
+    }
+  else
+    {
+      if (is_vlmax_len)
+	{
+	  /* Use TAIL ANY, MU policy.  */
+	  if (needs_fp_rounding (code, mode))
+	    emit_vlmax_masked_fp_mu_insn (icode, RVV_UNOP_MU, cond_ops);
+	  else
+	    emit_vlmax_masked_mu_insn (icode, RVV_UNOP_MU, cond_ops);
+	}
+      else
+	{
+	  /* Use TU, MU policy.  */
+	  if (needs_fp_rounding (code, mode))
+	    emit_nonvlmax_fp_tumu_insn (icode, RVV_UNOP_TUMU, cond_ops, len);
+	  else
+	    emit_nonvlmax_tumu_insn (icode, RVV_UNOP_TUMU, cond_ops, len);
+	}
+    }
+}
+
+/* Expand binary ops COND_LEN_*.  */
 void
 expand_cond_len_binop (rtx_code code, rtx *ops)
 {
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c
new file mode 100644
index 00000000000..e8bf41a55a1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE, OP)					\
+  void __attribute__ ((noipa))					\
+  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,	\
+		      TYPE *__restrict pred, int n)		\
+  {								\
+    for (int i = 0; i < n; ++i)					\
+      r[i] = pred[i] ? OP (a[i]) : a[i];			\
+  }
+
+#define TEST_INT_TYPE(T, TYPE) \
+  T (TYPE, abs) \
+  T (TYPE, neg) \
+  T (TYPE, not)
+
+#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
+  T (TYPE, __builtin_fabs##SUFFIX) \
+  T (TYPE, neg)
+
+#define TEST_ALL(T) \
+  TEST_INT_TYPE (T, int8_t) \
+  TEST_INT_TYPE (T, int16_t) \
+  TEST_INT_TYPE (T, int32_t) \
+  TEST_INT_TYPE (T, int64_t) \
+  TEST_FLOAT_TYPE (T, _Float16, f16) \
+  TEST_FLOAT_TYPE (T, float, f) \
+  TEST_FLOAT_TYPE (T, double, )
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c
new file mode 100644
index 00000000000..90f1c1a71e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_1.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, OP)					\
+  {								\
+    TYPE r[N], a[N], pred[N];					\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	pred[i] = (i % 7 < 4);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE##_##OP (r, a, pred, N);				\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : a[i]))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c
new file mode 100644
index 00000000000..fb984ded12f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE, OP)					\
+  void __attribute__ ((noipa))					\
+  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,	\
+		      TYPE *__restrict b,			\
+		      TYPE *__restrict pred, int n)		\
+  {								\
+    for (int i = 0; i < n; ++i)					\
+      {								\
+	TYPE bi = b[i];						\
+	r[i] = pred[i] ? OP (a[i]) : bi;			\
+      }								\
+  }
+
+#define TEST_INT_TYPE(T, TYPE) \
+  T (TYPE, abs) \
+  T (TYPE, neg) \
+  T (TYPE, not)
+
+#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
+  T (TYPE, __builtin_fabs##SUFFIX) \
+  T (TYPE, neg)
+
+#define TEST_ALL(T) \
+  TEST_INT_TYPE (T, int8_t) \
+  TEST_INT_TYPE (T, int16_t) \
+  TEST_INT_TYPE (T, int32_t) \
+  TEST_INT_TYPE (T, int64_t) \
+  TEST_FLOAT_TYPE (T, _Float16, f16) \
+  TEST_FLOAT_TYPE (T, float, f) \
+  TEST_FLOAT_TYPE (T, double, )
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c
new file mode 100644
index 00000000000..b739c5d8df2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_2.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, OP)					\
+  {								\
+    TYPE r[N], a[N], b[N], pred[N];				\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	b[i] = (i % 9) * (i % 7 + 1);				\
+	pred[i] = (i % 7 < 4);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE##_##OP (r, a, b, pred, N);			\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : b[i]))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c
new file mode 100644
index 00000000000..a1aeaf076de
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE, OP)					\
+  void __attribute__ ((noipa))					\
+  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,	\
+		      TYPE *__restrict pred, int n)		\
+  {								\
+    for (int i = 0; i < n; ++i)					\
+      r[i] = pred[i] ? OP (a[i]) : 5;				\
+  }
+
+#define TEST_INT_TYPE(T, TYPE) \
+  T (TYPE, abs) \
+  T (TYPE, neg) \
+  T (TYPE, not)
+
+#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
+  T (TYPE, __builtin_fabs##SUFFIX) \
+  T (TYPE, neg)
+
+#define TEST_ALL(T) \
+  TEST_INT_TYPE (T, int8_t) \
+  TEST_INT_TYPE (T, int16_t) \
+  TEST_INT_TYPE (T, int32_t) \
+  TEST_INT_TYPE (T, int64_t) \
+  TEST_FLOAT_TYPE (T, _Float16, f16) \
+  TEST_FLOAT_TYPE (T, float, f) \
+  TEST_FLOAT_TYPE (T, double, )
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c
new file mode 100644
index 00000000000..9234c94d9bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_3.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, OP)					\
+  {								\
+    TYPE r[N], a[N], pred[N];					\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	pred[i] = (i % 7 < 4);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE##_##OP (r, a, pred, N);				\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : 5))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c
new file mode 100644
index 00000000000..ed041fe28cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE, OP)					\
+  void __attribute__ ((noipa))					\
+  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,	\
+		      TYPE *__restrict pred, int n)		\
+  {								\
+    for (int i = 0; i < n; ++i)					\
+      r[i] = pred[i] ? OP (a[i]) : 0;				\
+  }
+
+#define TEST_INT_TYPE(T, TYPE) \
+  T (TYPE, abs) \
+  T (TYPE, neg) \
+  T (TYPE, not)
+
+#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
+  T (TYPE, __builtin_fabs##SUFFIX) \
+  T (TYPE, neg)
+
+#define TEST_ALL(T) \
+  TEST_INT_TYPE (T, int8_t) \
+  TEST_INT_TYPE (T, int16_t) \
+  TEST_INT_TYPE (T, int32_t) \
+  TEST_INT_TYPE (T, int64_t) \
+  TEST_FLOAT_TYPE (T, _Float16, f16) \
+  TEST_FLOAT_TYPE (T, float, f) \
+  TEST_FLOAT_TYPE (T, double, )
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c
new file mode 100644
index 00000000000..52fd4d07be9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_4.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, OP)					\
+  {								\
+    TYPE r[N], a[N], pred[N];					\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	pred[i] = (i % 7 < 4);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE##_##OP (r, a, pred, N);				\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : 0))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c
new file mode 100644
index 00000000000..6c0a25aad6b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)			\
+  void __attribute__ ((noipa))					\
+  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,		\
+				 TYPE2 *__restrict a,		\
+				 TYPE1 *__restrict pred)	\
+  {								\
+    for (int i = 0; i < COUNT; ++i)				\
+      r[i] = pred[i] ? OP (a[i]) : a[i];			\
+  }
+
+#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
+  T (TYPE1, TYPE2, COUNT, abs) \
+  T (TYPE1, TYPE2, COUNT, neg) \
+  T (TYPE1, TYPE2, COUNT, not)
+
+#define TEST_ALL(T) \
+  TEST_TYPES (T, int16_t, int8_t, 7) \
+  TEST_TYPES (T, int32_t, int8_t, 3) \
+  TEST_TYPES (T, int32_t, int16_t, 3) \
+  TEST_TYPES (T, int64_t, int8_t, 5) \
+  TEST_TYPES (T, int64_t, int16_t, 5) \
+  TEST_TYPES (T, int64_t, int32_t, 5)
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c
new file mode 100644
index 00000000000..efd4c2936f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c
@@ -0,0 +1,26 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_5.c"
+
+#define TEST_LOOP(TYPE1, TYPE2, N, OP)				\
+  {								\
+    TYPE1 pred[N];						\
+    TYPE2 r[N], a[N];						\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	pred[i] = (i % 4 < 2);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE1##_##TYPE2##_##OP (r, a, pred);			\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : a[i]))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c
new file mode 100644
index 00000000000..b2050f21389
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)			\
+  void __attribute__ ((noipa))					\
+  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,		\
+				 TYPE2 *__restrict a,		\
+				 TYPE2 *__restrict b,		\
+				 TYPE1 *__restrict pred)	\
+  {								\
+    for (int i = 0; i < COUNT; ++i)				\
+      {								\
+	TYPE2 bi = b[i];					\
+	r[i] = pred[i] ? OP (a[i]) : bi;			\
+      }								\
+  }
+
+#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
+  T (TYPE1, TYPE2, COUNT, abs) \
+  T (TYPE1, TYPE2, COUNT, neg) \
+  T (TYPE1, TYPE2, COUNT, not)
+
+#define TEST_ALL(T) \
+  TEST_TYPES (T, int16_t, int8_t, 7) \
+  TEST_TYPES (T, int32_t, int8_t, 3) \
+  TEST_TYPES (T, int32_t, int16_t, 3) \
+  TEST_TYPES (T, int64_t, int8_t, 5) \
+  TEST_TYPES (T, int64_t, int16_t, 5) \
+  TEST_TYPES (T, int64_t, int32_t, 5)
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c
new file mode 100644
index 00000000000..e42be948748
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_6.c"
+
+#define TEST_LOOP(TYPE1, TYPE2, N, OP)				\
+  {								\
+    TYPE1 pred[N];						\
+    TYPE2 r[N], a[N], b[N];					\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	b[i] = (i % 5) * (i % 6 + 3);				\
+	pred[i] = (i % 4 < 2);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE1##_##TYPE2##_##OP (r, a, b, pred);		\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : b[i]))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c
new file mode 100644
index 00000000000..72577703cee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)			\
+  void __attribute__ ((noipa))					\
+  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,		\
+				 TYPE2 *__restrict a,		\
+				 TYPE1 *__restrict pred)	\
+  {								\
+    for (int i = 0; i < COUNT; ++i)				\
+      r[i] = pred[i] ? OP (a[i]) : 5;				\
+  }
+
+#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
+  T (TYPE1, TYPE2, COUNT, abs) \
+  T (TYPE1, TYPE2, COUNT, neg) \
+  T (TYPE1, TYPE2, COUNT, not)
+
+#define TEST_ALL(T) \
+  TEST_TYPES (T, int16_t, int8_t, 7) \
+  TEST_TYPES (T, int32_t, int8_t, 3) \
+  TEST_TYPES (T, int32_t, int16_t, 3) \
+  TEST_TYPES (T, int64_t, int8_t, 5) \
+  TEST_TYPES (T, int64_t, int16_t, 5) \
+  TEST_TYPES (T, int64_t, int32_t, 5)
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c
new file mode 100644
index 00000000000..50ff6727086
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c
@@ -0,0 +1,26 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_7.c"
+
+#define TEST_LOOP(TYPE1, TYPE2, N, OP)				\
+  {								\
+    TYPE1 pred[N];						\
+    TYPE2 r[N], a[N];						\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	pred[i] = (i % 4 < 2);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE1##_##TYPE2##_##OP (r, a, pred);			\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : 5))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c
new file mode 100644
index 00000000000..269cc3ded95
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint.h>
+
+#define abs(A) ((A) < 0 ? -(A) : (A))
+#define neg(A) (-(A))
+#define not(A) (~(A))
+
+#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)			\
+  void __attribute__ ((noipa))					\
+  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,		\
+				 TYPE2 *__restrict a,		\
+				 TYPE1 *__restrict pred)	\
+  {								\
+    for (int i = 0; i < COUNT; ++i)				\
+      r[i] = pred[i] ? OP (a[i]) : 0;				\
+  }
+
+#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
+  T (TYPE1, TYPE2, COUNT, abs) \
+  T (TYPE1, TYPE2, COUNT, neg) \
+  T (TYPE1, TYPE2, COUNT, not)
+
+#define TEST_ALL(T) \
+  TEST_TYPES (T, int16_t, int8_t, 7) \
+  TEST_TYPES (T, int32_t, int8_t, 3) \
+  TEST_TYPES (T, int32_t, int16_t, 3) \
+  TEST_TYPES (T, int64_t, int8_t, 5) \
+  TEST_TYPES (T, int64_t, int16_t, 5) \
+  TEST_TYPES (T, int64_t, int32_t, 5)
+
+TEST_ALL (DEF_LOOP)
+
+/* NOTE: int abs operator is converted to vmslt + vneg.v */
+/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
+/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c
new file mode 100644
index 00000000000..dcc72313f99
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "cond_unary_8.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE1, TYPE2, N, OP)				\
+  {								\
+    TYPE1 pred[N];						\
+    TYPE2 r[N], a[N];						\
+    for (int i = 0; i < N; ++i)					\
+      {								\
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);	\
+	pred[i] = (i % 4 < 2);					\
+	asm volatile ("" ::: "memory");				\
+      }								\
+    test_##TYPE1##_##TYPE2##_##OP (r, a, pred);			\
+    for (int i = 0; i < N; ++i)					\
+      if (r[i] != (pred[i] ? OP (a[i]) : 0))			\
+	__builtin_abort ();					\
+  }
+
+int main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
-- 
2.36.3
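
As a reading aid for the riscv-v.cc hunk, the tail/mask policy that expand_cond_len_unop selects can be summarized as a small decision function. This is a standalone sketch in plain C that only mirrors the comments in the patch ("TU, MASK ANY", "TAIL ANY, MU", "TU, MU"). The enum and function names below are hypothetical and are not GCC identifiers.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not identifiers from GCC.  */
enum rvv_policy_sketch
{
  POLICY_TU_MA, /* tail undisturbed, mask any */
  POLICY_TA_MU, /* tail any, mask undisturbed */
  POLICY_TU_MU  /* tail undisturbed, mask undisturbed */
};

/* Mirror the branch structure of expand_cond_len_unop: an all-ones
   (dummy) mask only needs the tail preserved; a real mask with a
   VLMAX length only needs masked-off elements preserved; otherwise
   both must be preserved.  */
static enum rvv_policy_sketch
select_unop_policy (bool is_dummy_mask, bool is_vlmax_len)
{
  if (is_dummy_mask)
    return POLICY_TU_MA;
  return is_vlmax_len ? POLICY_TA_MU : POLICY_TU_MU;
}
```

The commit-message example (a real mask with a VLMAX length) falls into the POLICY_TA_MU case, which matches the `ta,mu` in its "After this patch" vsetvli.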




* Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
  2023-08-22  5:41 [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns Lehua Ding
@ 2023-08-22  5:59 ` Andrew Pinski
       [not found] ` <3C6D3EAB062992F9+F168382A-9849-46CE-8EEC-5B5419AFBDEF@rivai.ai>
  2023-08-22 21:26 ` Robin Dapp
  2 siblings, 0 replies; 9+ messages in thread
From: Andrew Pinski @ 2023-08-22  5:59 UTC
  To: Lehua Ding
  Cc: gcc-patches, juzhe.zhong, kito.cheng, rdapp.gcc, palmer, jeffreyalaw

On Mon, Aug 21, 2023 at 10:42 PM Lehua Ding <lehua.ding@rivai.ai> wrote:
>
> Hi,
>
> This patch adds conditional unary neg/abs/not autovec patterns to the RISC-V backend.
> Consider this C code:
>
> void
> test_3 (float *__restrict a, float *__restrict b, int *__restrict pred, int n)
> {
>   for (int i = 0; i < n; i += 1)
>     {
>       a[i] = pred[i] ? __builtin_fabsf (b[i]) : a[i];
>     }
> }
>
> Before this patch:
>         ...
>         vsetvli a7,zero,e32,m1,ta,ma
>         vfabs.v v2,v2
>         vmerge.vvm      v1,v1,v2,v0
>         ...
>
> After this patch:
>         ...
>         vsetvli a7,zero,e32,m1,ta,mu
>         vfabs.v v1,v2,v0.t
>         ...
>
> For int neg/abs/not and FP neg patterns, defining the corresponding cond_xxx
> patterns is enough.

Maybe we should add optabs and IFN support for conditional ABS too.
I added it for conditional not with r14-3257-ga32de58c9e63 to fix up a
regression I had introduced with SVE code.

Thanks,
Andrew
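
For reference, the element-wise semantics that a conditional ABS has to implement can be written as a scalar loop. This is only a sketch in plain C, modeled on the `r[i] = pred[i] ? OP (a[i]) : bi` loops in the patch's tests. The function name and the `fallback` parameter are hypothetical and do not correspond to any existing GCC optab or internal function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference model, not a GCC API: where the
   predicate is nonzero, store |a[i]|; otherwise store the fallback
   ("else") value.  As with the abs macro in the tests, the result is
   undefined for INT32_MIN because the negation overflows.  */
static void
ref_cond_abs_i32 (int32_t *r, const int32_t *a, const int32_t *fallback,
                  const int32_t *pred, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    r[i] = pred[i] ? (a[i] < 0 ? -a[i] : a[i]) : fallback[i];
}
```

Passing `a` itself as `fallback` gives the form used in cond_unary_1.c, where inactive elements keep their input value.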

> For the FP abs pattern, we need to change the definitions of the `abs<mode>2` and
> `@vcond_mask_<mode><vm>` patterns from define_expand to define_insn_and_split
> so that the combine pass can fuse them into a new pattern, `*cond_abs<mode>`.
> After the neg pattern is changed, a vlmax copysign + neg fusion pattern needs
> to be added.
>
> The fusion process looks like this:
>
> (insn 30 29 31 4 (set (reg:RVVM1SF 152 [ vect_iftmp.15 ])
>         (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))) "float.c":15:56 discrim 1 12799 {absrvvm1sf2}
>      (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
>         (nil)))
>
> (insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
>         (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
>             (reg:RVVM1SF 152 [ vect_iftmp.15 ])
>             (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 12707 {vcond_mask_rvvm1sfrvvmf32bi}
>      (expr_list:REG_DEAD (reg:RVVM1SF 152 [ vect_iftmp.15 ])
>         (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
>             (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
>                 (nil)))))
> ==>
>
> (insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
>         (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
>             (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))
>             (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 13444 {*cond_absrvvm1sf}
>      (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
>         (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
>             (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
>                 (nil)))))
>
> Best,
> Lehua
>
> gcc/ChangeLog:
>
>         * config/riscv/autovec-opt.md (*cond_abs<mode>): New combine pattern.
>         (*copysign<mode>_neg): Ditto.
>         * config/riscv/autovec.md (@vcond_mask_<mode><vm>): Adjust.
>         (<optab><mode>2): Ditto.
>         (cond_<optab><mode>): New.
>         (cond_len_<optab><mode>): Ditto.
>         * config/riscv/riscv-protos.h (enum insn_type): New.
>         (expand_cond_len_unop): New helper func.
>         * config/riscv/riscv-v.cc (shuffle_merge_patterns): Adjust.
>         (expand_cond_len_unop): New helper func.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c: New test.
>         * gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c: New test.
>
> ---
>  gcc/config/riscv/autovec-opt.md               | 39 ++++++++
>  gcc/config/riscv/autovec.md                   | 97 +++++++++++++++++--
>  gcc/config/riscv/riscv-protos.h               |  7 +-
>  gcc/config/riscv/riscv-v.cc                   | 56 ++++++++++-
>  .../riscv/rvv/autovec/cond/cond_unary_1.c     | 43 ++++++++
>  .../riscv/rvv/autovec/cond/cond_unary_1_run.c | 27 ++++++
>  .../riscv/rvv/autovec/cond/cond_unary_2.c     | 47 +++++++++
>  .../riscv/rvv/autovec/cond/cond_unary_2_run.c | 28 ++++++
>  .../riscv/rvv/autovec/cond/cond_unary_3.c     | 43 ++++++++
>  .../riscv/rvv/autovec/cond/cond_unary_3_run.c | 27 ++++++
>  .../riscv/rvv/autovec/cond/cond_unary_4.c     | 43 ++++++++
>  .../riscv/rvv/autovec/cond/cond_unary_4_run.c | 27 ++++++
>  .../riscv/rvv/autovec/cond/cond_unary_5.c     | 37 +++++++
>  .../riscv/rvv/autovec/cond/cond_unary_5_run.c | 26 +++++
>  .../riscv/rvv/autovec/cond/cond_unary_6.c     | 41 ++++++++
>  .../riscv/rvv/autovec/cond/cond_unary_6_run.c | 27 ++++++
>  .../riscv/rvv/autovec/cond/cond_unary_7.c     | 37 +++++++
>  .../riscv/rvv/autovec/cond/cond_unary_7_run.c | 26 +++++
>  .../riscv/rvv/autovec/cond/cond_unary_8.c     | 37 +++++++
>  .../riscv/rvv/autovec/cond/cond_unary_8_run.c | 28 ++++++
>  20 files changed, 730 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c
>
> diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
> index 99b609a99d9..8247eb87ddb 100644
> --- a/gcc/config/riscv/autovec-opt.md
> +++ b/gcc/config/riscv/autovec-opt.md
> @@ -684,3 +684,42 @@
>    }
>    [(set_attr "type" "vfwmuladd")
>     (set_attr "mode" "<V_DOUBLE_TRUNC>")])
> +
> +;; Combine <op> and vcond_mask generated by midend into cond_len_<op>
> +;; Currently supported operations:
> +;;   abs(FP)
> +(define_insn_and_split "*cond_abs<mode>"
> +  [(set (match_operand:VF 0 "register_operand")
> +        (if_then_else:VF
> +          (match_operand:<VM> 3 "register_operand")
> +          (abs:VF (match_operand:VF 1 "nonmemory_operand"))
> +          (match_operand:VF 2 "register_operand")))]
> +  "TARGET_VECTOR && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
> +{
> +  emit_insn (gen_cond_len_abs<mode> (operands[0], operands[3], operands[1],
> +                                    operands[2],
> +                                    gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode),
> +                                    const0_rtx));
> +  DONE;
> +})
> +
> +;; Combine vlmax neg and UNSPEC_VCOPYSIGN
> +(define_insn_and_split "*copysign<mode>_neg"
> +  [(set (match_operand:VF 0 "register_operand")
> +        (neg:VF
> +          (unspec:VF [
> +            (match_operand:VF 1 "register_operand")
> +            (match_operand:VF 2 "register_operand")
> +          ] UNSPEC_VCOPYSIGN)))]
> +  "TARGET_VECTOR && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
> +{
> +  riscv_vector::emit_vlmax_insn (code_for_pred_ncopysign (<MODE>mode),
> +                                 riscv_vector::RVV_BINOP, operands);
> +  DONE;
> +})
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index acca4c22b90..e1addc07036 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -552,12 +552,16 @@
>  ;; - vfmerge.vf
>  ;; -------------------------------------------------------------------------
>
> -(define_expand "@vcond_mask_<mode><vm>"
> -  [(match_operand:V 0 "register_operand")
> -   (match_operand:<VM> 3 "register_operand")
> -   (match_operand:V 1 "nonmemory_operand")
> -   (match_operand:V 2 "register_operand")]
> -  "TARGET_VECTOR"
> +(define_insn_and_split "@vcond_mask_<mode><vm>"
> +  [(set (match_operand:V 0 "register_operand")
> +        (if_then_else:V
> +          (match_operand:<VM> 3 "register_operand")
> +          (match_operand:V 1 "nonmemory_operand")
> +          (match_operand:V 2 "register_operand")))]
> +  "TARGET_VECTOR && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
>    {
>      /* The order of vcond_mask is opposite to pred_merge.  */
>      std::swap (operands[1], operands[2]);
> @@ -979,11 +983,14 @@
>  ;; Includes:
>  ;; - vfneg.v/vfabs.v
>  ;; -------------------------------------------------------------------------------
> -(define_expand "<optab><mode>2"
> +(define_insn_and_split "<optab><mode>2"
>    [(set (match_operand:VF 0 "register_operand")
>      (any_float_unop_nofrm:VF
>       (match_operand:VF 1 "register_operand")))]
> -  "TARGET_VECTOR"
> +  "TARGET_VECTOR && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
>  {
>    insn_code icode = code_for_pred (<CODE>, <MODE>mode);
>    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
> @@ -1499,6 +1506,80 @@
>    DONE;
>  })
>
> +;; -------------------------------------------------------------------------
> +;; ---- [INT] Conditional unary operations
> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - vneg/vnot
> +;; -------------------------------------------------------------------------
> +
> +(define_expand "cond_<optab><mode>"
> +  [(match_operand:VI 0 "register_operand")
> +   (match_operand:<VM> 1 "vector_mask_operand")
> +   (any_int_unop:VI
> +     (match_operand:VI 2 "register_operand"))
> +   (match_operand:VI 3 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  /* Normalize into cond_len_* operations.  */
> +  emit_insn (gen_cond_len_<optab><mode> (operands[0], operands[1], operands[2],
> +                                        operands[3],
> +                                        gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode),
> +                                        const0_rtx));
> +  DONE;
> +})
> +
> +(define_expand "cond_len_<optab><mode>"
> +  [(match_operand:VI 0 "register_operand")
> +   (match_operand:<VM> 1 "vector_mask_operand")
> +   (any_int_unop:VI
> +     (match_operand:VI 2 "register_operand"))
> +   (match_operand:VI 3 "register_operand")
> +   (match_operand 4 "autovec_length_operand")
> +   (match_operand 5 "const_0_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_cond_len_unop (<CODE>, operands);
> +  DONE;
> +})
> +
> +;; -------------------------------------------------------------------------
> +;; ---- [FP] Conditional unary operations
> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - vfneg/vfabs
> +;; -------------------------------------------------------------------------
> +
> +(define_expand "cond_<optab><mode>"
> +  [(match_operand:VF 0 "register_operand")
> +   (match_operand:<VM> 1 "vector_mask_operand")
> +   (any_float_unop_nofrm:VF
> +     (match_operand:VF 2 "register_operand"))
> +   (match_operand:VF 3 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  /* Normalize into cond_len_* operations.  */
> +  emit_insn (gen_cond_len_<optab><mode> (operands[0], operands[1], operands[2],
> +                                        operands[3],
> +                                        gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode),
> +                                        const0_rtx));
> +  DONE;
> +})
> +
> +(define_expand "cond_len_<optab><mode>"
> +  [(match_operand:VF 0 "register_operand")
> +   (match_operand:<VM> 1 "vector_mask_operand")
> +   (any_float_unop_nofrm:VF
> +     (match_operand:VF 2 "register_operand"))
> +   (match_operand:VF 3 "register_operand")
> +   (match_operand 4 "autovec_length_operand")
> +   (match_operand 5 "const_0_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_cond_len_unop (<CODE>, operands);
> +  DONE;
> +})
> +
>  ;; -------------------------------------------------------------------------
>  ;; ---- [INT] Conditional binary operations
>  ;; -------------------------------------------------------------------------
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 472c00dc439..2c4405c9860 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -184,6 +184,10 @@ enum insn_type
>  {
>    RVV_MISC_OP = 1,
>    RVV_UNOP = 2,
> +  RVV_UNOP_M = RVV_UNOP + 2,
> +  RVV_UNOP_MU = RVV_UNOP + 2,
> +  RVV_UNOP_TU = RVV_UNOP + 2,
> +  RVV_UNOP_TUMU = RVV_UNOP + 2,
>    RVV_BINOP = 3,
>    RVV_BINOP_MU = RVV_BINOP + 2,
>    RVV_BINOP_TU = RVV_BINOP + 2,
> @@ -191,8 +195,6 @@ enum insn_type
>    RVV_MERGE_OP = 4,
>    RVV_CMP_OP = 4,
>    RVV_CMP_MU_OP = RVV_CMP_OP + 2, /* +2 means mask and maskoff operand.  */
> -  RVV_UNOP_MU = RVV_UNOP + 2,    /* Likewise.  */
> -  RVV_UNOP_M = RVV_UNOP + 2,     /* Likewise.  */
>    RVV_TERNOP = 5,
>    RVV_TERNOP_MU = RVV_TERNOP + 1,
>    RVV_TERNOP_TU = RVV_TERNOP + 1,
> @@ -294,6 +296,7 @@ bool neg_simm5_p (rtx);
>  bool has_vi_variant_p (rtx_code, rtx);
>  void expand_vec_cmp (rtx, rtx_code, rtx, rtx);
>  bool expand_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool);
> +void expand_cond_len_unop (rtx_code, rtx *);
>  void expand_cond_len_binop (rtx_code, rtx *);
>  void expand_reduction (rtx_code, rtx *, rtx,
>                        reduction_type = reduction_type::UNORDERED);
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index b01028c6201..14eda581d00 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -2992,7 +2992,9 @@ shuffle_merge_patterns (struct expand_vec_perm_d *d)
>    emit_vlmax_cmp_insn (icode, ops);
>
>    /* TARGET = MASK ? OP0 : OP1.  */
> -  emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op0, d->op1, mask));
> +  /* Swap op0 and op1 since the order is opposite to pred_merge.  */
> +  rtx ops2[] = {d->target, d->op1, d->op0, mask};
> +  emit_vlmax_merge_insn (code_for_pred_merge (vmode), riscv_vector::RVV_MERGE_OP, ops2);
>    return true;
>  }
>
> @@ -3384,10 +3386,58 @@ needs_fp_rounding (rtx_code code, machine_mode mode)
>  {
>    if (!FLOAT_MODE_P (mode))
>      return false;
> -  return code != SMIN && code != SMAX;
> +  return code != SMIN && code != SMAX && code != NEG && code != ABS;
>  }
>
> -/* Expand COND_LEN_*.  */
> +/* Expand unary ops COND_LEN_*.  */
> +void
> +expand_cond_len_unop (rtx_code code, rtx *ops)
> +{
> +  rtx dest = ops[0];
> +  rtx mask = ops[1];
> +  rtx src = ops[2];
> +  rtx merge = ops[3];
> +  rtx len = ops[4];
> +  machine_mode mode = GET_MODE (dest);
> +  machine_mode mask_mode = GET_MODE (mask);
> +
> +  poly_int64 value;
> +  bool is_dummy_mask = rtx_equal_p (mask, CONSTM1_RTX (mask_mode));
> +  bool is_vlmax_len
> +    = poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode));
> +  rtx cond_ops[] = {dest, mask, merge, src};
> +  insn_code icode = code_for_pred (code, mode);
> +
> +  if (is_dummy_mask)
> +    {
> +      /* Use TU, MASK ANY policy.  */
> +      if (needs_fp_rounding (code, mode))
> +       emit_nonvlmax_fp_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
> +      else
> +       emit_nonvlmax_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
> +    }
> +  else
> +    {
> +      if (is_vlmax_len)
> +       {
> +         /* Use TAIL ANY, MU policy.  */
> +         if (needs_fp_rounding (code, mode))
> +           emit_vlmax_masked_fp_mu_insn (icode, RVV_UNOP_MU, cond_ops);
> +         else
> +           emit_vlmax_masked_mu_insn (icode, RVV_UNOP_MU, cond_ops);
> +       }
> +      else
> +       {
> +         /* Use TU, MU policy.  */
> +         if (needs_fp_rounding (code, mode))
> +           emit_nonvlmax_fp_tumu_insn (icode, RVV_UNOP_TUMU, cond_ops, len);
> +         else
> +           emit_nonvlmax_tumu_insn (icode, RVV_UNOP_TUMU, cond_ops, len);
> +       }
> +    }
> +}
> +
> +/* Expand binary ops COND_LEN_*.  */
>  void
>  expand_cond_len_binop (rtx_code code, rtx *ops)
>  {
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c
> new file mode 100644
> index 00000000000..e8bf41a55a1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1.c
> @@ -0,0 +1,43 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE, OP)                                     \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,  \
> +                     TYPE *__restrict pred, int n)             \
> +  {                                                            \
> +    for (int i = 0; i < n; ++i)                                        \
> +      r[i] = pred[i] ? OP (a[i]) : a[i];                       \
> +  }
> +
> +#define TEST_INT_TYPE(T, TYPE) \
> +  T (TYPE, abs) \
> +  T (TYPE, neg) \
> +  T (TYPE, not)
> +
> +#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
> +  T (TYPE, __builtin_fabs##SUFFIX) \
> +  T (TYPE, neg)
> +
> +#define TEST_ALL(T) \
> +  TEST_INT_TYPE (T, int8_t) \
> +  TEST_INT_TYPE (T, int16_t) \
> +  TEST_INT_TYPE (T, int32_t) \
> +  TEST_INT_TYPE (T, int64_t) \
> +  TEST_FLOAT_TYPE (T, _Float16, f16) \
> +  TEST_FLOAT_TYPE (T, float, f) \
> +  TEST_FLOAT_TYPE (T, double, )
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> +/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c
> new file mode 100644
> index 00000000000..90f1c1a71e7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_1_run.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_1.c"
> +
> +#define N 99
> +
> +#define TEST_LOOP(TYPE, OP)                                    \
> +  {                                                            \
> +    TYPE r[N], a[N], pred[N];                                  \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       pred[i] = (i % 7 < 4);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE##_##OP (r, a, pred, N);                                \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : a[i]))                        \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c
> new file mode 100644
> index 00000000000..fb984ded12f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2.c
> @@ -0,0 +1,47 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE, OP)                                     \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,  \
> +                     TYPE *__restrict b,                       \
> +                     TYPE *__restrict pred, int n)             \
> +  {                                                            \
> +    for (int i = 0; i < n; ++i)                                        \
> +      {                                                                \
> +       TYPE bi = b[i];                                         \
> +       r[i] = pred[i] ? OP (a[i]) : bi;                        \
> +      }                                                                \
> +  }
> +
> +#define TEST_INT_TYPE(T, TYPE) \
> +  T (TYPE, abs) \
> +  T (TYPE, neg) \
> +  T (TYPE, not)
> +
> +#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
> +  T (TYPE, __builtin_fabs##SUFFIX) \
> +  T (TYPE, neg)
> +
> +#define TEST_ALL(T) \
> +  TEST_INT_TYPE (T, int8_t) \
> +  TEST_INT_TYPE (T, int16_t) \
> +  TEST_INT_TYPE (T, int32_t) \
> +  TEST_INT_TYPE (T, int64_t) \
> +  TEST_FLOAT_TYPE (T, _Float16, f16) \
> +  TEST_FLOAT_TYPE (T, float, f) \
> +  TEST_FLOAT_TYPE (T, double, )
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> +/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c
> new file mode 100644
> index 00000000000..b739c5d8df2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_2_run.c
> @@ -0,0 +1,28 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_2.c"
> +
> +#define N 99
> +
> +#define TEST_LOOP(TYPE, OP)                                    \
> +  {                                                            \
> +    TYPE r[N], a[N], b[N], pred[N];                            \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       b[i] = (i % 9) * (i % 7 + 1);                           \
> +       pred[i] = (i % 7 < 4);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE##_##OP (r, a, b, pred, N);                     \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : b[i]))                        \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c
> new file mode 100644
> index 00000000000..a1aeaf076de
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3.c
> @@ -0,0 +1,43 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE, OP)                                     \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,  \
> +                     TYPE *__restrict pred, int n)             \
> +  {                                                            \
> +    for (int i = 0; i < n; ++i)                                        \
> +      r[i] = pred[i] ? OP (a[i]) : 5;                          \
> +  }
> +
> +#define TEST_INT_TYPE(T, TYPE) \
> +  T (TYPE, abs) \
> +  T (TYPE, neg) \
> +  T (TYPE, not)
> +
> +#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
> +  T (TYPE, __builtin_fabs##SUFFIX) \
> +  T (TYPE, neg)
> +
> +#define TEST_ALL(T) \
> +  TEST_INT_TYPE (T, int8_t) \
> +  TEST_INT_TYPE (T, int16_t) \
> +  TEST_INT_TYPE (T, int32_t) \
> +  TEST_INT_TYPE (T, int64_t) \
> +  TEST_FLOAT_TYPE (T, _Float16, f16) \
> +  TEST_FLOAT_TYPE (T, float, f) \
> +  TEST_FLOAT_TYPE (T, double, )
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> +/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c
> new file mode 100644
> index 00000000000..9234c94d9bf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_3_run.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_3.c"
> +
> +#define N 99
> +
> +#define TEST_LOOP(TYPE, OP)                                    \
> +  {                                                            \
> +    TYPE r[N], a[N], pred[N];                                  \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       pred[i] = (i % 7 < 4);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE##_##OP (r, a, pred, N);                                \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : 5))                   \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c
> new file mode 100644
> index 00000000000..ed041fe28cd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4.c
> @@ -0,0 +1,43 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE, OP)                                     \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,  \
> +                     TYPE *__restrict pred, int n)             \
> +  {                                                            \
> +    for (int i = 0; i < n; ++i)                                        \
> +      r[i] = pred[i] ? OP (a[i]) : 0;                          \
> +  }
> +
> +#define TEST_INT_TYPE(T, TYPE) \
> +  T (TYPE, abs) \
> +  T (TYPE, neg) \
> +  T (TYPE, not)
> +
> +#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \
> +  T (TYPE, __builtin_fabs##SUFFIX) \
> +  T (TYPE, neg)
> +
> +#define TEST_ALL(T) \
> +  TEST_INT_TYPE (T, int8_t) \
> +  TEST_INT_TYPE (T, int16_t) \
> +  TEST_INT_TYPE (T, int32_t) \
> +  TEST_INT_TYPE (T, int64_t) \
> +  TEST_FLOAT_TYPE (T, _Float16, f16) \
> +  TEST_FLOAT_TYPE (T, float, f) \
> +  TEST_FLOAT_TYPE (T, double, )
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 8 } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvfabs\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> +/* { dg-final { scan-assembler-times {\tvfneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c
> new file mode 100644
> index 00000000000..52fd4d07be9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_4_run.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_4.c"
> +
> +#define N 99
> +
> +#define TEST_LOOP(TYPE, OP)                                    \
> +  {                                                            \
> +    TYPE r[N], a[N], pred[N];                                  \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       pred[i] = (i % 7 < 4);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE##_##OP (r, a, pred, N);                                \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : 0))                   \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c
> new file mode 100644
> index 00000000000..6c0a25aad6b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)                      \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,          \
> +                                TYPE2 *__restrict a,           \
> +                                TYPE1 *__restrict pred)        \
> +  {                                                            \
> +    for (int i = 0; i < COUNT; ++i)                            \
> +      r[i] = pred[i] ? OP (a[i]) : a[i];                       \
> +  }
> +
> +#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
> +  T (TYPE1, TYPE2, COUNT, abs) \
> +  T (TYPE1, TYPE2, COUNT, neg) \
> +  T (TYPE1, TYPE2, COUNT, not)
> +
> +#define TEST_ALL(T) \
> +  TEST_TYPES (T, int16_t, int8_t, 7) \
> +  TEST_TYPES (T, int32_t, int8_t, 3) \
> +  TEST_TYPES (T, int32_t, int16_t, 3) \
> +  TEST_TYPES (T, int64_t, int8_t, 5) \
> +  TEST_TYPES (T, int64_t, int16_t, 5) \
> +  TEST_TYPES (T, int64_t, int32_t, 5)
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c
> new file mode 100644
> index 00000000000..efd4c2936f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_5_run.c
> @@ -0,0 +1,26 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_5.c"
> +
> +#define TEST_LOOP(TYPE1, TYPE2, N, OP)                         \
> +  {                                                            \
> +    TYPE1 pred[N];                                             \
> +    TYPE2 r[N], a[N];                                          \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       pred[i] = (i % 4 < 2);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE1##_##TYPE2##_##OP (r, a, pred);                        \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : a[i]))                        \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c
> new file mode 100644
> index 00000000000..b2050f21389
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6.c
> @@ -0,0 +1,41 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)                      \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,          \
> +                                TYPE2 *__restrict a,           \
> +                                TYPE2 *__restrict b,           \
> +                                TYPE1 *__restrict pred)        \
> +  {                                                            \
> +    for (int i = 0; i < COUNT; ++i)                            \
> +      {                                                                \
> +       TYPE2 bi = b[i];                                        \
> +       r[i] = pred[i] ? OP (a[i]) : bi;                        \
> +      }                                                                \
> +  }
> +
> +#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
> +  T (TYPE1, TYPE2, COUNT, abs) \
> +  T (TYPE1, TYPE2, COUNT, neg) \
> +  T (TYPE1, TYPE2, COUNT, not)
> +
> +#define TEST_ALL(T) \
> +  TEST_TYPES (T, int16_t, int8_t, 7) \
> +  TEST_TYPES (T, int32_t, int8_t, 3) \
> +  TEST_TYPES (T, int32_t, int16_t, 3) \
> +  TEST_TYPES (T, int64_t, int8_t, 5) \
> +  TEST_TYPES (T, int64_t, int16_t, 5) \
> +  TEST_TYPES (T, int64_t, int32_t, 5)
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c
> new file mode 100644
> index 00000000000..e42be948748
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_6_run.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_6.c"
> +
> +#define TEST_LOOP(TYPE1, TYPE2, N, OP)                         \
> +  {                                                            \
> +    TYPE1 pred[N];                                             \
> +    TYPE2 r[N], a[N], b[N];                                    \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       b[i] = (i % 5) * (i % 6 + 3);                           \
> +       pred[i] = (i % 4 < 2);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE1##_##TYPE2##_##OP (r, a, b, pred);             \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : b[i]))                        \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c
> new file mode 100644
> index 00000000000..72577703cee
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)                      \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,          \
> +                                TYPE2 *__restrict a,           \
> +                                TYPE1 *__restrict pred)        \
> +  {                                                            \
> +    for (int i = 0; i < COUNT; ++i)                            \
> +      r[i] = pred[i] ? OP (a[i]) : 5;                          \
> +  }
> +
> +#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
> +  T (TYPE1, TYPE2, COUNT, abs) \
> +  T (TYPE1, TYPE2, COUNT, neg) \
> +  T (TYPE1, TYPE2, COUNT, not)
> +
> +#define TEST_ALL(T) \
> +  TEST_TYPES (T, int16_t, int8_t, 7) \
> +  TEST_TYPES (T, int32_t, int8_t, 3) \
> +  TEST_TYPES (T, int32_t, int16_t, 3) \
> +  TEST_TYPES (T, int64_t, int8_t, 5) \
> +  TEST_TYPES (T, int64_t, int16_t, 5) \
> +  TEST_TYPES (T, int64_t, int32_t, 5)
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c
> new file mode 100644
> index 00000000000..50ff6727086
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_7_run.c
> @@ -0,0 +1,26 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_7.c"
> +
> +#define TEST_LOOP(TYPE1, TYPE2, N, OP)                         \
> +  {                                                            \
> +    TYPE1 pred[N];                                             \
> +    TYPE2 r[N], a[N];                                          \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       pred[i] = (i % 4 < 2);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE1##_##TYPE2##_##OP (r, a, pred);                        \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : 5))                   \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c
> new file mode 100644
> index 00000000000..269cc3ded95
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define abs(A) ((A) < 0 ? -(A) : (A))
> +#define neg(A) (-(A))
> +#define not(A) (~(A))
> +
> +#define DEF_LOOP(TYPE1, TYPE2, COUNT, OP)                      \
> +  void __attribute__ ((noipa))                                 \
> +  test_##TYPE1##_##TYPE2##_##OP (TYPE2 *__restrict r,          \
> +                                TYPE2 *__restrict a,           \
> +                                TYPE1 *__restrict pred)        \
> +  {                                                            \
> +    for (int i = 0; i < COUNT; ++i)                            \
> +      r[i] = pred[i] ? OP (a[i]) : 0;                          \
> +  }
> +
> +#define TEST_TYPES(T, TYPE1, TYPE2, COUNT) \
> +  T (TYPE1, TYPE2, COUNT, abs) \
> +  T (TYPE1, TYPE2, COUNT, neg) \
> +  T (TYPE1, TYPE2, COUNT, not)
> +
> +#define TEST_ALL(T) \
> +  TEST_TYPES (T, int16_t, int8_t, 7) \
> +  TEST_TYPES (T, int32_t, int8_t, 3) \
> +  TEST_TYPES (T, int32_t, int16_t, 3) \
> +  TEST_TYPES (T, int64_t, int8_t, 5) \
> +  TEST_TYPES (T, int64_t, int16_t, 5) \
> +  TEST_TYPES (T, int64_t, int32_t, 5)
> +
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> +/* { dg-final { scan-assembler-times {\tvnot\.v\tv[0-9]+,v[0-9]+,v0\.t} 6 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c
> new file mode 100644
> index 00000000000..dcc72313f99
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_unary_8_run.c
> @@ -0,0 +1,28 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "cond_unary_8.c"
> +
> +#define N 99
> +
> +#define TEST_LOOP(TYPE1, TYPE2, N, OP)                         \
> +  {                                                            \
> +    TYPE1 pred[N];                                             \
> +    TYPE2 r[N], a[N];                                          \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : -1);     \
> +       pred[i] = (i % 4 < 2);                                  \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +    test_##TYPE1##_##TYPE2##_##OP (r, a, pred);                        \
> +    for (int i = 0; i < N; ++i)                                        \
> +      if (r[i] != (pred[i] ? OP (a[i]) : 0))                   \
> +       __builtin_abort ();                                     \
> +  }
> +
> +int main ()
> +{
> +  TEST_ALL (TEST_LOOP)
> +  return 0;
> +}
> --
> 2.36.3
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
       [not found] ` <3C6D3EAB062992F9+F168382A-9849-46CE-8EEC-5B5419AFBDEF@rivai.ai>
@ 2023-08-22  7:33   ` Robin Dapp
  2023-08-22  8:08     ` juzhe.zhong
  0 siblings, 1 reply; 9+ messages in thread
From: Robin Dapp @ 2023-08-22  7:33 UTC (permalink / raw)
  To: juzhe.zhong, Andrew Pinski
  Cc: rdapp.gcc, Lehua Ding, gcc-patches, kito.cheng, palmer, jeffreyalaw

> What about conditional zero_extension, sign_extension,
> float_extension, ...etc?
> 
> We have discussed this; there are so many conditional situations
> that could be supported by either match.pd or the RTL backend combine
> pass.
> 
> IMHO, it would be too many optabs/internal fns if we supported all of
> them in match.pd. Feel free to correct me if I am wrong.
I think the general trend is (and should be) to push things forward
in the pipeline and not just have combine fix it.  However, for now
this would complicate things and therefore I agree with the approach
the patch takes.  I'd rather have the patterns in now rather than change
the middle end for unclear benefit.  

IMHO long-term we want things to be optimized early but short-term
combine is good enough.  We can then move optimizations forward on a
case-by-case basis.

Regards
 Robin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
  2023-08-22  7:33   ` Robin Dapp
@ 2023-08-22  8:08     ` juzhe.zhong
  2023-08-22 14:05       ` Jeff Law
  0 siblings, 1 reply; 9+ messages in thread
From: juzhe.zhong @ 2023-08-22  8:08 UTC (permalink / raw)
  To: Robin Dapp, pinskia
  Cc: Robin Dapp, 丁乐华,
	gcc-patches, kito.cheng, palmer, jeffreyalaw, richard.sandiford,
	Richard Biener

Yes, I agree that long-term we want everything to be optimized as early as possible.

However, IMHO, it's impossible to support every conditional pattern in the middle-end (match.pd).
There are simply too many of them.

For example, for the sign_extend conversion alone we have vsext.vf2 (vector SI -> vector DI), vsext.vf4 (vector HI -> vector DI), vsext.vf8 (vector QI -> vector DI), ...
And it's not only the conversions: every auto-vectorization pattern can have a conditional form,
for example abs, rotate, sqrt, floor, ceil, etc.
I bet it could be over 100 conditional optabs/internal FNs. That's a huge number.
I don't see the necessity of supporting them in the middle-end (match.pd), since we know the RTL back-end combine pass can do a good job here.

Besides, LLVM doesn't have that many conditional patterns. LLVM just keeps "add" and "select" as separate IR and then does the combine in the back-end:
https://godbolt.org/z/rYcMMG1eT

You can see that LLVM doesn't do the op + select optimization in generic IR; it does the optimization in the combine pass.
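
To make the "op + select" shape concrete, here is a minimal scalar sketch in C of what the back-end is asked to fuse. It is only a model under my own assumptions: the function names are hypothetical and integer negation stands in for any unary op. The first function is the unfused form (unconditional op, then a separate select); the second is the fused, masked form in which inactive elements keep their old value.

```c
#include <assert.h>
#include <stddef.h>

/* Unfused form: apply the op to every element, then select per element.
   This models an unmasked vector op followed by a separate merge.  */
void
neg_then_select (int *a, const int *b, const int *pred, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int tmp = -b[i];              /* unconditional op */
      a[i] = pred[i] ? tmp : a[i];  /* separate select */
    }
}

/* Fused form: only active elements are written; inactive elements are
   left undisturbed.  This models a single masked vector op.  */
void
masked_neg (int *a, const int *b, const int *pred, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (pred[i])
      a[i] = -b[i];
}

/* Returns 1 when both forms produce identical results on a small input.  */
int
forms_agree (void)
{
  const int b[5] = { -1, 2, 0, -7, 3 };
  const int pred[5] = { 1, 0, 1, 1, 0 };
  int x[5] = { 10, 20, 30, 40, 50 };
  int y[5] = { 10, 20, 30, 40, 50 };

  neg_then_select (x, b, pred, 5);
  masked_neg (y, b, pred, 5);
  for (size_t i = 0; i < 5; i++)
    if (x[i] != y[i])
      return 0;
  assert (x[0] == 1 && x[1] == 20);  /* active lane negated, inactive kept */
  return 1;
}
```

Both forms compute the same values, which is why recognizing the pair late (in combine) loses nothing semantically; the open question in this thread is only where in the pipeline the pairing should happen.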

So I prefer this patch's solution, and we can apply the same solution to future support for sign extend, zero extend, float extend, abs, sqrt, ceil, floor, etc.

Thanks. 


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
  2023-08-22  8:08     ` juzhe.zhong
@ 2023-08-22 14:05       ` Jeff Law
  2023-08-22 14:28         ` 钟居哲
  2023-08-24  9:57         ` Richard Sandiford
  0 siblings, 2 replies; 9+ messages in thread
From: Jeff Law @ 2023-08-22 14:05 UTC (permalink / raw)
  To: juzhe.zhong, Robin Dapp, pinskia
  Cc: 丁乐华,
	gcc-patches, kito.cheng, palmer, richard.sandiford,
	Richard Biener



On 8/22/23 02:08, juzhe.zhong@rivai.ai wrote:
> Yes, I agree that long-term we want everything to be optimized as
> early as possible.
> 
> However, IMHO, it's impossible to support every conditional pattern
> in the middle-end (match.pd).
> There are simply too many of them.
> 
> For example, for the sign_extend conversion alone we have vsext.vf2
> (vector SI -> vector DI), vsext.vf4 (vector HI -> vector DI),
> vsext.vf8 (vector QI -> vector DI), ...
> And it's not only the conversions: every auto-vectorization pattern
> can have a conditional form,
> for example abs, rotate, sqrt, floor, ceil, etc.
> I bet it could be over 100 conditional optabs/internal FNs. That's a
> huge number.
> I don't see the necessity of supporting them in the middle-end
> (match.pd), since we know the RTL back-end combine pass can do a good
> job here.
> 
> Besides, LLVM doesn't have that many conditional patterns. LLVM just
> keeps "add" and "select" as separate IR and then does the combine in
> the back-end:
> https://godbolt.org/z/rYcMMG1eT
> 
> You can see that LLVM doesn't do the op + select optimization in
> generic IR; it does the optimization in the combine pass.
> 
> So I prefer this patch's solution, and we can apply the same solution
> to future support for sign extend, zero extend, float extend, abs,
> sqrt, ceil, floor, etc.
It's certainly got the potential to get out of hand.  And it's not just 
the vectorizer operations.  I know of an architecture that can execute 
most of its ALU and loads/stores conditionally (not predication, but 
actual conditional ops) like target = (x COND y) ? a << b : a.

I'd tend to lean towards synthesizing these conditional ops around a 
conditional move/select primitive in gimple through the RTL expanders. 
That would in turn set things up so that if the target had various 
conditional operations like conditional shift it could be trivially 
discovered by the combiner.
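
As a rough illustration of that shape, the sketch below writes a conditional shift both as branchy source and as an unconditional op feeding a select. Everything in it is an assumption made for the example: the function names are hypothetical, `<` stands in for COND, and the shift count is masked to keep the C well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy source form: the shift only happens on one path.  */
uint32_t
cond_shift_branch (uint32_t x, uint32_t y, uint32_t a, uint32_t b)
{
  if (x < y)
    return a << (b & 31);
  return a;
}

/* Synthesized form: an unconditional shift followed by a select.
   A target with a real conditional shift could match this pair as one
   instruction; other targets just emit the shift and a conditional move.  */
uint32_t
cond_shift_select (uint32_t x, uint32_t y, uint32_t a, uint32_t b)
{
  uint32_t shifted = a << (b & 31);  /* always computed */
  return (x < y) ? shifted : a;      /* select primitive */
}
```

The second form has no control flow, so the shift and the select sit next to each other as plain data dependencies, which is the situation a combiner is good at matching.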

We still get most of the benefit of eliminating control flow early, a
sensible gimple representation, relatively easy translation into RTL, and
easy combination for targets with actual conditional operations.

It turns out that model is something we may want to work towards anyway.
We were looking at this exact problem in the context of zicond for
riscv.  The biggest problem we've seen so far is that the generic
conditional move expansion generates fairly poor code when the target
doesn't actually have a conditional move primitive.

jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
  2023-08-22 14:05       ` Jeff Law
@ 2023-08-22 14:28         ` 钟居哲
  2023-08-24  9:57         ` Richard Sandiford
  1 sibling, 0 replies; 9+ messages in thread
From: 钟居哲 @ 2023-08-22 14:28 UTC (permalink / raw)
  To: Jeff Law, rdapp.gcc, Andrew Pinski
  Cc: 丁乐华,
	gcc-patches, kito.cheng, palmer, richard.sandiford,
	richard.guenther

>> It's certainly got the potential to get out of hand.  And it's not just
>> the vectorizer operations.  I know of an architecture that can execute
>> most of its ALU and loads/stores conditionally (not predication, but
>> actual conditional ops) like target = (x COND y) ? a << b : a.

Do you mean we need to add cond_abs, cond_sqrt, cond_sign_extend, cond_zero_extend, cond_float_extend,
etc., over 100 optabs/fns for vectorization optimization, and support them in gimple IR (middle-end match.pd)?

Or is it OK for now to support those conditional operations in the RISC-V backend via the combine pass?

I personally prefer the latter, and I have assigned Lehua to work on it.

Thanks.


juzhe.zhong@rivai.ai
 


* Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
  2023-08-22  5:41 [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns Lehua Ding
  2023-08-22  5:59 ` Andrew Pinski
       [not found] ` <3C6D3EAB062992F9+F168382A-9849-46CE-8EEC-5B5419AFBDEF@rivai.ai>
@ 2023-08-22 21:26 ` Robin Dapp
  2023-08-23  3:33   ` Lehua Ding
  2 siblings, 1 reply; 9+ messages in thread
From: Robin Dapp @ 2023-08-22 21:26 UTC (permalink / raw)
  To: Lehua Ding, gcc-patches
  Cc: rdapp.gcc, juzhe.zhong, kito.cheng, palmer, jeffreyalaw

Hi Lehua,

no concerns here, just tiny remarks but in general LGTM as is.

> +(define_insn_and_split "*copysign<mode>_neg"
> +  [(set (match_operand:VF 0 "register_operand")
> +        (neg:VF
> +          (unspec:VF [
> +            (match_operand:VF 1 "register_operand")
> +            (match_operand:VF 2 "register_operand")
> +          ] UNSPEC_VCOPYSIGN)))]
> +  "TARGET_VECTOR && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
> +{
> +  riscv_vector::emit_vlmax_insn (code_for_pred_ncopysign (<MODE>mode),
> +                                 riscv_vector::RVV_BINOP, operands);
> +  DONE;
> +})

It's a bit unfortunate that we need this now but well, no way around it.

> -  emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op0, d->op1, mask));
> +  /* swap op0 and op1 since the order is opposite to pred_merge.  */
> +  rtx ops2[] = {d->target, d->op1, d->op0, mask};
> +  emit_vlmax_merge_insn (code_for_pred_merge (vmode), riscv_vector::RVV_MERGE_OP, ops2);
>    return true;
>  }

This seems a separate, general fix that just surfaced in the course of
this patch?  Would be nice to have this factored out but as we already have
it, no need I guess.

> +  if (is_dummy_mask)
> +    {
> +      /* Use TU, MASK ANY policy.  */
> +      if (needs_fp_rounding (code, mode))
> +	emit_nonvlmax_fp_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
> +      else
> +	emit_nonvlmax_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
> +    }

We have quite a bit of code duplication across the expand_cond_len functions
now (binop, ternop, unop).  Not particular to your patch but I'd suggest
unifying this later.

> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */

Why does this fail with LMUL == 2 (also in the following tests)?  A comment
would be nice here.

Regards
 Robin



* Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
  2023-08-22 21:26 ` Robin Dapp
@ 2023-08-23  3:33   ` Lehua Ding
  0 siblings, 0 replies; 9+ messages in thread
From: Lehua Ding @ 2023-08-23  3:33 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches; +Cc: juzhe.zhong, kito.cheng, palmer, jeffreyalaw

Hi Robin,

Thanks for these nice comments!

>> -  emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op0, d->op1, mask));
>> +  /* swap op0 and op1 since the order is opposite to pred_merge.  */
>> +  rtx ops2[] = {d->target, d->op1, d->op0, mask};
>> +  emit_vlmax_merge_insn (code_for_pred_merge (vmode), riscv_vector::RVV_MERGE_OP, ops2);
>>     return true;
>>   }
> 
> This seems a separate, general fix that just surfaced in the course of
> this patch?  Would be nice to have this factored out but as we already have
> it, no need I guess.

Yes, this came up because I changed @vcond_mask_<mode><vm> from 
define_expand to define_insn_and_split. If I kept the gen_vcond_mask call, 
I would need to manually make sure that d->target, d->op1 and d->op0 
satisfy the predicates of @vcond_mask (the vregs pass checks them, so mem 
operands must be forbidden). emit_vlmax_merge_insn uses expand_insn 
internally, which automatically converts the operands so that they satisfy 
the predicates. This is one difference between gen_xxx and expand_insn. 
I also think calling emit_vlmax_merge_insn to generate pred_merge is the 
most appropriate and uniform way.

>> +  if (is_dummy_mask)
>> +    {
>> +      /* Use TU, MASK ANY policy.  */
>> +      if (needs_fp_rounding (code, mode))
>> +	emit_nonvlmax_fp_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
>> +      else
>> +	emit_nonvlmax_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
>> +    }
> 
> We have quite a bit of code duplication across the expand_cond_len functions
> now (binop, ternop, unop).  Not particular to your patch but I'd suggest to
> unify this later.

Indeed, leave it to me and I'll send another patch later to reduce this 
duplication.

> 
>> +TEST_ALL (DEF_LOOP)
>> +
>> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
>> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> 
> Why does this fail with LMUL == 2 (also in the following tests)?  A comment
> would be nice here.

This is because the iteration count of 5 in the testcase causes GCC to 
remove the loop and turn it into two basic blocks, which doubles the 
number of vnegs. I'm going to increase the iteration count (it should be 
big enough that this doesn't happen even when LMUL=m8) so that the 
optimization is no longer triggered.
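
A rough sketch of the kind of loop involved, assuming the same shape as the fabsf example in the patch description. The real testcase is generated by the DEF_LOOP macro, which is not shown in this thread, so the function names, the element type and the value of N below are illustrative assumptions only.

```c
#include <stdint.h>

/* Assumed trip count: chosen large so that the loop is kept as a loop
   instead of being flattened; the value used in the V2 patch may differ.  */
#define N 99

/* Conditional negate, same shape as the fabsf example in the patch
   description: lanes with a false predicate keep their old value.  */
void
cond_neg_i32 (int32_t *restrict a, const int32_t *restrict b,
              const int32_t *restrict pred)
{
  for (int i = 0; i < N; i++)
    a[i] = pred[i] ? -b[i] : a[i];
}

/* Hypothetical self-check: returns the number of lanes that differ
   from the expected result.  */
int
cond_neg_mismatches (void)
{
  int32_t a[N], b[N], pred[N];
  int bad = 0;

  for (int i = 0; i < N; i++)
    {
      a[i] = 1000 + i;
      b[i] = i;
      pred[i] = i & 1;
    }
  cond_neg_i32 (a, b, pred);
  for (int i = 0; i < N; i++)
    bad += a[i] != ((i & 1) ? -i : 1000 + i);
  return bad;
}
```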

V2 patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628210.html

-- 
Best,
Lehua



* Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns
  2023-08-22 14:05       ` Jeff Law
  2023-08-22 14:28         ` 钟居哲
@ 2023-08-24  9:57         ` Richard Sandiford
  1 sibling, 0 replies; 9+ messages in thread
From: Richard Sandiford @ 2023-08-24  9:57 UTC (permalink / raw)
  To: Jeff Law
  Cc: juzhe.zhong, Robin Dapp, pinskia, 丁乐华,
	gcc-patches, kito.cheng, palmer, Richard Biener

Jeff Law <jeffreyalaw@gmail.com> writes:
> On 8/22/23 02:08, juzhe.zhong@rivai.ai wrote:
>> Yes, I agree long-term we want every-thing be optimized as early as 
>> possible.
>> 
>> However, IMHO, it's impossible we can support every conditional patterns 
>> in the middle-end (match.pd).
>> It's a really big number.
>> 
>> For example, for sign_extend conversion, we have vsext.vf2 (vector SI -> 
>> vector DI),... vsext.vf4 (vector HI -> vector DI), vsext.vf8 (vector QI 
>> -> vector DI)..
>> Not only the conversion, every auto-vectorization patterns can have 
>> conditional format.
>> For example, abs,..rotate, sqrt, floor, ceil,....etc.
>> I bet it could be over 100+ conditional optabs/internal FNs. It's huge 
>> number.
>> I don't see necessity that we should support them in middle-end 
>> (match.pd) since we known RTL back-end combine PASS can do the good job 
>> here.
>> 
>> Besides, LLVM doesn't have so many conditional patterns. LLVM just has 
>> "add" and "select" as separate IR and then does the combine in the back-end:
>> https://godbolt.org/z/rYcMMG1eT
>> 
>> You can see LLVM didn't do the op + select optimization in generic IR, 
>> they do the optimization in combine PASS.
>> 
>> So I prefer this patch solution and apply such solution for the future 
>> more support : sign extend, zero extend, float extend, abs, sqrt, ceil, 
>> floor, ....etc.
> It's certainly got the potential to get out of hand.  And it's not just 
> the vectorizer operations.  I know of an architecture that can execute 
> most of its ALU operations and loads/stores conditionally (not predication, 
> but actual conditional ops), e.g. target = (x COND y) ? a << b : a.
>
> I'd tend to lean towards synthesizing these conditional ops around a 
> conditional move/select primitive in gimple through the RTL expanders. 
> That would in turn set things up so that if the target had various 
> conditional operations like conditional shift it could be trivially 
> discovered by the combiner.

FWIW, one of the original motivations behind the COND_* internal
functions was to represent the fact that the operation is suppressed
(rather than being performed and discarded) when the predicate is false.
This allows if-conversion for FP operations even in strict FP modes,
since inactive lanes are guaranteed not to generate an exception.

I think it makes sense to add COND_* functions for anything that can
reasonably be done on FP types, and that could generate an FP exception.
E.g. sqrt was one of the examples mentioned, and I think COND_SQRT is
something that we should have.

I agree it's less clear-cut for purely integer stuff, or for FP operations
like neg and abs that are pure bit manipulation.  But perhaps there's a
question of how many operations are only defined for integers, and
whether the number is high enough for them to be treated differently.

I wouldn't have expected an explosion of operations to be a significant
issue, since (a) the underlying infrastructure is pretty mechanical and
(b) any operation that a target supports is going to need an .md pattern
whatever happens.

Thanks,
Richard


