* [gcc r13-6277] RISC-V: Add RVV reduction C/C++ intrinsics support
@ 2023-02-22 13:44 Kito Cheng
From: Kito Cheng @ 2023-02-22 13:44 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:c878c6586dee353e685364910e02ad1a611d4634

commit r13-6277-gc878c6586dee353e685364910e02ad1a611d4634
Author: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Date:   Mon Feb 20 14:54:45 2023 +0800

    RISC-V: Add RVV reduction C/C++ intrinsics support
    
    gcc/ChangeLog:
    
            * config/riscv/riscv-vector-builtins-bases.cc (class reducop): New class.
            (class widen_reducop): Ditto.
            (class freducop): Ditto.
            (class widen_freducop): Ditto.
            (BASE): Ditto.
            * config/riscv/riscv-vector-builtins-bases.h: Ditto.
            * config/riscv/riscv-vector-builtins-functions.def (vredsum): Add reduction support.
            (vredmaxu): Ditto.
            (vredmax): Ditto.
            (vredminu): Ditto.
            (vredmin): Ditto.
            (vredand): Ditto.
            (vredor): Ditto.
            (vredxor): Ditto.
            (vwredsum): Ditto.
            (vwredsumu): Ditto.
            (vfredusum): Ditto.
            (vfredosum): Ditto.
            (vfredmax): Ditto.
            (vfredmin): Ditto.
            (vfwredosum): Ditto.
            (vfwredusum): Ditto.
            * config/riscv/riscv-vector-builtins-shapes.cc (struct reduc_alu_def): Ditto.
            (SHAPE): Ditto.
            * config/riscv/riscv-vector-builtins-shapes.h: Ditto.
            * config/riscv/riscv-vector-builtins-types.def (DEF_RVV_WI_OPS): New macro.
            (DEF_RVV_WU_OPS): Ditto.
            (DEF_RVV_WF_OPS): Ditto.
            (vint8mf8_t): Ditto.
            (vint8mf4_t): Ditto.
            (vint8mf2_t): Ditto.
            (vint8m1_t): Ditto.
            (vint8m2_t): Ditto.
            (vint8m4_t): Ditto.
            (vint8m8_t): Ditto.
            (vint16mf4_t): Ditto.
            (vint16mf2_t): Ditto.
            (vint16m1_t): Ditto.
            (vint16m2_t): Ditto.
            (vint16m4_t): Ditto.
            (vint16m8_t): Ditto.
            (vint32mf2_t): Ditto.
            (vint32m1_t): Ditto.
            (vint32m2_t): Ditto.
            (vint32m4_t): Ditto.
            (vint32m8_t): Ditto.
            (vuint8mf8_t): Ditto.
            (vuint8mf4_t): Ditto.
            (vuint8mf2_t): Ditto.
            (vuint8m1_t): Ditto.
            (vuint8m2_t): Ditto.
            (vuint8m4_t): Ditto.
            (vuint8m8_t): Ditto.
            (vuint16mf4_t): Ditto.
            (vuint16mf2_t): Ditto.
            (vuint16m1_t): Ditto.
            (vuint16m2_t): Ditto.
            (vuint16m4_t): Ditto.
            (vuint16m8_t): Ditto.
            (vuint32mf2_t): Ditto.
            (vuint32m1_t): Ditto.
            (vuint32m2_t): Ditto.
            (vuint32m4_t): Ditto.
            (vuint32m8_t): Ditto.
            (vfloat32mf2_t): Ditto.
            (vfloat32m1_t): Ditto.
            (vfloat32m2_t): Ditto.
            (vfloat32m4_t): Ditto.
            (vfloat32m8_t): Ditto.
            * config/riscv/riscv-vector-builtins.cc (DEF_RVV_WI_OPS): Ditto.
            (DEF_RVV_WU_OPS): Ditto.
            (DEF_RVV_WF_OPS): Ditto.
            (required_extensions_p): Add reduction support.
            (rvv_arg_type_info::get_base_vector_type): Ditto.
            (rvv_arg_type_info::get_tree_type): Ditto.
            * config/riscv/riscv-vector-builtins.h (enum rvv_base_type): Ditto.
            * config/riscv/riscv.md: Ditto.
            * config/riscv/vector-iterators.md (minu): Ditto.
            * config/riscv/vector.md (@pred_reduc_<reduc><mode><vlmul1>): New pattern.
            (@pred_reduc_<reduc><mode><vlmul1_zve32>): Ditto.
            (@pred_widen_reduc_plus<v_su><mode><vwlmul1>): Ditto.
            (@pred_widen_reduc_plus<v_su><mode><vwlmul1_zve32>): Ditto.
            (@pred_reduc_plus<order><mode><vlmul1>): Ditto.
            (@pred_reduc_plus<order><mode><vlmul1_zve32>): Ditto.
            (@pred_widen_reduc_plus<order><mode><vwlmul1>): Ditto.
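
    As an illustration only (not generated by this commit): once these
    intrinsics are registered, a sum reduction can be written as below.
    The spellings follow the rvv-intrinsic-doc vop_<op>_<type> scheme
    implemented by the reduc_alu shape; the exact names, including any
    __riscv_ prefix, and the vsetvl/vle/vmv helpers are assumptions rather
    than part of this patch.

        #include <stdint.h>
        #include <stddef.h>
        #include <riscv_vector.h>

        int32_t
        sum_i32 (const int32_t *a, size_t n)
        {
          /* Seed element 0 of the LMUL = 1 accumulator with 0.  */
          vint32m1_t acc = __riscv_vmv_v_x_i32m1 (0, 1);
          for (size_t vl; n > 0; n -= vl, a += vl)
            {
              vl = __riscv_vsetvl_e32m8 (n);
              vint32m8_t v = __riscv_vle32_v_i32m8 (a, vl);
              /* acc[0] += v[0] + ... + v[vl-1] (vredsum.vs).  */
              acc = __riscv_vredsum_vs_i32m8_i32m1 (v, acc, vl);
            }
          return __riscv_vmv_x_s_i32m1_i32 (acc);
        }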

Diff:
---
 gcc/config/riscv/riscv-vector-builtins-bases.cc    |  90 +++++++++
 gcc/config/riscv/riscv-vector-builtins-bases.h     |  16 ++
 .../riscv/riscv-vector-builtins-functions.def      |  26 ++-
 gcc/config/riscv/riscv-vector-builtins-shapes.cc   |  29 +++
 gcc/config/riscv/riscv-vector-builtins-shapes.h    |   1 +
 gcc/config/riscv/riscv-vector-builtins-types.def   |  65 ++++++
 gcc/config/riscv/riscv-vector-builtins.cc          |  92 ++++++++-
 gcc/config/riscv/riscv-vector-builtins.h           |   4 +-
 gcc/config/riscv/riscv.md                          |   6 +-
 gcc/config/riscv/vector-iterators.md               | 130 +++++++++++-
 gcc/config/riscv/vector.md                         | 223 ++++++++++++++++++++-
 11 files changed, 668 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index bfcfab55bb9..f6ed2e53453 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1283,6 +1283,64 @@ public:
   }
 };
 
+/* Implements reduction instructions.  */
+template<rtx_code CODE>
+class reducop : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+    return e.use_exact_insn (
+      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+  }
+};
+
+/* Implements widening reduction instructions.  */
+template<int UNSPEC>
+class widen_reducop : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+    return e.use_exact_insn (code_for_pred_widen_reduc_plus (UNSPEC,
+							     e.vector_mode (),
+							     e.vector_mode ()));
+  }
+};
+
+/* Implements floating-point reduction instructions.  */
+template<int UNSPEC>
+class freducop : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+    return e.use_exact_insn (
+      code_for_pred_reduc_plus (UNSPEC, e.vector_mode (), e.vector_mode ()));
+  }
+};
+
+/* Implements widening floating-point reduction instructions.  */
+template<int UNSPEC>
+class widen_freducop : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+    return e.use_exact_insn (code_for_pred_widen_reduc_plus (UNSPEC,
+							     e.vector_mode (),
+							     e.vector_mode ()));
+  }
+};
+
 static CONSTEXPR const vsetvl<false> vsetvl_obj;
 static CONSTEXPR const vsetvl<true> vsetvlmax_obj;
 static CONSTEXPR const loadstore<false, LST_UNIT_STRIDE, false> vle_obj;
@@ -1456,6 +1514,22 @@ static CONSTEXPR const vfncvt_rtz_x<FIX> vfncvt_rtz_x_obj;
 static CONSTEXPR const vfncvt_rtz_x<UNSIGNED_FIX> vfncvt_rtz_xu_obj;
 static CONSTEXPR const vfncvt_f vfncvt_f_obj;
 static CONSTEXPR const vfncvt_rod_f vfncvt_rod_f_obj;
+static CONSTEXPR const reducop<PLUS> vredsum_obj;
+static CONSTEXPR const reducop<UMAX> vredmaxu_obj;
+static CONSTEXPR const reducop<SMAX> vredmax_obj;
+static CONSTEXPR const reducop<UMIN> vredminu_obj;
+static CONSTEXPR const reducop<SMIN> vredmin_obj;
+static CONSTEXPR const reducop<AND> vredand_obj;
+static CONSTEXPR const reducop<IOR> vredor_obj;
+static CONSTEXPR const reducop<XOR> vredxor_obj;
+static CONSTEXPR const widen_reducop<UNSPEC_WREDUC_SUM> vwredsum_obj;
+static CONSTEXPR const widen_reducop<UNSPEC_WREDUC_USUM> vwredsumu_obj;
+static CONSTEXPR const freducop<UNSPEC_UNORDERED> vfredusum_obj;
+static CONSTEXPR const freducop<UNSPEC_ORDERED> vfredosum_obj;
+static CONSTEXPR const reducop<SMAX> vfredmax_obj;
+static CONSTEXPR const reducop<SMIN> vfredmin_obj;
+static CONSTEXPR const widen_freducop<UNSPEC_UNORDERED> vfwredusum_obj;
+static CONSTEXPR const widen_freducop<UNSPEC_ORDERED> vfwredosum_obj;
 
 /* Declare the function base NAME, pointing it to an instance
    of class <NAME>_obj.  */
@@ -1635,5 +1709,21 @@ BASE (vfncvt_rtz_x)
 BASE (vfncvt_rtz_xu)
 BASE (vfncvt_f)
 BASE (vfncvt_rod_f)
+BASE (vredsum)
+BASE (vredmaxu)
+BASE (vredmax)
+BASE (vredminu)
+BASE (vredmin)
+BASE (vredand)
+BASE (vredor)
+BASE (vredxor)
+BASE (vwredsum)
+BASE (vwredsumu)
+BASE (vfredusum)
+BASE (vfredosum)
+BASE (vfredmax)
+BASE (vfredmin)
+BASE (vfwredosum)
+BASE (vfwredusum)
 
 } // end namespace riscv_vector
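
For context, all four classes expand to ".vs" reduction instructions with the
same scalar semantics: element 0 of the LMUL = 1 result is the reduction of
the active source elements, seeded with element 0 of the LMUL = 1 operand.  A
rough model of vredsum.vs with plain arrays standing in for vector registers
(an illustrative sketch, not code from this patch):

  #include <stddef.h>
  #include <stdint.h>

  /* vd[0] = vs1[0] + vs2[0] + ... + vs2[vl-1]; the tail elements of vd are
     governed by the tail policy and are not modeled here.  */
  int64_t
  model_vredsum_vs (const int64_t *vs2, const int64_t *vs1, size_t vl)
  {
    int64_t acc = vs1[0];
    for (size_t i = 0; i < vl; ++i)
      acc += vs2[i];
    return acc;
  }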
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 5583dda3a08..9f0e4675f81 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -203,6 +203,22 @@ extern const function_base *const vfncvt_rtz_x;
 extern const function_base *const vfncvt_rtz_xu;
 extern const function_base *const vfncvt_f;
 extern const function_base *const vfncvt_rod_f;
+extern const function_base *const vredsum;
+extern const function_base *const vredmaxu;
+extern const function_base *const vredmax;
+extern const function_base *const vredminu;
+extern const function_base *const vredmin;
+extern const function_base *const vredand;
+extern const function_base *const vredor;
+extern const function_base *const vredxor;
+extern const function_base *const vwredsum;
+extern const function_base *const vwredsumu;
+extern const function_base *const vfredusum;
+extern const function_base *const vfredosum;
+extern const function_base *const vfredmax;
+extern const function_base *const vfredmin;
+extern const function_base *const vfwredosum;
+extern const function_base *const vfwredusum;
 }
 
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 1ca0537216b..230b76cd0f2 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -408,7 +408,31 @@ DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, u_to_nf_xu_w_ops)
 DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
 DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, f_to_nf_f_w_ops)
 
-/* TODO: 14. Vector Reduction Operations.  */
+/* 14. Vector Reduction Operations.  */
+
+// 14.1. Vector Single-Width Integer Reduction Instructions
+DEF_RVV_FUNCTION (vredsum, reduc_alu, no_mu_preds, iu_vs_ops)
+DEF_RVV_FUNCTION (vredmaxu, reduc_alu, no_mu_preds, iu_vs_ops)
+DEF_RVV_FUNCTION (vredmax, reduc_alu, no_mu_preds, iu_vs_ops)
+DEF_RVV_FUNCTION (vredminu, reduc_alu, no_mu_preds, iu_vs_ops)
+DEF_RVV_FUNCTION (vredmin, reduc_alu, no_mu_preds, iu_vs_ops)
+DEF_RVV_FUNCTION (vredand, reduc_alu, no_mu_preds, iu_vs_ops)
+DEF_RVV_FUNCTION (vredor, reduc_alu, no_mu_preds, iu_vs_ops)
+DEF_RVV_FUNCTION (vredxor, reduc_alu, no_mu_preds, iu_vs_ops)
+
+// 14.2. Vector Widening Integer Reduction Instructions
+DEF_RVV_FUNCTION (vwredsum, reduc_alu, no_mu_preds, wi_vs_ops)
+DEF_RVV_FUNCTION (vwredsumu, reduc_alu, no_mu_preds, wu_vs_ops)
+
+// 14.3. Vector Single-Width Floating-Point Reduction Instructions
+DEF_RVV_FUNCTION (vfredusum, reduc_alu, no_mu_preds, f_vs_ops)
+DEF_RVV_FUNCTION (vfredosum, reduc_alu, no_mu_preds, f_vs_ops)
+DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, f_vs_ops)
+DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops)
+
+// 14.4. Vector Widening Floating-Point Reduction Instructions
+DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops)
+DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops)
 
 /* 15. Vector Mask Instructions.  */
 
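
The widening rows (14.2/14.4) accumulate into a 2*SEW element, which is why
they use the wi/wu/wf type sets and a widen_lmul1 result type.  In scalar
terms, roughly (an illustrative model, not code from this patch):

  #include <stddef.h>
  #include <stdint.h>

  /* vwredsum.vs on e16 sources: vd[0] (e32)
     = vs1[0] + sign_extend (vs2[0]) + ... + sign_extend (vs2[vl-1]).  */
  int32_t
  model_vwredsum_vs_e16 (const int16_t *vs2, const int32_t *vs1, size_t vl)
  {
    int32_t acc = vs1[0];
    for (size_t i = 0; i < vl; ++i)
      acc += (int32_t) vs2[i];   /* widen each element before adding */
    return acc;
  }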
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 1fbf0f4e902..b3f5951087d 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -374,6 +374,34 @@ struct mask_alu_def : public build_base
   }
 };
 
+/* reduc_alu_def class.  */
+struct reduc_alu_def : public build_base
+{
+  char *get_name (function_builder &b, const function_instance &instance,
+		  bool overloaded_p) const override
+  {
+    b.append_base_name (instance.base_name);
+
+    /* vop_<op> --> vop<sew>_<op>_<type>.  */
+    if (!overloaded_p)
+      {
+	b.append_name (operand_suffixes[instance.op_info->op]);
+	b.append_name (type_suffixes[instance.type.index].vector);
+	vector_type_index ret_type_idx
+	  = instance.op_info->ret.get_base_vector_type (
+	    builtin_types[instance.type.index].vector);
+	b.append_name (type_suffixes[ret_type_idx].vector);
+      }
+
+    /* According to rvv-intrinsic-doc, it does not add "_m" suffix
+       for vop_m C++ overloaded API.  */
+    if (overloaded_p && instance.pred == PRED_TYPE_m)
+      return b.finish_name ();
+    b.append_name (predication_suffixes[instance.pred]);
+    return b.finish_name ();
+  }
+};
+
 SHAPE(vsetvl, vsetvl)
 SHAPE(vsetvl, vsetvlmax)
 SHAPE(loadstore, loadstore)
@@ -385,5 +413,6 @@ SHAPE(return_mask, return_mask)
 SHAPE(narrow_alu, narrow_alu)
 SHAPE(move, move)
 SHAPE(mask_alu, mask_alu)
+SHAPE(reduc_alu, reduc_alu)
 
 } // end namespace riscv_vector
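
With this shape, each reduction DEF_RVV_FUNCTION entry yields one name per
predication in no_mu_preds (none, _m, _tu, _tum).  A hedged example of the
masked spelling the name builder produces for vredmax on vint32m4_t
(base name + "_vs" + input type + LMUL1 result type + "_m"); the exact prefix
and the vmv helper are assumptions, not taken from this patch:

  #include <stddef.h>
  #include <stdint.h>
  #include <riscv_vector.h>

  /* Maximum over the elements of V selected by MASK, seeded with INIT[0].  */
  int32_t
  masked_max (vbool8_t mask, vint32m4_t v, vint32m1_t init, size_t vl)
  {
    vint32m1_t r = __riscv_vredmax_vs_i32m4_i32m1_m (mask, v, init, vl);
    return __riscv_vmv_x_s_i32m1_i32 (r);
  }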
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.h b/gcc/config/riscv/riscv-vector-builtins-shapes.h
index 406abefdb10..85769ea024a 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.h
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.h
@@ -35,6 +35,7 @@ extern const function_shape *const return_mask;
 extern const function_shape *const narrow_alu;
 extern const function_shape *const move;
 extern const function_shape *const mask_alu;
+extern const function_shape *const reduc_alu;
 }
 
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def
index bb3811d2d90..a15e54c1572 100644
--- a/gcc/config/riscv/riscv-vector-builtins-types.def
+++ b/gcc/config/riscv/riscv-vector-builtins-types.def
@@ -133,6 +133,24 @@ along with GCC; see the file COPYING3. If not see
 #define DEF_RVV_WCONVERT_F_OPS(TYPE, REQUIRE)
 #endif
 
+/* Use "DEF_RVV_WI_OPS" macro to include all signed integer types that can be
+   widened, which will be iterated and registered as intrinsic functions.  */
+#ifndef DEF_RVV_WI_OPS
+#define DEF_RVV_WI_OPS(TYPE, REQUIRE)
+#endif
+
+/* Use "DEF_RVV_WU_OPS" macro to include all unsigned integer types that can be
+   widened, which will be iterated and registered as intrinsic functions.  */
+#ifndef DEF_RVV_WU_OPS
+#define DEF_RVV_WU_OPS(TYPE, REQUIRE)
+#endif
+
+/* Use "DEF_RVV_WF_OPS" macro to include all floating-point types that can be
+   widened, which will be iterated and registered as intrinsic functions.  */
+#ifndef DEF_RVV_WF_OPS
+#define DEF_RVV_WF_OPS(TYPE, REQUIRE)
+#endif
+
 DEF_RVV_I_OPS (vint8mf8_t, RVV_REQUIRE_ZVE64)
 DEF_RVV_I_OPS (vint8mf4_t, 0)
 DEF_RVV_I_OPS (vint8mf2_t, 0)
@@ -345,6 +363,50 @@ DEF_RVV_WCONVERT_F_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
 DEF_RVV_WCONVERT_F_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
 DEF_RVV_WCONVERT_F_OPS (vfloat64m8_t, RVV_REQUIRE_ELEN_FP_64)
 
+DEF_RVV_WI_OPS (vint8mf8_t, RVV_REQUIRE_ZVE64)
+DEF_RVV_WI_OPS (vint8mf4_t, 0)
+DEF_RVV_WI_OPS (vint8mf2_t, 0)
+DEF_RVV_WI_OPS (vint8m1_t, 0)
+DEF_RVV_WI_OPS (vint8m2_t, 0)
+DEF_RVV_WI_OPS (vint8m4_t, 0)
+DEF_RVV_WI_OPS (vint8m8_t, 0)
+DEF_RVV_WI_OPS (vint16mf4_t, RVV_REQUIRE_ZVE64)
+DEF_RVV_WI_OPS (vint16mf2_t, 0)
+DEF_RVV_WI_OPS (vint16m1_t, 0)
+DEF_RVV_WI_OPS (vint16m2_t, 0)
+DEF_RVV_WI_OPS (vint16m4_t, 0)
+DEF_RVV_WI_OPS (vint16m8_t, 0)
+DEF_RVV_WI_OPS (vint32mf2_t, RVV_REQUIRE_ZVE64)
+DEF_RVV_WI_OPS (vint32m1_t, 0)
+DEF_RVV_WI_OPS (vint32m2_t, 0)
+DEF_RVV_WI_OPS (vint32m4_t, 0)
+DEF_RVV_WI_OPS (vint32m8_t, 0)
+
+DEF_RVV_WU_OPS (vuint8mf8_t, RVV_REQUIRE_ZVE64)
+DEF_RVV_WU_OPS (vuint8mf4_t, 0)
+DEF_RVV_WU_OPS (vuint8mf2_t, 0)
+DEF_RVV_WU_OPS (vuint8m1_t, 0)
+DEF_RVV_WU_OPS (vuint8m2_t, 0)
+DEF_RVV_WU_OPS (vuint8m4_t, 0)
+DEF_RVV_WU_OPS (vuint8m8_t, 0)
+DEF_RVV_WU_OPS (vuint16mf4_t, RVV_REQUIRE_ZVE64)
+DEF_RVV_WU_OPS (vuint16mf2_t, 0)
+DEF_RVV_WU_OPS (vuint16m1_t, 0)
+DEF_RVV_WU_OPS (vuint16m2_t, 0)
+DEF_RVV_WU_OPS (vuint16m4_t, 0)
+DEF_RVV_WU_OPS (vuint16m8_t, 0)
+DEF_RVV_WU_OPS (vuint32mf2_t, RVV_REQUIRE_ZVE64)
+DEF_RVV_WU_OPS (vuint32m1_t, 0)
+DEF_RVV_WU_OPS (vuint32m2_t, 0)
+DEF_RVV_WU_OPS (vuint32m4_t, 0)
+DEF_RVV_WU_OPS (vuint32m8_t, 0)
+
+DEF_RVV_WF_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ZVE64)
+DEF_RVV_WF_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WF_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WF_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WF_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
+
 #undef DEF_RVV_I_OPS
 #undef DEF_RVV_U_OPS
 #undef DEF_RVV_F_OPS
@@ -363,3 +425,6 @@ DEF_RVV_WCONVERT_F_OPS (vfloat64m8_t, RVV_REQUIRE_ELEN_FP_64)
 #undef DEF_RVV_WCONVERT_I_OPS
 #undef DEF_RVV_WCONVERT_U_OPS
 #undef DEF_RVV_WCONVERT_F_OPS
+#undef DEF_RVV_WI_OPS
+#undef DEF_RVV_WU_OPS
+#undef DEF_RVV_WF_OPS
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 7858a6d0e86..2e92ece3b64 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -133,6 +133,27 @@ static const rvv_type_info i_ops[] = {
 #include "riscv-vector-builtins-types.def"
   {NUM_VECTOR_TYPES, 0}};
 
+/* A list of all signed integer types that can be widened, to be registered for
+ * intrinsic functions.  */
+static const rvv_type_info wi_ops[] = {
+#define DEF_RVV_WI_OPS(TYPE, REQUIRE) {VECTOR_TYPE_##TYPE, REQUIRE},
+#include "riscv-vector-builtins-types.def"
+  {NUM_VECTOR_TYPES, 0}};
+
+/* A list of all unsigned integer types that can be widened, to be registered
+ * for intrinsic functions.  */
+static const rvv_type_info wu_ops[] = {
+#define DEF_RVV_WU_OPS(TYPE, REQUIRE) {VECTOR_TYPE_##TYPE, REQUIRE},
+#include "riscv-vector-builtins-types.def"
+  {NUM_VECTOR_TYPES, 0}};
+
+/* A list of all floating-point types that can be widened, to be registered for
+ * intrinsic functions.  */
+static const rvv_type_info wf_ops[] = {
+#define DEF_RVV_WF_OPS(TYPE, REQUIRE) {VECTOR_TYPE_##TYPE, REQUIRE},
+#include "riscv-vector-builtins-types.def"
+  {NUM_VECTOR_TYPES, 0}};
+
 /* A list of all signed integer that SEW = 64 require full 'V' extension will be
    registered for intrinsic functions.  */
 static const rvv_type_info full_v_i_ops[] = {
@@ -418,6 +439,17 @@ static CONSTEXPR const rvv_arg_type_info shift_wv_args[]
 static CONSTEXPR const rvv_arg_type_info v_args[]
   = {rvv_arg_type_info (RVV_BASE_vector), rvv_arg_type_info_end};
 
+/* A list of args for vector_type func (vector_type, lmul1_type) function.  */
+static CONSTEXPR const rvv_arg_type_info vs_args[]
+  = {rvv_arg_type_info (RVV_BASE_vector),
+     rvv_arg_type_info (RVV_BASE_lmul1_vector), rvv_arg_type_info_end};
+
+/* A list of args for vector_type func (vector_type, widen_lmul1_type) function.
+ */
+static CONSTEXPR const rvv_arg_type_info wvs_args[]
+  = {rvv_arg_type_info (RVV_BASE_vector),
+     rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), rvv_arg_type_info_end};
+
 /* A list of args for vector_type func (vector_type) function.  */
 static CONSTEXPR const rvv_arg_type_info f_v_args[]
   = {rvv_arg_type_info (RVV_BASE_float_vector), rvv_arg_type_info_end};
@@ -562,6 +594,10 @@ static CONSTEXPR const predication_type_index full_preds[]
   = {PRED_TYPE_none, PRED_TYPE_m,  PRED_TYPE_tu,  PRED_TYPE_tum,
      PRED_TYPE_tumu, PRED_TYPE_mu, NUM_PRED_TYPES};
 
+/* vop/vop_m/vop_tu/vop_tum will be registered.  */
+static CONSTEXPR const predication_type_index no_mu_preds[]
+  = {PRED_TYPE_none, PRED_TYPE_m, PRED_TYPE_tu, PRED_TYPE_tum, NUM_PRED_TYPES};
+
 /* vop/vop_tu will be registered.  */
 static CONSTEXPR const predication_type_index none_tu_preds[]
   = {PRED_TYPE_none, PRED_TYPE_tu, NUM_PRED_TYPES};
@@ -1070,6 +1106,46 @@ static CONSTEXPR const rvv_op_info iu_v_ops
      rvv_arg_type_info (RVV_BASE_vector), /* Return type */
      v_args /* Args */};
 
+/* A static operand information for lmul1_type func (vector_type, lmul1_type)
+ * function registration. */
+static CONSTEXPR const rvv_op_info iu_vs_ops
+  = {iu_ops,					/* Types */
+     OP_TYPE_vs,				/* Suffix */
+     rvv_arg_type_info (RVV_BASE_lmul1_vector), /* Return type */
+     vs_args /* Args */};
+
+/* A static operand information for lmul1_type func (vector_type, lmul1_type)
+ * function registration. */
+static CONSTEXPR const rvv_op_info f_vs_ops
+  = {f_ops,					/* Types */
+     OP_TYPE_vs,				/* Suffix */
+     rvv_arg_type_info (RVV_BASE_lmul1_vector), /* Return type */
+     vs_args /* Args */};
+
+/* A static operand information for widen_lmul1_type func (vector_type,
+ * widen_lmul1_type) function registration. */
+static CONSTEXPR const rvv_op_info wi_vs_ops
+  = {wi_ops,					      /* Types */
+     OP_TYPE_vs,				      /* Suffix */
+     rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), /* Return type */
+     wvs_args /* Args */};
+
+/* A static operand information for widen_lmul1_type func (vector_type,
+ * widen_lmul1_type) function registration. */
+static CONSTEXPR const rvv_op_info wu_vs_ops
+  = {wu_ops,					      /* Types */
+     OP_TYPE_vs,				      /* Suffix */
+     rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), /* Return type */
+     wvs_args /* Args */};
+
+/* A static operand information for widen_lmul1_type func (vector_type,
+ * widen_lmul1_type) function registration. */
+static CONSTEXPR const rvv_op_info wf_vs_ops
+  = {wf_ops,					      /* Types */
+     OP_TYPE_vs,				      /* Suffix */
+     rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), /* Return type */
+     wvs_args /* Args */};
+
 /* A static operand information for vector_type func (vector_type)
  * function registration. */
 static CONSTEXPR const rvv_op_info f_v_ops
@@ -1707,7 +1783,8 @@ required_extensions_p (enum rvv_base_type type)
 	 || type == RVV_BASE_uint32_index || type == RVV_BASE_uint64_index
 	 || type == RVV_BASE_float_vector
 	 || type == RVV_BASE_double_trunc_float_vector
-	 || type == RVV_BASE_double_trunc_vector;
+	 || type == RVV_BASE_double_trunc_vector
+	 || type == RVV_BASE_widen_lmul1_vector;
 }
 
 /* Check whether all the RVV_REQUIRE_* values in REQUIRED_EXTENSIONS are
@@ -1822,6 +1899,7 @@ rvv_arg_type_info::get_base_vector_type (tree type) const
   poly_int64 nunits = GET_MODE_NUNITS (TYPE_MODE (type));
   machine_mode inner_mode = GET_MODE_INNER (TYPE_MODE (type));
   poly_int64 bitsize = GET_MODE_BITSIZE (inner_mode);
+  poly_int64 bytesize = GET_MODE_SIZE (inner_mode);
 
   bool unsigned_p = TYPE_UNSIGNED (type);
   if (unsigned_base_type_p (base_type))
@@ -1875,6 +1953,16 @@ rvv_arg_type_info::get_base_vector_type (tree type) const
     case RVV_BASE_unsigned_vector:
       inner_mode = int_mode_for_mode (inner_mode).require ();
       break;
+    case RVV_BASE_lmul1_vector:
+      nunits = exact_div (BYTES_PER_RISCV_VECTOR, bytesize);
+      break;
+    case RVV_BASE_widen_lmul1_vector:
+      inner_mode
+	= get_mode_for_bitsize (bitsize * 2, FLOAT_MODE_P (inner_mode));
+      if (BYTES_PER_RISCV_VECTOR.coeffs[0] < (bytesize * 2).coeffs[0])
+	return NUM_VECTOR_TYPES;
+      nunits = exact_div (BYTES_PER_RISCV_VECTOR, bytesize * 2);
+      break;
     default:
       return NUM_VECTOR_TYPES;
     }
@@ -1963,6 +2051,8 @@ rvv_arg_type_info::get_tree_type (vector_type_index type_idx) const
     case RVV_BASE_double_trunc_float_vector:
     case RVV_BASE_signed_vector:
     case RVV_BASE_unsigned_vector:
+    case RVV_BASE_lmul1_vector:
+    case RVV_BASE_widen_lmul1_vector:
       if (get_base_vector_type (builtin_types[type_idx].vector)
 	  != NUM_VECTOR_TYPES)
 	return builtin_types[get_base_vector_type (
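
The lmul1_vector/widen_lmul1_vector cases above derive the element count of
the LMUL = 1 mode from BYTES_PER_RISCV_VECTOR, the poly_int size of one
LMUL = 1 register.  With plain integers standing in for the minimum register
size, the arithmetic reduces to the sketch below (illustrative only; the real
code operates on poly_int64 values):

  #include <stddef.h>
  #include <stdio.h>

  /* Element count of the LMUL = 1 mode: bytes per vector register divided by
     the element size (the element size doubles for the widened case).  */
  static size_t
  lmul1_nunits (size_t min_vector_bytes, size_t elem_bytes)
  {
    return min_vector_bytes / elem_bytes;
  }

  int
  main (void)
  {
    /* e32: 1 element (VNx1SI) when the minimum VLEN is 32, 2 elements
       (VNx2SI) when it is 64 or more -- hence the separate *_ZVE32
       iterators and mode attributes.  */
    printf ("zve32 e32 LMUL1 elements: %zu\n", lmul1_nunits (4, 4));
    printf ("zve64 e32 LMUL1 elements: %zu\n", lmul1_nunits (8, 4));
    /* Widened e8 -> e16 reduction result: 4 elements (VNx4HI).  */
    printf ("zve64 e8 widened LMUL1 elements: %zu\n", lmul1_nunits (8, 2 * 1));
    return 0;
  }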
diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
index db6ab389e64..ede08c6a480 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -164,8 +164,10 @@ enum rvv_base_type
   RVV_BASE_double_trunc_signed_vector,
   RVV_BASE_double_trunc_unsigned_vector,
   RVV_BASE_double_trunc_unsigned_scalar,
-  RVV_BASE_float_vector,
   RVV_BASE_double_trunc_float_vector,
+  RVV_BASE_float_vector,
+  RVV_BASE_lmul1_vector,
+  RVV_BASE_widen_lmul1_vector,
   NUM_BASE_TYPES
 };
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index a5507fadc2d..05924e9bbf1 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -309,9 +309,9 @@
 ;; 14. Vector reduction operations
 ;; vired       vector single-width integer reduction instructions
 ;; viwred      vector widening integer reduction instructions
-;; vfred       vector single-width floating-point un-ordered reduction instruction
+;; vfredu      vector single-width floating-point un-ordered reduction instruction
 ;; vfredo      vector single-width floating-point ordered reduction instruction
-;; vfwred      vector widening floating-point un-ordered reduction instruction
+;; vfwredu      vector widening floating-point un-ordered reduction instruction
 ;; vfwredo     vector widening floating-point ordered reduction instruction
 ;; 15. Vector mask instructions
 ;; vmalu       vector mask-register logical instructions
@@ -344,7 +344,7 @@
    vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,
    vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,
    vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,
-   vired,viwred,vfred,vfredo,vfwred,vfwredo,
+   vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
    vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
    vislide,vislide1,vfslide1,vgather,vcompress,vmov"
   (cond [(eq_attr "got" "load") (const_string "load")
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 127e1b07fcf..cb817abcfde 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -66,6 +66,10 @@
   UNSPEC_VFCVT
   UNSPEC_UNSIGNED_VFCVT
   UNSPEC_ROD
+
+  UNSPEC_REDUC
+  UNSPEC_WREDUC_SUM
+  UNSPEC_WREDUC_USUM
 ])
 
 (define_mode_iterator V [
@@ -93,6 +97,23 @@
   (VNx4DI "TARGET_MIN_VLEN > 32") (VNx8DI "TARGET_MIN_VLEN > 32")
 ])
 
+(define_mode_iterator VI_ZVE32 [
+  VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI
+  VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI
+  VNx1SI VNx2SI VNx4SI VNx8SI
+])
+
+(define_mode_iterator VWI [
+  VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN > 32")
+  VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32")
+  VNx1SI VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32")
+])
+
+(define_mode_iterator VWI_ZVE32 [
+  VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI
+  VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI
+])
+
 (define_mode_iterator VF [
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
@@ -105,6 +126,17 @@
   (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
+(define_mode_iterator VF_ZVE32 [
+  (VNx1SF "TARGET_VECTOR_ELEN_FP_32")
+  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
+  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
+  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
+])
+
+(define_mode_iterator VWF [
+  VNx1SF VNx2SF VNx4SF VNx8SF (VNx16SF "TARGET_MIN_VLEN > 32")
+])
+
 (define_mode_iterator VFULLI [
   VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN > 32")
   VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32")
@@ -334,6 +366,96 @@
   (VNx1DF "VNx1SI") (VNx2DF "VNx2SI") (VNx4DF "VNx4SI") (VNx8DF "VNx8SI")
 ])
 
+(define_mode_attr VLMUL1 [
+  (VNx1QI "VNx8QI") (VNx2QI "VNx8QI") (VNx4QI "VNx8QI") 
+  (VNx8QI "VNx8QI") (VNx16QI "VNx8QI") (VNx32QI "VNx8QI") (VNx64QI "VNx8QI")
+  (VNx1HI "VNx4HI") (VNx2HI "VNx4HI") (VNx4HI "VNx4HI") 
+  (VNx8HI "VNx4HI") (VNx16HI "VNx4HI") (VNx32HI "VNx4HI")
+  (VNx1SI "VNx2SI") (VNx2SI "VNx2SI") (VNx4SI "VNx2SI") 
+  (VNx8SI "VNx2SI") (VNx16SI "VNx2SI")
+  (VNx1DI "VNx1DI") (VNx2DI "VNx1DI")
+  (VNx4DI "VNx1DI") (VNx8DI "VNx1DI")
+  (VNx1SF "VNx2SF") (VNx2SF "VNx2SF")
+  (VNx4SF "VNx2SF") (VNx8SF "VNx2SF") (VNx16SF "VNx2SF")
+  (VNx1DF "VNx1DF") (VNx2DF "VNx1DF")
+  (VNx4DF "VNx1DF") (VNx8DF "VNx1DF")
+])
+
+(define_mode_attr VLMUL1_ZVE32 [
+  (VNx1QI "VNx4QI") (VNx2QI "VNx4QI") (VNx4QI "VNx4QI") 
+  (VNx8QI "VNx4QI") (VNx16QI "VNx4QI") (VNx32QI "VNx4QI")
+  (VNx1HI "VNx2HI") (VNx2HI "VNx2HI") (VNx4HI "VNx2HI") 
+  (VNx8HI "VNx2HI") (VNx16HI "VNx2HI")
+  (VNx1SI "VNx1SI") (VNx2SI "VNx1SI") (VNx4SI "VNx1SI") 
+  (VNx8SI "VNx1SI")
+  (VNx1SF "VNx2SF") (VNx2SF "VNx2SF")
+  (VNx4SF "VNx2SF") (VNx8SF "VNx2SF")
+])
+
+(define_mode_attr VWLMUL1 [
+  (VNx1QI "VNx4HI") (VNx2QI "VNx4HI") (VNx4QI "VNx4HI") 
+  (VNx8QI "VNx4HI") (VNx16QI "VNx4HI") (VNx32QI "VNx4HI") (VNx64QI "VNx4HI")
+  (VNx1HI "VNx2SI") (VNx2HI "VNx2SI") (VNx4HI "VNx2SI") 
+  (VNx8HI "VNx2SI") (VNx16HI "VNx2SI") (VNx32HI "VNx2SI")
+  (VNx1SI "VNx1DI") (VNx2SI "VNx1DI") (VNx4SI "VNx1DI") 
+  (VNx8SI "VNx1DI") (VNx16SI "VNx1DI")
+  (VNx1SF "VNx1DF") (VNx2SF "VNx1DF")
+  (VNx4SF "VNx1DF") (VNx8SF "VNx1DF") (VNx16SF "VNx1DF")
+])
+
+(define_mode_attr VWLMUL1_ZVE32 [
+  (VNx1QI "VNx2HI") (VNx2QI "VNx2HI") (VNx4QI "VNx2HI") 
+  (VNx8QI "VNx2HI") (VNx16QI "VNx2HI") (VNx32QI "VNx2HI")
+  (VNx1HI "VNx1SI") (VNx2HI "VNx1SI") (VNx4HI "VNx1SI") 
+  (VNx8HI "VNx1SI") (VNx16HI "VNx1SI")
+])
+
+(define_mode_attr vlmul1 [
+  (VNx1QI "vnx8qi") (VNx2QI "vnx8qi") (VNx4QI "vnx8qi") 
+  (VNx8QI "vnx8qi") (VNx16QI "vnx8qi") (VNx32QI "vnx8qi") (VNx64QI "vnx8qi")
+  (VNx1HI "vnx4hi") (VNx2HI "vnx4hi") (VNx4HI "vnx4hi") 
+  (VNx8HI "vnx4hi") (VNx16HI "vnx4hi") (VNx32HI "vnx4hi")
+  (VNx1SI "vnx2si") (VNx2SI "vnx2si") (VNx4SI "vnx2si") 
+  (VNx8SI "vnx2si") (VNx16SI "vnx2si")
+  (VNx1DI "vnx1DI") (VNx2DI "vnx1DI")
+  (VNx4DI "vnx1DI") (VNx8DI "vnx1DI")
+  (VNx1SF "vnx2sf") (VNx2SF "vnx2sf")
+  (VNx4SF "vnx2sf") (VNx8SF "vnx2sf") (VNx16SF "vnx2sf")
+  (VNx1DF "vnx1df") (VNx2DF "vnx1df")
+  (VNx4DF "vnx1df") (VNx8DF "vnx1df")
+])
+
+(define_mode_attr vlmul1_zve32 [
+  (VNx1QI "vnx4qi") (VNx2QI "vnx4qi") (VNx4QI "vnx4qi") 
+  (VNx8QI "vnx4qi") (VNx16QI "vnx4qi") (VNx32QI "vnx4qi")
+  (VNx1HI "vnx2hi") (VNx2HI "vnx2hi") (VNx4HI "vnx2hi") 
+  (VNx8HI "vnx2hi") (VNx16HI "vnx2hi")
+  (VNx1SI "vnx1si") (VNx2SI "vnx1si") (VNx4SI "vnx1si") 
+  (VNx8SI "vnx1si")
+  (VNx1SF "vnx1sf") (VNx2SF "vnx1sf")
+  (VNx4SF "vnx1sf") (VNx8SF "vnx1sf")
+])
+
+(define_mode_attr vwlmul1 [
+  (VNx1QI "vnx4hi") (VNx2QI "vnx4hi") (VNx4QI "vnx4hi") 
+  (VNx8QI "vnx4hi") (VNx16QI "vnx4hi") (VNx32QI "vnx4hi") (VNx64QI "vnx4hi")
+  (VNx1HI "vnx2si") (VNx2HI "vnx2si") (VNx4HI "vnx2si") 
+  (VNx8HI "vnx2si") (VNx16HI "vnx2si") (VNx32HI "vnx2SI")
+  (VNx1SI "vnx2di") (VNx2SI "vnx2di") (VNx4SI "vnx2di") 
+  (VNx8SI "vnx2di") (VNx16SI "vnx2di")
+  (VNx1SF "vnx1df") (VNx2SF "vnx1df")
+  (VNx4SF "vnx1df") (VNx8SF "vnx1df") (VNx16SF "vnx1df")
+])
+
+(define_mode_attr vwlmul1_zve32 [
+  (VNx1QI "vnx2hi") (VNx2QI "vnx2hi") (VNx4QI "vnx2hi") 
+  (VNx8QI "vnx2hi") (VNx16QI "vnx2hi") (VNx32QI "vnx2hi")
+  (VNx1HI "vnx1si") (VNx2HI "vnx1si") (VNx4HI "vnx1si") 
+  (VNx8HI "vnx1si") (VNx16HI "vnx1SI")
+])
+
+(define_int_iterator WREDUC [UNSPEC_WREDUC_SUM UNSPEC_WREDUC_USUM])
+
 (define_int_iterator ORDER [UNSPEC_ORDERED UNSPEC_UNORDERED])
 
 (define_int_iterator VMULH [UNSPEC_VMULHS UNSPEC_VMULHU UNSPEC_VMULHSU])
@@ -360,7 +482,8 @@
 
 (define_int_attr v_su [(UNSPEC_VMULHS "") (UNSPEC_VMULHU "u") (UNSPEC_VMULHSU "su")
 		       (UNSPEC_VNCLIP "") (UNSPEC_VNCLIPU "u")
-		       (UNSPEC_VFCVT "") (UNSPEC_UNSIGNED_VFCVT "u")])
+		       (UNSPEC_VFCVT "") (UNSPEC_UNSIGNED_VFCVT "u")
+		       (UNSPEC_WREDUC_SUM "") (UNSPEC_WREDUC_USUM "u")])
 (define_int_attr sat_op [(UNSPEC_VAADDU "aaddu") (UNSPEC_VAADD "aadd")
 			 (UNSPEC_VASUBU "asubu") (UNSPEC_VASUB "asub")
 			 (UNSPEC_VSMUL "smul") (UNSPEC_VSSRL "ssrl")
@@ -418,6 +541,11 @@
 
 (define_code_iterator any_fix [fix unsigned_fix])
 (define_code_iterator any_float [float unsigned_float])
+(define_code_iterator any_reduc [plus umax smax umin smin and ior xor])
+(define_code_iterator any_freduc [smax smin])
+(define_code_attr reduc [(plus "sum") (umax "maxu") (smax "max") (umin "minu")
+			 (smin "min") (and "and") (ior "or") (xor "xor")])
+
 (define_code_attr fix_cvt [(fix "fix_trunc") (unsigned_fix "fixuns_trunc")])
 (define_code_attr float_cvt [(float "float") (unsigned_float "floatuns")])
 
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 51647386e0e..69b7cafbf17 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -48,7 +48,7 @@
 			  vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\
 			  vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,\
 			  vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\
-			  vired,viwred,vfred,vfredo,vfwred,vfwredo,\
+			  vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\
 			  vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,\
 			  vislide,vislide1,vfslide1,vgather,vcompress")
 	 (const_string "true")]
@@ -68,7 +68,7 @@
 			  vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\
 			  vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,\
 			  vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\
-			  vired,viwred,vfred,vfredo,vfwred,vfwredo,\
+			  vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\
 			  vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovxv,vfmovfv,\
 			  vislide,vislide1,vfslide1,vgather,vcompress")
 	 (const_string "true")]
@@ -151,7 +151,8 @@
 			  vfwalu,vfwmul,vfsqrt,vfrecp,vfsgnj,vfcmp,\
 			  vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\
 			  vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,\
-			  vfncvtftof,vfmuladd,vfwmuladd,vfclass")
+			  vfncvtftof,vfmuladd,vfwmuladd,vfclass,vired,
+			  viwred,vfredu,vfredo,vfwredu,vfwredo")
 	   (const_int INVALID_ATTRIBUTE)
 	 (eq_attr "mode" "VNx1QI,VNx1BI")
 	   (symbol_ref "riscv_vector::get_ratio(E_VNx1QImode)")
@@ -206,7 +207,8 @@
 				viwmul,vnshift,vaalu,vsmul,vsshift,vnclip,vmsfs,\
 				vmiota,vmidx,vfalu,vfmul,vfminmax,vfdiv,vfwalu,vfwmul,\
 				vfsqrt,vfrecp,vfsgnj,vfcmp,vfcvtitof,vfcvtftoi,vfwcvtitof,\
-				vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,vfclass")
+				vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,vfclass,\
+				vired,viwred,vfredu,vfredo,vfwredu,vfwredo")
 	       (const_int 2)
 
 	       (eq_attr "type" "vimerge,vfmerge")
@@ -234,7 +236,7 @@
 	 (eq_attr "type" "vldux,vldox,vialu,vshift,viminmax,vimul,vidiv,vsalu,\
 			  viwalu,viwmul,vnshift,vimerge,vaalu,vsmul,\
 			  vsshift,vnclip,vfalu,vfmul,vfminmax,vfdiv,vfwalu,vfwmul,\
-			  vfsgnj,vfmerge")
+			  vfsgnj,vfmerge,vired,viwred,vfredu,vfredo,vfwredu,vfwredo")
 	   (const_int 5)
 
 	 (eq_attr "type" "vicmp,vimuladd,viwmuladd,vfcmp,vfmuladd,vfwmuladd")
@@ -261,7 +263,8 @@
 	 (eq_attr "type" "vldux,vldox,vialu,vshift,viminmax,vimul,vidiv,vsalu,\
 			  viwalu,viwmul,vnshift,vimerge,vaalu,vsmul,\
 			  vsshift,vnclip,vfalu,vfmul,vfminmax,vfdiv,\
-			  vfwalu,vfwmul,vfsgnj,vfmerge")
+			  vfwalu,vfwmul,vfsgnj,vfmerge,vired,viwred,vfredu,\
+			  vfredo,vfwredu,vfwredo")
 	   (symbol_ref "riscv_vector::get_ta(operands[6])")
 
 	 (eq_attr "type" "vimuladd,viwmuladd,vfmuladd,vfwmuladd")
@@ -302,7 +305,8 @@
 (define_attr "avl_type" ""
   (cond [(eq_attr "type" "vlde,vlde,vste,vimov,vimov,vimov,vfmov,vext,vimerge,\
 			  vfsqrt,vfrecp,vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\
-			  vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,vfclass")
+			  vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\
+			  vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo")
 	   (symbol_ref "INTVAL (operands[7])")
 	 (eq_attr "type" "vldm,vstm,vimov,vmalu,vmalu")
 	   (symbol_ref "INTVAL (operands[5])")
@@ -6181,3 +6185,208 @@
   "vfncvt.rod.f.f.w\t%0,%3%p1"
   [(set_attr "type" "vfncvtftof")
    (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; -------------------------------------------------------------------------------
+;; ---- Predicated reduction operations
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - 14.1 Vector Single-Width Integer Reduction Instructions
+;; - 14.2 Vector Widening Integer Reduction Instructions
+;; - 14.3 Vector Single-Width Floating-Point Reduction Instructions
+;; - 14.4 Vector Widening Floating-Point Reduction Instructions
+;; -------------------------------------------------------------------------------
+
+;; For reduction operations, we need separate patterns for
+;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32,
+;; since reductions take an LMUL = 1 operand as the scalar input
+;; and its mode differs between the two cases.
+;; For example, the LMUL = 1 mode corresponding to VNx16QImode is VNx4QImode
+;; for -march=rv*zve32* whereas it is VNx8QImode for -march=rv*zve64*.
+(define_insn "@pred_reduc_<reduc><mode><vlmul1>"
+  [(set (match_operand:<VLMUL1> 0 "register_operand"          "=vd, vr")
+	(unspec:<VLMUL1>
+	  [(unspec:<VM>
+	     [(match_operand:<VM> 1 "vector_mask_operand"     " vm,Wc1")
+	      (match_operand 5 "vector_length_operand"        " rK, rK")
+	      (match_operand 6 "const_int_operand"            "  i,  i")
+	      (match_operand 7 "const_int_operand"            "  i,  i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	   (any_reduc:VI
+	     (vec_duplicate:VI
+	       (vec_select:<VEL>
+	         (match_operand:<VLMUL1> 4 "register_operand" " vr, vr")
+	         (parallel [(const_int 0)])))
+	     (match_operand:VI 3 "register_operand"           " vr, vr"))
+	   (match_operand:<VLMUL1> 2 "vector_merge_operand"   "0vu,0vu")] UNSPEC_REDUC))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN > 32"
+  "vred<reduc>.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "vired")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>"
+  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vr")
+	(unspec:<VLMUL1_ZVE32>
+	  [(unspec:<VM>
+	     [(match_operand:<VM> 1 "vector_mask_operand"           " vm,Wc1")
+	      (match_operand 5 "vector_length_operand"              " rK, rK")
+	      (match_operand 6 "const_int_operand"                  "  i,  i")
+	      (match_operand 7 "const_int_operand"                  "  i,  i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	   (any_reduc:VI_ZVE32
+	     (vec_duplicate:VI_ZVE32
+	       (vec_select:<VEL>
+	         (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr")
+	         (parallel [(const_int 0)])))
+	     (match_operand:VI_ZVE32 3 "register_operand"           " vr, vr"))
+	   (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   "0vu,0vu")] UNSPEC_REDUC))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+  "vred<reduc>.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "vired")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1>"
+  [(set (match_operand:<VWLMUL1> 0 "register_operand"           "=&vr")
+	(unspec:<VWLMUL1>
+	  [(unspec:<VM>
+	     [(match_operand:<VM> 1 "vector_mask_operand"      "vmWc1")
+	      (match_operand 5 "vector_length_operand"         "   rK")
+	      (match_operand 6 "const_int_operand"             "    i")
+	      (match_operand 7 "const_int_operand"             "    i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	   (match_operand:VWI 3 "register_operand"             "   vr")
+	   (match_operand:<VWLMUL1> 4 "register_operand"       "   vr")
+	   (match_operand:<VWLMUL1> 2 "vector_merge_operand"   "  0vu")] WREDUC))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN > 32"
+  "vwredsum<v_su>.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "viwred")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1_zve32>"
+  [(set (match_operand:<VWLMUL1_ZVE32> 0 "register_operand"           "=&vr")
+	(unspec:<VWLMUL1_ZVE32>
+	  [(unspec:<VM>
+	     [(match_operand:<VM> 1 "vector_mask_operand"            "vmWc1")
+	      (match_operand 5 "vector_length_operand"               "   rK")
+	      (match_operand 6 "const_int_operand"                   "    i")
+	      (match_operand 7 "const_int_operand"                   "    i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	   (match_operand:VWI_ZVE32 3 "register_operand"             "   vr")
+	   (match_operand:<VWLMUL1_ZVE32> 4 "register_operand"       "   vr")
+	   (match_operand:<VWLMUL1_ZVE32> 2 "vector_merge_operand"   "  0vu")] WREDUC))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+  "vwredsum<v_su>.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "viwred")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_reduc_<reduc><mode><vlmul1>"
+  [(set (match_operand:<VLMUL1> 0 "register_operand"          "=vd, vr")
+	(unspec:<VLMUL1>
+	  [(unspec:<VM>
+	     [(match_operand:<VM> 1 "vector_mask_operand"      " vm,Wc1")
+	      (match_operand 5 "vector_length_operand"         " rK, rK")
+	      (match_operand 6 "const_int_operand"             "  i,  i")
+	      (match_operand 7 "const_int_operand"             "  i,  i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	   (any_freduc:VF
+	     (vec_duplicate:VF
+	       (vec_select:<VEL>
+	         (match_operand:<VLMUL1> 4 "register_operand" " vr, vr")
+	         (parallel [(const_int 0)])))
+	     (match_operand:VF 3 "register_operand"           " vr, vr"))
+	   (match_operand:<VLMUL1> 2 "vector_merge_operand"   "0vu,0vu")] UNSPEC_REDUC))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN > 32"
+  "vfred<reduc>.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "vfredu")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>"
+  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vr")
+	(unspec:<VLMUL1_ZVE32>
+	  [(unspec:<VM>
+	     [(match_operand:<VM> 1 "vector_mask_operand"           " vm,Wc1")
+	      (match_operand 5 "vector_length_operand"              " rK, rK")
+	      (match_operand 6 "const_int_operand"                  "  i,  i")
+	      (match_operand 7 "const_int_operand"                  "  i,  i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	   (any_freduc:VF_ZVE32
+	     (vec_duplicate:VF_ZVE32
+	       (vec_select:<VEL>
+	         (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr")
+	         (parallel [(const_int 0)])))
+	     (match_operand:VF_ZVE32 3 "register_operand"           " vr, vr"))
+	   (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   "0vu,0vu")] UNSPEC_REDUC))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+  "vfred<reduc>.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "vfredu")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_reduc_plus<order><mode><vlmul1>"
+  [(set (match_operand:<VLMUL1> 0 "register_operand"          "=vd, vr")
+	(unspec:<VLMUL1>
+	  [(unspec:<VLMUL1>
+	    [(unspec:<VM>
+	       [(match_operand:<VM> 1 "vector_mask_operand"      " vm,Wc1")
+	        (match_operand 5 "vector_length_operand"         " rK, rK")
+	        (match_operand 6 "const_int_operand"             "  i,  i")
+	        (match_operand 7 "const_int_operand"             "  i,  i")
+	        (reg:SI VL_REGNUM)
+	        (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	     (plus:VF
+	       (vec_duplicate:VF
+	         (vec_select:<VEL>
+	           (match_operand:<VLMUL1> 4 "register_operand" " vr, vr")
+	           (parallel [(const_int 0)])))
+	       (match_operand:VF 3 "register_operand"           " vr, vr"))
+	     (match_operand:<VLMUL1> 2 "vector_merge_operand"   "0vu,0vu")] UNSPEC_REDUC)] ORDER))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN > 32"
+  "vfred<order>sum.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "vfred<order>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_reduc_plus<order><mode><vlmul1_zve32>"
+  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vr")
+	(unspec:<VLMUL1_ZVE32>
+	  [(unspec:<VLMUL1_ZVE32>
+	    [(unspec:<VM>
+	       [(match_operand:<VM> 1 "vector_mask_operand"           " vm,Wc1")
+	        (match_operand 5 "vector_length_operand"              " rK, rK")
+	        (match_operand 6 "const_int_operand"                  "  i,  i")
+	        (match_operand 7 "const_int_operand"                  "  i,  i")
+	        (reg:SI VL_REGNUM)
+	        (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	     (plus:VF_ZVE32
+	       (vec_duplicate:VF_ZVE32
+	         (vec_select:<VEL>
+	           (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr")
+	           (parallel [(const_int 0)])))
+	       (match_operand:VF_ZVE32 3 "register_operand"           " vr, vr"))
+	     (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   "0vu,0vu")] UNSPEC_REDUC)] ORDER))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+  "vfred<order>sum.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "vfred<order>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "@pred_widen_reduc_plus<order><mode><vwlmul1>"
+  [(set (match_operand:<VWLMUL1> 0 "register_operand"             "=&vr")
+	(unspec:<VWLMUL1>
+	  [(unspec:<VWLMUL1>
+	    [(unspec:<VM>
+	       [(match_operand:<VM> 1 "vector_mask_operand"      "vmWc1")
+	        (match_operand 5 "vector_length_operand"         "   rK")
+	        (match_operand 6 "const_int_operand"             "    i")
+	        (match_operand 7 "const_int_operand"             "    i")
+	        (reg:SI VL_REGNUM)
+	        (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	     (match_operand:VWF 3 "register_operand"             "   vr")
+	     (match_operand:<VWLMUL1> 4 "register_operand"       "   vr")
+	     (match_operand:<VWLMUL1> 2 "vector_merge_operand"   "  0vu")] UNSPEC_WREDUC_SUM)] ORDER))]
+  "TARGET_VECTOR && TARGET_MIN_VLEN > 32"
+  "vfwred<order>sum.vs\t%0,%3,%4%p1"
+  [(set_attr "type" "vfwred<order>")
+   (set_attr "mode" "<MODE>")])
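
The ORDER iterator above produces both the unordered (vfredusum/vfwredusum)
and ordered (vfredosum/vfwredosum) patterns: the ordered forms sum strictly in
element order and so round exactly like a scalar loop, while the unordered
forms may reassociate.  A hedged usage sketch (intrinsic spellings and the
vsetvl/vle/vfmv helpers are assumptions based on rvv-intrinsic-doc, not part
of this patch):

  #include <stddef.h>
  #include <riscv_vector.h>

  /* Element-ordered float sum; use vfredusum instead if reassociation is
     acceptable.  */
  float
  ordered_sum (const float *a, size_t n)
  {
    vfloat32m1_t acc = __riscv_vfmv_v_f_f32m1 (0.0f, 1);
    for (size_t vl; n > 0; n -= vl, a += vl)
      {
        vl = __riscv_vsetvl_e32m8 (n);
        vfloat32m8_t v = __riscv_vle32_v_f32m8 (a, vl);
        acc = __riscv_vfredosum_vs_f32m8_f32m1 (v, acc, vl);
      }
    return __riscv_vfmv_f_s_f32m1_f32 (acc);
  }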
