From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, Ju-Zhe Zhong
Subject: [PATCH] RISC-V: Add RVV reduction C/C++ intrinsics support
Date: Mon, 20 Feb 2023 14:54:45 +0800
Message-Id: <20230220065445.207902-1-juzhe.zhong@rivai.ai>

From: Ju-Zhe Zhong

gcc/ChangeLog:

        * config/riscv/riscv-vector-builtins-bases.cc (class reducop): New class.
        (class widen_reducop): Ditto.
        (class freducop): Ditto.
        (class widen_freducop): Ditto.
        (BASE): Ditto.
        * config/riscv/riscv-vector-builtins-bases.h: Ditto.
        * config/riscv/riscv-vector-builtins-functions.def (vredsum): Add reduction support.
        (vredmaxu): Ditto.
        (vredmax): Ditto.
        (vredminu): Ditto.
        (vredmin): Ditto.
        (vredand): Ditto.
        (vredor): Ditto.
        (vredxor): Ditto.
        (vwredsum): Ditto.
        (vwredsumu): Ditto.
        (vfredusum): Ditto.
        (vfredosum): Ditto.
        (vfredmax): Ditto.
        (vfredmin): Ditto.
        (vfwredosum): Ditto.
        (vfwredusum): Ditto.
        * config/riscv/riscv-vector-builtins-shapes.cc (struct reduc_alu_def): Ditto.
        (SHAPE): Ditto.
        * config/riscv/riscv-vector-builtins-shapes.h: Ditto.
        * config/riscv/riscv-vector-builtins-types.def (DEF_RVV_WI_OPS): New macro.
        (DEF_RVV_WU_OPS): Ditto.
        (DEF_RVV_WF_OPS): Ditto.
        (vint8mf8_t): Ditto.
        (vint8mf4_t): Ditto.
        (vint8mf2_t): Ditto.
        (vint8m1_t): Ditto.
        (vint8m2_t): Ditto.
        (vint8m4_t): Ditto.
        (vint8m8_t): Ditto.
        (vint16mf4_t): Ditto.
        (vint16mf2_t): Ditto.
        (vint16m1_t): Ditto.
        (vint16m2_t): Ditto.
        (vint16m4_t): Ditto.
        (vint16m8_t): Ditto.
        (vint32mf2_t): Ditto.
        (vint32m1_t): Ditto.
        (vint32m2_t): Ditto.
        (vint32m4_t): Ditto.
        (vint32m8_t): Ditto.
        (vuint8mf8_t): Ditto.
        (vuint8mf4_t): Ditto.
        (vuint8mf2_t): Ditto.
        (vuint8m1_t): Ditto.
        (vuint8m2_t): Ditto.
        (vuint8m4_t): Ditto.
        (vuint8m8_t): Ditto.
        (vuint16mf4_t): Ditto.
        (vuint16mf2_t): Ditto.
        (vuint16m1_t): Ditto.
        (vuint16m2_t): Ditto.
        (vuint16m4_t): Ditto.
        (vuint16m8_t): Ditto.
        (vuint32mf2_t): Ditto.
        (vuint32m1_t): Ditto.
        (vuint32m2_t): Ditto.
        (vuint32m4_t): Ditto.
        (vuint32m8_t): Ditto.
        (vfloat32mf2_t): Ditto.
        (vfloat32m1_t): Ditto.
        (vfloat32m2_t): Ditto.
        (vfloat32m4_t): Ditto.
        (vfloat32m8_t): Ditto.
        * config/riscv/riscv-vector-builtins.cc (DEF_RVV_WI_OPS): Ditto.
        (DEF_RVV_WU_OPS): Ditto.
        (DEF_RVV_WF_OPS): Ditto.
        (required_extensions_p): Add reduction support.
        (rvv_arg_type_info::get_base_vector_type): Ditto.
        (rvv_arg_type_info::get_tree_type): Ditto.
        * config/riscv/riscv-vector-builtins.h (enum rvv_base_type): Ditto.
        * config/riscv/riscv.md: Ditto.
        * config/riscv/vector-iterators.md (minu): Ditto.
        * config/riscv/vector.md (@pred_reduc_): New pattern.
        (@pred_reduc_): Ditto.
        (@pred_widen_reduc_plus): Ditto.
        (@pred_widen_reduc_plus): Ditto.
        (@pred_reduc_plus): Ditto.
        (@pred_reduc_plus): Ditto.
        (@pred_widen_reduc_plus): Ditto.
---
 .../riscv/riscv-vector-builtins-bases.cc      |  90 +++++++
 .../riscv/riscv-vector-builtins-bases.h       |  16 ++
 .../riscv/riscv-vector-builtins-functions.def |  26 +-
 .../riscv/riscv-vector-builtins-shapes.cc     |  29 +++
 .../riscv/riscv-vector-builtins-shapes.h      |   1 +
 .../riscv/riscv-vector-builtins-types.def     |  65 +++++
 gcc/config/riscv/riscv-vector-builtins.cc     |  92 +++++++-
 gcc/config/riscv/riscv-vector-builtins.h      |   4 +-
 gcc/config/riscv/riscv.md                     |   6 +-
 gcc/config/riscv/vector-iterators.md          | 130 +++++++++-
 gcc/config/riscv/vector.md                    | 223 +++++++++++++++++-
 11 files changed, 668 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index bfcfab55bb9..f6ed2e53453 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1283,6 +1283,64 @@ public:
   }
 };
 
+/* Implements reduction instructions.  */
+template
+class reducop : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+    return e.use_exact_insn (
+      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+  }
+};
+
+/* Implements widen reduction instructions.
*/ +template +class widen_reducop : public function_base +{ +public: + bool apply_mask_policy_p () const override { return false; } + + rtx expand (function_expander &e) const override + { + return e.use_exact_insn (code_for_pred_widen_reduc_plus (UNSPEC, + e.vector_mode (), + e.vector_mode ())); + } +}; + +/* Implements floating-point reduction instructions. */ +template +class freducop : public function_base +{ +public: + bool apply_mask_policy_p () const override { return false; } + + rtx expand (function_expander &e) const override + { + return e.use_exact_insn ( + code_for_pred_reduc_plus (UNSPEC, e.vector_mode (), e.vector_mode ())); + } +}; + +/* Implements widening floating-point reduction instructions. */ +template +class widen_freducop : public function_base +{ +public: + bool apply_mask_policy_p () const override { return false; } + + rtx expand (function_expander &e) const override + { + return e.use_exact_insn (code_for_pred_widen_reduc_plus (UNSPEC, + e.vector_mode (), + e.vector_mode ())); + } +}; + static CONSTEXPR const vsetvl vsetvl_obj; static CONSTEXPR const vsetvl vsetvlmax_obj; static CONSTEXPR const loadstore vle_obj; @@ -1456,6 +1514,22 @@ static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_x_obj; static CONSTEXPR const vfncvt_rtz_x vfncvt_rtz_xu_obj; static CONSTEXPR const vfncvt_f vfncvt_f_obj; static CONSTEXPR const vfncvt_rod_f vfncvt_rod_f_obj; +static CONSTEXPR const reducop vredsum_obj; +static CONSTEXPR const reducop vredmaxu_obj; +static CONSTEXPR const reducop vredmax_obj; +static CONSTEXPR const reducop vredminu_obj; +static CONSTEXPR const reducop vredmin_obj; +static CONSTEXPR const reducop vredand_obj; +static CONSTEXPR const reducop vredor_obj; +static CONSTEXPR const reducop vredxor_obj; +static CONSTEXPR const widen_reducop vwredsum_obj; +static CONSTEXPR const widen_reducop vwredsumu_obj; +static CONSTEXPR const freducop vfredusum_obj; +static CONSTEXPR const freducop vfredosum_obj; +static CONSTEXPR const reducop vfredmax_obj; 
+static CONSTEXPR const reducop vfredmin_obj; +static CONSTEXPR const widen_freducop vfwredusum_obj; +static CONSTEXPR const widen_freducop vfwredosum_obj; /* Declare the function base NAME, pointing it to an instance of class _obj. */ @@ -1635,5 +1709,21 @@ BASE (vfncvt_rtz_x) BASE (vfncvt_rtz_xu) BASE (vfncvt_f) BASE (vfncvt_rod_f) +BASE (vredsum) +BASE (vredmaxu) +BASE (vredmax) +BASE (vredminu) +BASE (vredmin) +BASE (vredand) +BASE (vredor) +BASE (vredxor) +BASE (vwredsum) +BASE (vwredsumu) +BASE (vfredusum) +BASE (vfredosum) +BASE (vfredmax) +BASE (vfredmin) +BASE (vfwredosum) +BASE (vfwredusum) } // end namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h b/gcc/config/riscv/riscv-vector-builtins-bases.h index 5583dda3a08..9f0e4675f81 100644 --- a/gcc/config/riscv/riscv-vector-builtins-bases.h +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h @@ -203,6 +203,22 @@ extern const function_base *const vfncvt_rtz_x; extern const function_base *const vfncvt_rtz_xu; extern const function_base *const vfncvt_f; extern const function_base *const vfncvt_rod_f; +extern const function_base *const vredsum; +extern const function_base *const vredmaxu; +extern const function_base *const vredmax; +extern const function_base *const vredminu; +extern const function_base *const vredmin; +extern const function_base *const vredand; +extern const function_base *const vredor; +extern const function_base *const vredxor; +extern const function_base *const vwredsum; +extern const function_base *const vwredsumu; +extern const function_base *const vfredusum; +extern const function_base *const vfredosum; +extern const function_base *const vfredmax; +extern const function_base *const vfredmin; +extern const function_base *const vfwredosum; +extern const function_base *const vfwredusum; } } // end namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def index 
1ca0537216b..230b76cd0f2 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.def +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def @@ -408,7 +408,31 @@ DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, u_to_nf_xu_w_ops) DEF_RVV_FUNCTION (vfncvt_f, narrow_alu, full_preds, f_to_nf_f_w_ops) DEF_RVV_FUNCTION (vfncvt_rod_f, narrow_alu, full_preds, f_to_nf_f_w_ops) -/* TODO: 14. Vector Reduction Operations. */ +/* 14. Vector Reduction Operations. */ + +// 14.1. Vector Single-Width Integer Reduction Instructions +DEF_RVV_FUNCTION (vredsum, reduc_alu, no_mu_preds, iu_vs_ops) +DEF_RVV_FUNCTION (vredmaxu, reduc_alu, no_mu_preds, iu_vs_ops) +DEF_RVV_FUNCTION (vredmax, reduc_alu, no_mu_preds, iu_vs_ops) +DEF_RVV_FUNCTION (vredminu, reduc_alu, no_mu_preds, iu_vs_ops) +DEF_RVV_FUNCTION (vredmin, reduc_alu, no_mu_preds, iu_vs_ops) +DEF_RVV_FUNCTION (vredand, reduc_alu, no_mu_preds, iu_vs_ops) +DEF_RVV_FUNCTION (vredor, reduc_alu, no_mu_preds, iu_vs_ops) +DEF_RVV_FUNCTION (vredxor, reduc_alu, no_mu_preds, iu_vs_ops) + +// 14.2. Vector Widening Integer Reduction Instructions +DEF_RVV_FUNCTION (vwredsum, reduc_alu, no_mu_preds, wi_vs_ops) +DEF_RVV_FUNCTION (vwredsumu, reduc_alu, no_mu_preds, wu_vs_ops) + +// 14.3. Vector Single-Width Floating-Point Reduction Instructions +DEF_RVV_FUNCTION (vfredusum, reduc_alu, no_mu_preds, f_vs_ops) +DEF_RVV_FUNCTION (vfredosum, reduc_alu, no_mu_preds, f_vs_ops) +DEF_RVV_FUNCTION (vfredmax, reduc_alu, no_mu_preds, f_vs_ops) +DEF_RVV_FUNCTION (vfredmin, reduc_alu, no_mu_preds, f_vs_ops) + +// 14.4. Vector Widening Floating-Point Reduction Instructions +DEF_RVV_FUNCTION (vfwredosum, reduc_alu, no_mu_preds, wf_vs_ops) +DEF_RVV_FUNCTION (vfwredusum, reduc_alu, no_mu_preds, wf_vs_ops) /* 15. Vector Mask Instructions. 
*/ diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc index 1fbf0f4e902..b3f5951087d 100644 --- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc +++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc @@ -374,6 +374,34 @@ struct mask_alu_def : public build_base } }; +/* reduc_alu_def class. */ +struct reduc_alu_def : public build_base +{ + char *get_name (function_builder &b, const function_instance &instance, + bool overloaded_p) const override + { + b.append_base_name (instance.base_name); + + /* vop_ --> vop__. */ + if (!overloaded_p) + { + b.append_name (operand_suffixes[instance.op_info->op]); + b.append_name (type_suffixes[instance.type.index].vector); + vector_type_index ret_type_idx + = instance.op_info->ret.get_base_vector_type ( + builtin_types[instance.type.index].vector); + b.append_name (type_suffixes[ret_type_idx].vector); + } + + /* According to rvv-intrinsic-doc, it does not add "_m" suffix + for vop_m C++ overloaded API. 
*/ + if (overloaded_p && instance.pred == PRED_TYPE_m) + return b.finish_name (); + b.append_name (predication_suffixes[instance.pred]); + return b.finish_name (); + } +}; + SHAPE(vsetvl, vsetvl) SHAPE(vsetvl, vsetvlmax) SHAPE(loadstore, loadstore) @@ -385,5 +413,6 @@ SHAPE(return_mask, return_mask) SHAPE(narrow_alu, narrow_alu) SHAPE(move, move) SHAPE(mask_alu, mask_alu) +SHAPE(reduc_alu, reduc_alu) } // end namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.h b/gcc/config/riscv/riscv-vector-builtins-shapes.h index 406abefdb10..85769ea024a 100644 --- a/gcc/config/riscv/riscv-vector-builtins-shapes.h +++ b/gcc/config/riscv/riscv-vector-builtins-shapes.h @@ -35,6 +35,7 @@ extern const function_shape *const return_mask; extern const function_shape *const narrow_alu; extern const function_shape *const move; extern const function_shape *const mask_alu; +extern const function_shape *const reduc_alu; } } // end namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def index bb3811d2d90..a15e54c1572 100644 --- a/gcc/config/riscv/riscv-vector-builtins-types.def +++ b/gcc/config/riscv/riscv-vector-builtins-types.def @@ -133,6 +133,24 @@ along with GCC; see the file COPYING3. If not see #define DEF_RVV_WCONVERT_F_OPS(TYPE, REQUIRE) #endif +/* Use "DEF_RVV_WI_OPS" macro include all signed integer can be widened which + will be iterated and registered as intrinsic functions. */ +#ifndef DEF_RVV_WI_OPS +#define DEF_RVV_WI_OPS(TYPE, REQUIRE) +#endif + +/* Use "DEF_RVV_WU_OPS" macro include all unsigned integer can be widened which + will be iterated and registered as intrinsic functions. */ +#ifndef DEF_RVV_WU_OPS +#define DEF_RVV_WU_OPS(TYPE, REQUIRE) +#endif + +/* Use "DEF_RVV_WF_OPS" macro include all floating-point can be widened which + will be iterated and registered as intrinsic functions. 
*/ +#ifndef DEF_RVV_WF_OPS +#define DEF_RVV_WF_OPS(TYPE, REQUIRE) +#endif + DEF_RVV_I_OPS (vint8mf8_t, RVV_REQUIRE_ZVE64) DEF_RVV_I_OPS (vint8mf4_t, 0) DEF_RVV_I_OPS (vint8mf2_t, 0) @@ -345,6 +363,50 @@ DEF_RVV_WCONVERT_F_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64) DEF_RVV_WCONVERT_F_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64) DEF_RVV_WCONVERT_F_OPS (vfloat64m8_t, RVV_REQUIRE_ELEN_FP_64) +DEF_RVV_WI_OPS (vint8mf8_t, RVV_REQUIRE_ZVE64) +DEF_RVV_WI_OPS (vint8mf4_t, 0) +DEF_RVV_WI_OPS (vint8mf2_t, 0) +DEF_RVV_WI_OPS (vint8m1_t, 0) +DEF_RVV_WI_OPS (vint8m2_t, 0) +DEF_RVV_WI_OPS (vint8m4_t, 0) +DEF_RVV_WI_OPS (vint8m8_t, 0) +DEF_RVV_WI_OPS (vint16mf4_t, RVV_REQUIRE_ZVE64) +DEF_RVV_WI_OPS (vint16mf2_t, 0) +DEF_RVV_WI_OPS (vint16m1_t, 0) +DEF_RVV_WI_OPS (vint16m2_t, 0) +DEF_RVV_WI_OPS (vint16m4_t, 0) +DEF_RVV_WI_OPS (vint16m8_t, 0) +DEF_RVV_WI_OPS (vint32mf2_t, RVV_REQUIRE_ZVE64) +DEF_RVV_WI_OPS (vint32m1_t, 0) +DEF_RVV_WI_OPS (vint32m2_t, 0) +DEF_RVV_WI_OPS (vint32m4_t, 0) +DEF_RVV_WI_OPS (vint32m8_t, 0) + +DEF_RVV_WU_OPS (vuint8mf8_t, RVV_REQUIRE_ZVE64) +DEF_RVV_WU_OPS (vuint8mf4_t, 0) +DEF_RVV_WU_OPS (vuint8mf2_t, 0) +DEF_RVV_WU_OPS (vuint8m1_t, 0) +DEF_RVV_WU_OPS (vuint8m2_t, 0) +DEF_RVV_WU_OPS (vuint8m4_t, 0) +DEF_RVV_WU_OPS (vuint8m8_t, 0) +DEF_RVV_WU_OPS (vuint16mf4_t, RVV_REQUIRE_ZVE64) +DEF_RVV_WU_OPS (vuint16mf2_t, 0) +DEF_RVV_WU_OPS (vuint16m1_t, 0) +DEF_RVV_WU_OPS (vuint16m2_t, 0) +DEF_RVV_WU_OPS (vuint16m4_t, 0) +DEF_RVV_WU_OPS (vuint16m8_t, 0) +DEF_RVV_WU_OPS (vuint32mf2_t, RVV_REQUIRE_ZVE64) +DEF_RVV_WU_OPS (vuint32m1_t, 0) +DEF_RVV_WU_OPS (vuint32m2_t, 0) +DEF_RVV_WU_OPS (vuint32m4_t, 0) +DEF_RVV_WU_OPS (vuint32m8_t, 0) + +DEF_RVV_WF_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ZVE64) +DEF_RVV_WF_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32) +DEF_RVV_WF_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32) +DEF_RVV_WF_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32) +DEF_RVV_WF_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32) + #undef DEF_RVV_I_OPS #undef DEF_RVV_U_OPS 
#undef DEF_RVV_F_OPS @@ -363,3 +425,6 @@ DEF_RVV_WCONVERT_F_OPS (vfloat64m8_t, RVV_REQUIRE_ELEN_FP_64) #undef DEF_RVV_WCONVERT_I_OPS #undef DEF_RVV_WCONVERT_U_OPS #undef DEF_RVV_WCONVERT_F_OPS +#undef DEF_RVV_WI_OPS +#undef DEF_RVV_WU_OPS +#undef DEF_RVV_WF_OPS diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc index 7858a6d0e86..2e92ece3b64 100644 --- a/gcc/config/riscv/riscv-vector-builtins.cc +++ b/gcc/config/riscv/riscv-vector-builtins.cc @@ -133,6 +133,27 @@ static const rvv_type_info i_ops[] = { #include "riscv-vector-builtins-types.def" {NUM_VECTOR_TYPES, 0}}; +/* A list of all signed integer can be widened will be registered for intrinsic + * functions. */ +static const rvv_type_info wi_ops[] = { +#define DEF_RVV_WI_OPS(TYPE, REQUIRE) {VECTOR_TYPE_##TYPE, REQUIRE}, +#include "riscv-vector-builtins-types.def" + {NUM_VECTOR_TYPES, 0}}; + +/* A list of all unsigned integer can be widened will be registered for + * intrinsic functions. */ +static const rvv_type_info wu_ops[] = { +#define DEF_RVV_WU_OPS(TYPE, REQUIRE) {VECTOR_TYPE_##TYPE, REQUIRE}, +#include "riscv-vector-builtins-types.def" + {NUM_VECTOR_TYPES, 0}}; + +/* A list of all floating-point can be widened will be registered for intrinsic + * functions. */ +static const rvv_type_info wf_ops[] = { +#define DEF_RVV_WF_OPS(TYPE, REQUIRE) {VECTOR_TYPE_##TYPE, REQUIRE}, +#include "riscv-vector-builtins-types.def" + {NUM_VECTOR_TYPES, 0}}; + /* A list of all signed integer that SEW = 64 require full 'V' extension will be registered for intrinsic functions. */ static const rvv_type_info full_v_i_ops[] = { @@ -418,6 +439,17 @@ static CONSTEXPR const rvv_arg_type_info shift_wv_args[] static CONSTEXPR const rvv_arg_type_info v_args[] = {rvv_arg_type_info (RVV_BASE_vector), rvv_arg_type_info_end}; +/* A list of args for vector_type func (vector_type, lmul1_type) function. 
*/ +static CONSTEXPR const rvv_arg_type_info vs_args[] + = {rvv_arg_type_info (RVV_BASE_vector), + rvv_arg_type_info (RVV_BASE_lmul1_vector), rvv_arg_type_info_end}; + +/* A list of args for vector_type func (vector_type, widen_lmul1_type) function. + */ +static CONSTEXPR const rvv_arg_type_info wvs_args[] + = {rvv_arg_type_info (RVV_BASE_vector), + rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), rvv_arg_type_info_end}; + /* A list of args for vector_type func (vector_type) function. */ static CONSTEXPR const rvv_arg_type_info f_v_args[] = {rvv_arg_type_info (RVV_BASE_float_vector), rvv_arg_type_info_end}; @@ -562,6 +594,10 @@ static CONSTEXPR const predication_type_index full_preds[] = {PRED_TYPE_none, PRED_TYPE_m, PRED_TYPE_tu, PRED_TYPE_tum, PRED_TYPE_tumu, PRED_TYPE_mu, NUM_PRED_TYPES}; +/* vop/vop_m/vop_tu/vop_tum/ will be registered. */ +static CONSTEXPR const predication_type_index no_mu_preds[] + = {PRED_TYPE_none, PRED_TYPE_m, PRED_TYPE_tu, PRED_TYPE_tum, NUM_PRED_TYPES}; + /* vop/vop_tu will be registered. */ static CONSTEXPR const predication_type_index none_tu_preds[] = {PRED_TYPE_none, PRED_TYPE_tu, NUM_PRED_TYPES}; @@ -1070,6 +1106,46 @@ static CONSTEXPR const rvv_op_info iu_v_ops rvv_arg_type_info (RVV_BASE_vector), /* Return type */ v_args /* Args */}; +/* A static operand information for vector_type func (vector_type) + * function registration. */ +static CONSTEXPR const rvv_op_info iu_vs_ops + = {iu_ops, /* Types */ + OP_TYPE_vs, /* Suffix */ + rvv_arg_type_info (RVV_BASE_lmul1_vector), /* Return type */ + vs_args /* Args */}; + +/* A static operand information for vector_type func (vector_type) + * function registration. */ +static CONSTEXPR const rvv_op_info f_vs_ops + = {f_ops, /* Types */ + OP_TYPE_vs, /* Suffix */ + rvv_arg_type_info (RVV_BASE_lmul1_vector), /* Return type */ + vs_args /* Args */}; + +/* A static operand information for vector_type func (vector_type) + * function registration. 
*/ +static CONSTEXPR const rvv_op_info wi_vs_ops + = {wi_ops, /* Types */ + OP_TYPE_vs, /* Suffix */ + rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), /* Return type */ + wvs_args /* Args */}; + +/* A static operand information for vector_type func (vector_type) + * function registration. */ +static CONSTEXPR const rvv_op_info wu_vs_ops + = {wu_ops, /* Types */ + OP_TYPE_vs, /* Suffix */ + rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), /* Return type */ + wvs_args /* Args */}; + +/* A static operand information for vector_type func (vector_type) + * function registration. */ +static CONSTEXPR const rvv_op_info wf_vs_ops + = {wf_ops, /* Types */ + OP_TYPE_vs, /* Suffix */ + rvv_arg_type_info (RVV_BASE_widen_lmul1_vector), /* Return type */ + wvs_args /* Args */}; + /* A static operand information for vector_type func (vector_type) * function registration. */ static CONSTEXPR const rvv_op_info f_v_ops @@ -1707,7 +1783,8 @@ required_extensions_p (enum rvv_base_type type) || type == RVV_BASE_uint32_index || type == RVV_BASE_uint64_index || type == RVV_BASE_float_vector || type == RVV_BASE_double_trunc_float_vector - || type == RVV_BASE_double_trunc_vector; + || type == RVV_BASE_double_trunc_vector + || type == RVV_BASE_widen_lmul1_vector; } /* Check whether all the RVV_REQUIRE_* values in REQUIRED_EXTENSIONS are @@ -1822,6 +1899,7 @@ rvv_arg_type_info::get_base_vector_type (tree type) const poly_int64 nunits = GET_MODE_NUNITS (TYPE_MODE (type)); machine_mode inner_mode = GET_MODE_INNER (TYPE_MODE (type)); poly_int64 bitsize = GET_MODE_BITSIZE (inner_mode); + poly_int64 bytesize = GET_MODE_SIZE (inner_mode); bool unsigned_p = TYPE_UNSIGNED (type); if (unsigned_base_type_p (base_type)) @@ -1875,6 +1953,16 @@ rvv_arg_type_info::get_base_vector_type (tree type) const case RVV_BASE_unsigned_vector: inner_mode = int_mode_for_mode (inner_mode).require (); break; + case RVV_BASE_lmul1_vector: + nunits = exact_div (BYTES_PER_RISCV_VECTOR, bytesize); + break; + case 
RVV_BASE_widen_lmul1_vector: + inner_mode + = get_mode_for_bitsize (bitsize * 2, FLOAT_MODE_P (inner_mode)); + if (BYTES_PER_RISCV_VECTOR.coeffs[0] < (bytesize * 2).coeffs[0]) + return NUM_VECTOR_TYPES; + nunits = exact_div (BYTES_PER_RISCV_VECTOR, bytesize * 2); + break; default: return NUM_VECTOR_TYPES; } @@ -1963,6 +2051,8 @@ rvv_arg_type_info::get_tree_type (vector_type_index type_idx) const case RVV_BASE_double_trunc_float_vector: case RVV_BASE_signed_vector: case RVV_BASE_unsigned_vector: + case RVV_BASE_lmul1_vector: + case RVV_BASE_widen_lmul1_vector: if (get_base_vector_type (builtin_types[type_idx].vector) != NUM_VECTOR_TYPES) return builtin_types[get_base_vector_type ( diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h index db6ab389e64..ede08c6a480 100644 --- a/gcc/config/riscv/riscv-vector-builtins.h +++ b/gcc/config/riscv/riscv-vector-builtins.h @@ -164,8 +164,10 @@ enum rvv_base_type RVV_BASE_double_trunc_signed_vector, RVV_BASE_double_trunc_unsigned_vector, RVV_BASE_double_trunc_unsigned_scalar, - RVV_BASE_float_vector, RVV_BASE_double_trunc_float_vector, + RVV_BASE_float_vector, + RVV_BASE_lmul1_vector, + RVV_BASE_widen_lmul1_vector, NUM_BASE_TYPES }; diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 487059ebe97..55f7b12aaa9 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -309,9 +309,9 @@ ;; 14. 
Vector reduction operations ;; vired vector single-width integer reduction instructions ;; viwred vector widening integer reduction instructions -;; vfred vector single-width floating-point un-ordered reduction instruction +;; vfredu vector single-width floating-point un-ordered reduction instruction ;; vfredo vector single-width floating-point ordered reduction instruction -;; vfwred vector widening floating-point un-ordered reduction instruction +;; vfwredu vector widening floating-point un-ordered reduction instruction ;; vfwredo vector widening floating-point ordered reduction instruction ;; 15. Vector mask instructions ;; vmalu vector mask-register logical instructions @@ -344,7 +344,7 @@ vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov, vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi, vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof, - vired,viwred,vfred,vfredo,vfwred,vfwredo, + vired,viwred,vfredu,vfredo,vfwredu,vfwredo, vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv, vislide,vislide1,vfslide1,vgather,vcompress,vmov" (cond [(eq_attr "got" "load") (const_string "load") diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 127e1b07fcf..cb817abcfde 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -66,6 +66,10 @@ UNSPEC_VFCVT UNSPEC_UNSIGNED_VFCVT UNSPEC_ROD + + UNSPEC_REDUC + UNSPEC_WREDUC_SUM + UNSPEC_WREDUC_USUM ]) (define_mode_iterator V [ @@ -93,6 +97,23 @@ (VNx4DI "TARGET_MIN_VLEN > 32") (VNx8DI "TARGET_MIN_VLEN > 32") ]) +(define_mode_iterator VI_ZVE32 [ + VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI + VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI + VNx1SI VNx2SI VNx4SI VNx8SI +]) + +(define_mode_iterator VWI [ + VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN > 32") + VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32") + VNx1SI VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") +]) + +(define_mode_iterator VWI_ZVE32 [ + VNx1QI 
VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI + VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI +]) + (define_mode_iterator VF [ (VNx1SF "TARGET_VECTOR_ELEN_FP_32") (VNx2SF "TARGET_VECTOR_ELEN_FP_32") @@ -105,6 +126,17 @@ (VNx8DF "TARGET_VECTOR_ELEN_FP_64") ]) +(define_mode_iterator VF_ZVE32 [ + (VNx1SF "TARGET_VECTOR_ELEN_FP_32") + (VNx2SF "TARGET_VECTOR_ELEN_FP_32") + (VNx4SF "TARGET_VECTOR_ELEN_FP_32") + (VNx8SF "TARGET_VECTOR_ELEN_FP_32") +]) + +(define_mode_iterator VWF [ + VNx1SF VNx2SF VNx4SF VNx8SF (VNx16SF "TARGET_MIN_VLEN > 32") +]) + (define_mode_iterator VFULLI [ VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN > 32") VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32") @@ -334,6 +366,96 @@ (VNx1DF "VNx1SI") (VNx2DF "VNx2SI") (VNx4DF "VNx4SI") (VNx8DF "VNx8SI") ]) +(define_mode_attr VLMUL1 [ + (VNx1QI "VNx8QI") (VNx2QI "VNx8QI") (VNx4QI "VNx8QI") + (VNx8QI "VNx8QI") (VNx16QI "VNx8QI") (VNx32QI "VNx8QI") (VNx64QI "VNx8QI") + (VNx1HI "VNx4HI") (VNx2HI "VNx4HI") (VNx4HI "VNx4HI") + (VNx8HI "VNx4HI") (VNx16HI "VNx4HI") (VNx32HI "VNx4HI") + (VNx1SI "VNx2SI") (VNx2SI "VNx2SI") (VNx4SI "VNx2SI") + (VNx8SI "VNx2SI") (VNx16SI "VNx2SI") + (VNx1DI "VNx1DI") (VNx2DI "VNx1DI") + (VNx4DI "VNx1DI") (VNx8DI "VNx1DI") + (VNx1SF "VNx2SF") (VNx2SF "VNx2SF") + (VNx4SF "VNx2SF") (VNx8SF "VNx2SF") (VNx16SF "VNx2SF") + (VNx1DF "VNx1DF") (VNx2DF "VNx1DF") + (VNx4DF "VNx1DF") (VNx8DF "VNx1DF") +]) + +(define_mode_attr VLMUL1_ZVE32 [ + (VNx1QI "VNx4QI") (VNx2QI "VNx4QI") (VNx4QI "VNx4QI") + (VNx8QI "VNx4QI") (VNx16QI "VNx4QI") (VNx32QI "VNx4QI") + (VNx1HI "VNx2HI") (VNx2HI "VNx2HI") (VNx4HI "VNx2HI") + (VNx8HI "VNx2HI") (VNx16HI "VNx2HI") + (VNx1SI "VNx1SI") (VNx2SI "VNx1SI") (VNx4SI "VNx1SI") + (VNx8SI "VNx1SI") + (VNx1SF "VNx2SF") (VNx2SF "VNx2SF") + (VNx4SF "VNx2SF") (VNx8SF "VNx2SF") +]) + +(define_mode_attr VWLMUL1 [ + (VNx1QI "VNx4HI") (VNx2QI "VNx4HI") (VNx4QI "VNx4HI") + (VNx8QI "VNx4HI") (VNx16QI "VNx4HI") (VNx32QI "VNx4HI") (VNx64QI "VNx4HI") + (VNx1HI 
"VNx2SI") (VNx2HI "VNx2SI") (VNx4HI "VNx2SI") + (VNx8HI "VNx2SI") (VNx16HI "VNx2SI") (VNx32HI "VNx2SI") + (VNx1SI "VNx1DI") (VNx2SI "VNx1DI") (VNx4SI "VNx1DI") + (VNx8SI "VNx1DI") (VNx16SI "VNx1DI") + (VNx1SF "VNx1DF") (VNx2SF "VNx1DF") + (VNx4SF "VNx1DF") (VNx8SF "VNx1DF") (VNx16SF "VNx1DF") +]) + +(define_mode_attr VWLMUL1_ZVE32 [ + (VNx1QI "VNx2HI") (VNx2QI "VNx2HI") (VNx4QI "VNx2HI") + (VNx8QI "VNx2HI") (VNx16QI "VNx2HI") (VNx32QI "VNx2HI") + (VNx1HI "VNx1SI") (VNx2HI "VNx1SI") (VNx4HI "VNx1SI") + (VNx8HI "VNx1SI") (VNx16HI "VNx1SI") +]) + +(define_mode_attr vlmul1 [ + (VNx1QI "vnx8qi") (VNx2QI "vnx8qi") (VNx4QI "vnx8qi") + (VNx8QI "vnx8qi") (VNx16QI "vnx8qi") (VNx32QI "vnx8qi") (VNx64QI "vnx8qi") + (VNx1HI "vnx4hi") (VNx2HI "vnx4hi") (VNx4HI "vnx4hi") + (VNx8HI "vnx4hi") (VNx16HI "vnx4hi") (VNx32HI "vnx4hi") + (VNx1SI "vnx2si") (VNx2SI "vnx2si") (VNx4SI "vnx2si") + (VNx8SI "vnx2si") (VNx16SI "vnx2si") + (VNx1DI "vnx1DI") (VNx2DI "vnx1DI") + (VNx4DI "vnx1DI") (VNx8DI "vnx1DI") + (VNx1SF "vnx2sf") (VNx2SF "vnx2sf") + (VNx4SF "vnx2sf") (VNx8SF "vnx2sf") (VNx16SF "vnx2sf") + (VNx1DF "vnx1df") (VNx2DF "vnx1df") + (VNx4DF "vnx1df") (VNx8DF "vnx1df") +]) + +(define_mode_attr vlmul1_zve32 [ + (VNx1QI "vnx4qi") (VNx2QI "vnx4qi") (VNx4QI "vnx4qi") + (VNx8QI "vnx4qi") (VNx16QI "vnx4qi") (VNx32QI "vnx4qi") + (VNx1HI "vnx2hi") (VNx2HI "vnx2hi") (VNx4HI "vnx2hi") + (VNx8HI "vnx2hi") (VNx16HI "vnx2hi") + (VNx1SI "vnx1si") (VNx2SI "vnx1si") (VNx4SI "vnx1si") + (VNx8SI "vnx1si") + (VNx1SF "vnx1sf") (VNx2SF "vnx1sf") + (VNx4SF "vnx1sf") (VNx8SF "vnx1sf") +]) + +(define_mode_attr vwlmul1 [ + (VNx1QI "vnx4hi") (VNx2QI "vnx4hi") (VNx4QI "vnx4hi") + (VNx8QI "vnx4hi") (VNx16QI "vnx4hi") (VNx32QI "vnx4hi") (VNx64QI "vnx4hi") + (VNx1HI "vnx2si") (VNx2HI "vnx2si") (VNx4HI "vnx2si") + (VNx8HI "vnx2si") (VNx16HI "vnx2si") (VNx32HI "vnx2SI") + (VNx1SI "vnx2di") (VNx2SI "vnx2di") (VNx4SI "vnx2di") + (VNx8SI "vnx2di") (VNx16SI "vnx2di") + (VNx1SF "vnx1df") (VNx2SF "vnx1df") + (VNx4SF 
"vnx1df") (VNx8SF "vnx1df") (VNx16SF "vnx1df") +]) + +(define_mode_attr vwlmul1_zve32 [ + (VNx1QI "vnx2hi") (VNx2QI "vnx2hi") (VNx4QI "vnx2hi") + (VNx8QI "vnx2hi") (VNx16QI "vnx2hi") (VNx32QI "vnx2hi") + (VNx1HI "vnx1si") (VNx2HI "vnx1si") (VNx4HI "vnx1si") + (VNx8HI "vnx1si") (VNx16HI "vnx1SI") +]) + +(define_int_iterator WREDUC [UNSPEC_WREDUC_SUM UNSPEC_WREDUC_USUM]) + (define_int_iterator ORDER [UNSPEC_ORDERED UNSPEC_UNORDERED]) (define_int_iterator VMULH [UNSPEC_VMULHS UNSPEC_VMULHU UNSPEC_VMULHSU]) @@ -360,7 +482,8 @@ (define_int_attr v_su [(UNSPEC_VMULHS "") (UNSPEC_VMULHU "u") (UNSPEC_VMULHSU "su") (UNSPEC_VNCLIP "") (UNSPEC_VNCLIPU "u") - (UNSPEC_VFCVT "") (UNSPEC_UNSIGNED_VFCVT "u")]) + (UNSPEC_VFCVT "") (UNSPEC_UNSIGNED_VFCVT "u") + (UNSPEC_WREDUC_SUM "") (UNSPEC_WREDUC_USUM "u")]) (define_int_attr sat_op [(UNSPEC_VAADDU "aaddu") (UNSPEC_VAADD "aadd") (UNSPEC_VASUBU "asubu") (UNSPEC_VASUB "asub") (UNSPEC_VSMUL "smul") (UNSPEC_VSSRL "ssrl") @@ -418,6 +541,11 @@ (define_code_iterator any_fix [fix unsigned_fix]) (define_code_iterator any_float [float unsigned_float]) +(define_code_iterator any_reduc [plus umax smax umin smin and ior xor]) +(define_code_iterator any_freduc [smax smin]) +(define_code_attr reduc [(plus "sum") (umax "maxu") (smax "max") (umin "minu") + (smin "min") (and "and") (ior "or") (xor "xor")]) + (define_code_attr fix_cvt [(fix "fix_trunc") (unsigned_fix "fixuns_trunc")]) (define_code_attr float_cvt [(float "float") (unsigned_float "floatuns")]) diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 715a63a40de..8a7bfab8272 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -48,7 +48,7 @@ vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\ vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,\ vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\ - vired,viwred,vfred,vfredo,vfwred,vfwredo,\ + vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\ vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,\ 
vislide,vislide1,vfslide1,vgather,vcompress") (const_string "true")] @@ -68,7 +68,7 @@ vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\ vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,\ vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\ - vired,viwred,vfred,vfredo,vfwred,vfwredo,\ + vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\ vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovxv,vfmovfv,\ vislide,vislide1,vfslide1,vgather,vcompress") (const_string "true")] @@ -151,7 +151,8 @@ vfwalu,vfwmul,vfsqrt,vfrecp,vfsgnj,vfcmp,\ vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\ vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,\ - vfncvtftof,vfmuladd,vfwmuladd,vfclass") + vfncvtftof,vfmuladd,vfwmuladd,vfclass,vired, + viwred,vfredu,vfredo,vfwredu,vfwredo") (const_int INVALID_ATTRIBUTE) (eq_attr "mode" "VNx1QI,VNx1BI") (symbol_ref "riscv_vector::get_ratio(E_VNx1QImode)") @@ -206,7 +207,8 @@ viwmul,vnshift,vaalu,vsmul,vsshift,vnclip,vmsfs,\ vmiota,vmidx,vfalu,vfmul,vfminmax,vfdiv,vfwalu,vfwmul,\ vfsqrt,vfrecp,vfsgnj,vfcmp,vfcvtitof,vfcvtftoi,vfwcvtitof,\ - vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,vfclass") + vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,vfclass,\ + vired,viwred,vfredu,vfredo,vfwredu,vfwredo") (const_int 2) (eq_attr "type" "vimerge,vfmerge") @@ -234,7 +236,7 @@ (eq_attr "type" "vldux,vldox,vialu,vshift,viminmax,vimul,vidiv,vsalu,\ viwalu,viwmul,vnshift,vimerge,vaalu,vsmul,\ vsshift,vnclip,vfalu,vfmul,vfminmax,vfdiv,vfwalu,vfwmul,\ - vfsgnj,vfmerge") + vfsgnj,vfmerge,vired,viwred,vfredu,vfredo,vfwredu,vfwredo") (const_int 5) (eq_attr "type" "vicmp,vimuladd,viwmuladd,vfcmp,vfmuladd,vfwmuladd") @@ -261,7 +263,8 @@ (eq_attr "type" "vldux,vldox,vialu,vshift,viminmax,vimul,vidiv,vsalu,\ viwalu,viwmul,vnshift,vimerge,vaalu,vsmul,\ vsshift,vnclip,vfalu,vfmul,vfminmax,vfdiv,\ - vfwalu,vfwmul,vfsgnj,vfmerge") + vfwalu,vfwmul,vfsgnj,vfmerge,vired,viwred,vfredu,\ + vfredo,vfwredu,vfwredo") (symbol_ref "riscv_vector::get_ta(operands[6])") (eq_attr "type" 
"vimuladd,viwmuladd,vfmuladd,vfwmuladd") @@ -302,7 +305,8 @@ (define_attr "avl_type" "" (cond [(eq_attr "type" "vlde,vlde,vste,vimov,vimov,vimov,vfmov,vext,vimerge,\ vfsqrt,vfrecp,vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\ - vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,vfclass") + vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\ + vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo") (symbol_ref "INTVAL (operands[7])") (eq_attr "type" "vldm,vstm,vimov,vmalu,vmalu") (symbol_ref "INTVAL (operands[5])") @@ -6181,3 +6185,208 @@ "vfncvt.rod.f.f.w\t%0,%3%p1" [(set_attr "type" "vfncvtftof") (set_attr "mode" "")]) + +;; ------------------------------------------------------------------------------- +;; ---- Predicated reduction operations +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 14.1 Vector Single-Width Integer Reduction Instructions +;; - 14.2 Vector Widening Integer Reduction Instructions +;; - 14.3 Vector Single-Width Floating-Point Reduction Instructions +;; - 14.4 Vector Widening Floating-Point Reduction Instructions +;; ------------------------------------------------------------------------------- + +;; For reduction operations, we need separate patterns for +;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32, +;; since a reduction takes an LMUL = 1 scalar operand as its input operand +;; and the LMUL = 1 mode differs between the two. 
+;; For example, the LMUL = 1 mode corresponding to VNx16QImode is VNx4QImode +;; for -march=rv*zve32*, whereas it is VNx8QImode for -march=rv*zve64*. +(define_insn "@pred_reduc_" + [(set (match_operand: 0 "register_operand" "=vd, vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" " vm,Wc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (any_reduc:VI + (vec_duplicate:VI + (vec_select: + (match_operand: 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]))) + (match_operand:VI 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" "0vu,0vu")] UNSPEC_REDUC))] + "TARGET_VECTOR && TARGET_MIN_VLEN > 32" + "vred.vs\t%0,%3,%4%p1" + [(set_attr "type" "vired") + (set_attr "mode" "")]) + +(define_insn "@pred_reduc_" + [(set (match_operand: 0 "register_operand" "=vd, vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" " vm,Wc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (any_reduc:VI_ZVE32 + (vec_duplicate:VI_ZVE32 + (vec_select: + (match_operand: 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]))) + (match_operand:VI_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" "0vu,0vu")] UNSPEC_REDUC))] + "TARGET_VECTOR && TARGET_MIN_VLEN == 32" + "vred.vs\t%0,%3,%4%p1" + [(set_attr "type" "vired") + (set_attr "mode" "")]) + +(define_insn "@pred_widen_reduc_plus" + [(set (match_operand: 0 "register_operand" "=&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" "vmWc1") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI 
VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operand:VWI 3 "register_operand" " vr") + (match_operand: 4 "register_operand" " vr") + (match_operand: 2 "vector_merge_operand" " 0vu")] WREDUC))] + "TARGET_VECTOR && TARGET_MIN_VLEN > 32" + "vwredsum.vs\t%0,%3,%4%p1" + [(set_attr "type" "viwred") + (set_attr "mode" "")]) + +(define_insn "@pred_widen_reduc_plus" + [(set (match_operand: 0 "register_operand" "=&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" "vmWc1") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operand:VWI_ZVE32 3 "register_operand" " vr") + (match_operand: 4 "register_operand" " vr") + (match_operand: 2 "vector_merge_operand" " 0vu")] WREDUC))] + "TARGET_VECTOR && TARGET_MIN_VLEN == 32" + "vwredsum.vs\t%0,%3,%4%p1" + [(set_attr "type" "viwred") + (set_attr "mode" "")]) + +(define_insn "@pred_reduc_" + [(set (match_operand: 0 "register_operand" "=vd, vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" " vm,Wc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (any_freduc:VF + (vec_duplicate:VF + (vec_select: + (match_operand: 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]))) + (match_operand:VF 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" "0vu,0vu")] UNSPEC_REDUC))] + "TARGET_VECTOR && TARGET_MIN_VLEN > 32" + "vfred.vs\t%0,%3,%4%p1" + [(set_attr "type" "vfredu") + (set_attr "mode" "")]) + +(define_insn "@pred_reduc_" + [(set (match_operand: 0 "register_operand" "=vd, vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" " vm,Wc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 
"const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (any_freduc:VF_ZVE32 + (vec_duplicate:VF_ZVE32 + (vec_select: + (match_operand: 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]))) + (match_operand:VF_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" "0vu,0vu")] UNSPEC_REDUC))] + "TARGET_VECTOR && TARGET_MIN_VLEN == 32" + "vfred.vs\t%0,%3,%4%p1" + [(set_attr "type" "vfredu") + (set_attr "mode" "")]) + +(define_insn "@pred_reduc_plus" + [(set (match_operand: 0 "register_operand" "=vd, vr") + (unspec: + [(unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" " vm,Wc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (plus:VF + (vec_duplicate:VF + (vec_select: + (match_operand: 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]))) + (match_operand:VF 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" "0vu,0vu")] UNSPEC_REDUC)] ORDER))] + "TARGET_VECTOR && TARGET_MIN_VLEN > 32" + "vfredsum.vs\t%0,%3,%4%p1" + [(set_attr "type" "vfred") + (set_attr "mode" "")]) + +(define_insn "@pred_reduc_plus" + [(set (match_operand: 0 "register_operand" "=vd, vr") + (unspec: + [(unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" " vm,Wc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (plus:VF_ZVE32 + (vec_duplicate:VF_ZVE32 + (vec_select: + (match_operand: 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]))) + (match_operand:VF_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" "0vu,0vu")] UNSPEC_REDUC)] ORDER))] + 
"TARGET_VECTOR && TARGET_MIN_VLEN == 32" + "vfredsum.vs\t%0,%3,%4%p1" + [(set_attr "type" "vfred") + (set_attr "mode" "")]) + +(define_insn "@pred_widen_reduc_plus" + [(set (match_operand: 0 "register_operand" "=&vr") + (unspec: + [(unspec: + [(unspec: + [(match_operand: 1 "vector_mask_operand" "vmWc1") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operand:VWF 3 "register_operand" " vr") + (match_operand: 4 "register_operand" " vr") + (match_operand: 2 "vector_merge_operand" " 0vu")] UNSPEC_WREDUC_SUM)] ORDER))] + "TARGET_VECTOR && TARGET_MIN_VLEN > 32" + "vfwredsum.vs\t%0,%3,%4%p1" + [(set_attr "type" "vfwred") + (set_attr "mode" "")]) -- 2.36.1
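
For readers less familiar with the instructions these patterns describe, the
following is a rough scalar sketch in C of what the single-width and widening
integer sum reductions compute, plus the LMUL = 1 element-count arithmetic
behind the zve32/zve64 split. It is not code from this patch. The names
lmul1_nunits, ref_vredsum_i32 and ref_vwredsum_i16 are hypothetical, chosen
only for illustration. The sketch assumes that the N in VNx<N> counts elements
per TARGET_MIN_VLEN-bit chunk, which agrees with the VNx4QImode / VNx8QImode
example in the comment in vector.md, and it deliberately ignores the merge
operand, the tail policy and the vl == 0 case (where the instruction leaves
the destination untouched).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): number of elements in the
   LMUL = 1 mode, assuming one TARGET_MIN_VLEN-bit chunk per LMUL = 1
   register.  */
static unsigned
lmul1_nunits (unsigned min_vlen_bits, unsigned elem_bits)
{
  assert (elem_bits != 0 && min_vlen_bits % elem_bits == 0);
  return min_vlen_bits / elem_bits;
}

/* Scalar model of a (possibly masked) vredsum.vs on 32-bit elements:
   result = vs1[0] + sum of the active vs2[i] for i < vl.
   A NULL mask means "unmasked".  The sum wraps modulo 2^32; converting the
   wrapped value back to int32_t relies on the usual two's-complement
   behaviour of GCC.  */
static int32_t
ref_vredsum_i32 (const uint8_t *mask, const int32_t *vs2,
                 int32_t vs1_elt0, size_t vl)
{
  uint32_t acc = (uint32_t) vs1_elt0;
  for (size_t i = 0; i < vl; i++)
    if (mask == NULL || mask[i])
      acc += (uint32_t) vs2[i];
  return (int32_t) acc;
}

/* Scalar model of vwredsum.vs with 16-bit sources: each active element is
   sign-extended to 32 bits before being added to the 32-bit scalar taken
   from element 0 of vs1.  */
static int32_t
ref_vwredsum_i16 (const uint8_t *mask, const int16_t *vs2,
                  int32_t vs1_elt0, size_t vl)
{
  uint32_t acc = (uint32_t) vs1_elt0;
  for (size_t i = 0; i < vl; i++)
    if (mask == NULL || mask[i])
      acc += (uint32_t) (int32_t) vs2[i];
  return (int32_t) acc;
}
```

Under these assumptions, lmul1_nunits (32, 8) gives 4 and lmul1_nunits (64, 8)
gives 8, matching the two LMUL = 1 modes quoted for VNx16QImode, and
lmul1_nunits (32, 16) gives 2, matching the "vnx2hi" entries of vwlmul1_zve32.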