From: Kito Cheng
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r13-6484] RISC-V: Add scalar move support and fix VSETVL bugs
Date: Sun, 5 Mar 2023 09:17:13 +0000 (GMT)
X-Git-Author: Ju-Zhe Zhong
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 602cfc746e9e0447221896a3d93608c6db3a89e5
X-Git-Newrev: ec99ffabc3d32bbc0cce164e84942e176c13e75c

https://gcc.gnu.org/g:ec99ffabc3d32bbc0cce164e84942e176c13e75c

commit r13-6484-gec99ffabc3d32bbc0cce164e84942e176c13e75c
Author: Ju-Zhe Zhong
Date:   Fri Feb 24 23:18:02 2023 +0800

    RISC-V: Add scalar move support and fix VSETVL bugs

    gcc/ChangeLog:

            * config/riscv/constraints.md (Wb1): New constraint.
            * config/riscv/predicates.md (vector_least_significant_set_mask_operand): New predicate.
            (vector_broadcast_mask_operand): Ditto.
            * config/riscv/riscv-protos.h (enum vlmul_type): Adjust.
            (gen_scalar_move_mask): New function.
            * config/riscv/riscv-v.cc (gen_scalar_move_mask): Ditto.
            * config/riscv/riscv-vector-builtins-bases.cc (class vmv): New class.
            (class vmv_s): Ditto.
            (BASE): Ditto.
            * config/riscv/riscv-vector-builtins-bases.h: Ditto.
            * config/riscv/riscv-vector-builtins-functions.def (vmv_x): Ditto.
            (vmv_s): Ditto.
            (vfmv_f): Ditto.
            (vfmv_s): Ditto.
            * config/riscv/riscv-vector-builtins-shapes.cc (struct scalar_move_def): Ditto.
            (SHAPE): Ditto.
            * config/riscv/riscv-vector-builtins-shapes.h: Ditto.
            * config/riscv/riscv-vector-builtins.cc (function_expander::mask_mode): Ditto.
            (function_expander::use_exact_insn): New function.
            (function_expander::use_contiguous_load_insn): New function.
            (function_expander::use_contiguous_store_insn): New function.
            (function_expander::use_ternop_insn): New function.
            (function_expander::use_widen_ternop_insn): New function.
            (function_expander::use_scalar_move_insn): New function.
            * config/riscv/riscv-vector-builtins.def (s): New operand suffix.
            * config/riscv/riscv-vector-builtins.h (function_expander::add_scalar_move_mask_operand): New class.
            * config/riscv/riscv-vsetvl.cc (ignore_vlmul_insn_p): New function.
            (scalar_move_insn_p): Ditto.
            (has_vsetvl_killed_avl_p): Ditto.
            (anticipatable_occurrence_p): Ditto.
            (insert_vsetvl): Ditto.
            (get_vl_vtype_info): Ditto.
            (calculate_sew): Ditto.
            (calculate_vlmul): Ditto.
            (incompatible_avl_p): Ditto.
            (different_sew_p): Ditto.
            (different_lmul_p): Ditto.
            (different_ratio_p): Ditto.
            (different_tail_policy_p): Ditto.
            (different_mask_policy_p): Ditto.
            (possible_zero_avl_p): Ditto.
            (first_ratio_invalid_for_second_sew_p): Ditto.
            (first_ratio_invalid_for_second_lmul_p): Ditto.
            (second_ratio_invalid_for_first_sew_p): Ditto.
            (second_ratio_invalid_for_first_lmul_p): Ditto.
            (second_sew_less_than_first_sew_p): Ditto.
            (first_sew_less_than_second_sew_p): Ditto.
            (compare_lmul): Ditto.
            (second_lmul_less_than_first_lmul_p): Ditto.
            (first_lmul_less_than_second_lmul_p): Ditto.
            (first_ratio_less_than_second_ratio_p): Ditto.
            (second_ratio_less_than_first_ratio_p): Ditto.
            (DEF_INCOMPATIBLE_COND): Ditto.
            (greatest_sew): Ditto.
            (first_sew): Ditto.
            (second_sew): Ditto.
            (first_vlmul): Ditto.
            (second_vlmul): Ditto.
            (first_ratio): Ditto.
            (second_ratio): Ditto.
            (vlmul_for_first_sew_second_ratio): Ditto.
            (ratio_for_second_sew_first_vlmul): Ditto.
            (DEF_SEW_LMUL_FUSE_RULE): Ditto.
            (always_unavailable): Ditto.
            (avl_unavailable_p): Ditto.
            (sew_unavailable_p): Ditto.
            (lmul_unavailable_p): Ditto.
            (ge_sew_unavailable_p): Ditto.
            (ge_sew_lmul_unavailable_p): Ditto.
            (ge_sew_ratio_unavailable_p): Ditto.
            (DEF_UNAVAILABLE_COND): Ditto.
            (same_sew_lmul_demand_p): Ditto.
            (propagate_avl_across_demands_p): Ditto.
            (reg_available_p): Ditto.
            (avl_info::has_non_zero_avl): Ditto.
            (vl_vtype_info::has_non_zero_avl): Ditto.
            (vector_insn_info::operator>=): Refactor.
            (vector_insn_info::parse_insn): Adjust for scalar move.
            (vector_insn_info::demand_vl_vtype): Remove.
            (vector_insn_info::compatible_p): New function.
            (vector_insn_info::compatible_avl_p): Ditto.
            (vector_insn_info::compatible_vtype_p): Ditto.
            (vector_insn_info::available_p): Ditto.
            (vector_insn_info::merge): Ditto.
            (vector_insn_info::fuse_avl): Ditto.
            (vector_insn_info::fuse_sew_lmul): Ditto.
            (vector_insn_info::fuse_tail_policy): Ditto.
            (vector_insn_info::fuse_mask_policy): Ditto.
            (vector_insn_info::dump): Ditto.
            (vector_infos_manager::release): Ditto.
            (pass_vsetvl::compute_local_backward_infos): Adjust for scalar move support.
            (pass_vsetvl::get_backward_fusion_type): Adjust for scalar move support.
            (pass_vsetvl::hard_empty_block_p): Ditto.
            (pass_vsetvl::backward_demand_fusion): Ditto.
            (pass_vsetvl::forward_demand_fusion): Ditto.
            (pass_vsetvl::refine_vsetvls): Ditto.
            (pass_vsetvl::cleanup_vsetvls): Ditto.
            (pass_vsetvl::commit_vsetvls): Ditto.
            (pass_vsetvl::propagate_avl): Ditto.
            * config/riscv/riscv-vsetvl.h (enum demand_status): New class.
            (struct demands_pair): Ditto.
            (struct demands_cond): Ditto.
            (struct demands_fuse_rule): Ditto.
            * config/riscv/vector-iterators.md: New iterator.
            * config/riscv/vector.md (@pred_broadcast): New pattern.
            (*pred_broadcast): Ditto.
            (*pred_broadcast_extended_scalar): Ditto.
            (@pred_extract_first): Ditto.
            (*pred_extract_first): Ditto.
            (@pred_extract_first_trunc): Ditto.
            * config/riscv/riscv-vsetvl.def: New file.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c: Adjust test.
            * gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c: Ditto.

Diff:
---
 gcc/config/riscv/constraints.md                    |   6 +
 gcc/config/riscv/predicates.md                     |  20 +-
 gcc/config/riscv/riscv-protos.h                    |   2 +
 gcc/config/riscv/riscv-v.cc                        |  11 +
 gcc/config/riscv/riscv-vector-builtins-bases.cc    |  34 +
 gcc/config/riscv/riscv-vector-builtins-bases.h     |   4 +
 .../riscv/riscv-vector-builtins-functions.def      |  18 +-
 gcc/config/riscv/riscv-vector-builtins-shapes.cc   |  17 +
 gcc/config/riscv/riscv-vector-builtins-shapes.h    |   1 +
 gcc/config/riscv/riscv-vector-builtins.cc          |  87 +-
 gcc/config/riscv/riscv-vector-builtins.def         |   1 +
 gcc/config/riscv/riscv-vector-builtins.h           |  10 +
 gcc/config/riscv/riscv-vsetvl.cc                   | 931 ++++++++++++++++-----
 gcc/config/riscv/riscv-vsetvl.def                  | 684 +++++++++++++++
 gcc/config/riscv/riscv-vsetvl.h                    |  86 +-
 gcc/config/riscv/vector-iterators.md               |  44 +-
 gcc/config/riscv/vector.md                         | 243 +++++-
 .../gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c     |   2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c     |   2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c     |   2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c     |   4 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c     |   2 +-
 .../gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c      |   2 +-
 23 files changed, 1911 insertions(+), 302 deletions(-)

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index a051d466ae2..9d7ca487db7 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -162,6 +162,12 @@
   (and (match_code "const_vector")
        (match_test "op == CONSTM1_RTX (GET_MODE (op))")))

+(define_constraint "Wb1"
+  "@internal
+   A constraint that matches a BOOL vector of {...,0,...0,1}"
+  (and (match_code "const_vector")
+       (match_test "rtx_equal_p (op, riscv_vector::gen_scalar_move_mask (GET_MODE (op)))")))
+
 (define_memory_constraint "Wdm"
   "Vector duplicate memory operand"
   (and (match_code "mem")
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 7bc7c0b4f4d..06a51325537 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -284,13 +284,22 @@
                     || satisfies_constraint_Wc0 (op)"))))

 (define_predicate "vector_all_trues_mask_operand"
-  (ior (match_operand 0 "register_operand")
+  (and (match_code "const_vector")
        (match_test "op == CONSTM1_RTX (GET_MODE (op))")))

+(define_predicate "vector_least_significant_set_mask_operand"
+  (and (match_code "const_vector")
+       (match_test "rtx_equal_p (op, riscv_vector::gen_scalar_move_mask (GET_MODE (op)))")))
+
 (define_predicate "vector_mask_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "vector_all_trues_mask_operand")))

+(define_predicate "vector_broadcast_mask_operand"
+  (ior (match_operand 0 "vector_least_significant_set_mask_operand")
+    (ior (match_operand 0 "register_operand")
+         (match_operand 0 "vector_all_trues_mask_operand"))))
+
 (define_predicate "vector_undef_operand"
   (match_test "rtx_equal_p (op, RVV_VUNDEF (GET_MODE (op)))"))

@@ -340,10 +349,13 @@
 ;; The scalar operand can be directly broadcast by RVV instructions.
 (define_predicate "direct_broadcast_operand"
   (and (match_test "!(reload_completed && !FLOAT_MODE_P (GET_MODE (op))
-               && register_operand (op, GET_MODE (op))
+               && (register_operand (op, GET_MODE (op)) || CONST_INT_P (op)
+                   || rtx_equal_p (op, CONST0_RTX (GET_MODE (op))))
                && maybe_gt (GET_MODE_BITSIZE (GET_MODE (op)), GET_MODE_BITSIZE (Pmode)))")
-    (ior (match_operand 0 "register_operand")
-         (match_test "satisfies_constraint_Wdm (op)"))))
+    (ior (match_test "rtx_equal_p (op, CONST0_RTX (GET_MODE (op)))")
+         (ior (match_operand 0 "const_int_operand")
+              (ior (match_operand 0 "register_operand")
+                   (match_test "satisfies_constraint_Wdm (op)"))))))

 ;; A CONST_INT operand that has exactly two bits cleared.
 (define_predicate "const_nottwobits_operand"
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 37c634eca1d..9e017b49c19 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -133,6 +133,7 @@ enum vlmul_type
   LMUL_F8 = 5,
   LMUL_F4 = 6,
   LMUL_F2 = 7,
+  NUM_LMUL = 8
 };

 enum avl_type
@@ -183,6 +184,7 @@ bool has_vi_variant_p (rtx_code, rtx);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, machine_mode,
                           bool, void (*)(rtx *, rtx));
+rtx gen_scalar_move_mask (machine_mode);
 }

 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 59c25c65cd5..c2209990882 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -42,6 +42,7 @@
 #include "expr.h"
 #include "optabs.h"
 #include "tm-constrs.h"
+#include "rtx-vector-builder.h"

 using namespace riscv_vector;

@@ -484,4 +485,14 @@ sew64_scalar_helper (rtx *operands, rtx *scalar_op, rtx vl,
   return true;
 }

+/* Get { ... ,0, 0, 0, ..., 0, 0, 0, 1 } mask.
+   */
+rtx
+gen_scalar_move_mask (machine_mode mode)
+{
+  rtx_vector_builder builder (mode, 1, 2);
+  builder.quick_push (const1_rtx);
+  builder.quick_push (const0_rtx);
+  return builder.build ();
+}
+
 } // namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index f6ed2e53453..7b27cc31fc7 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1341,6 +1341,32 @@ public:
   }
 };

+/* Implements vmv/vfmv instructions.  */
+class vmv : public function_base
+{
+public:
+  bool apply_vl_p () const override { return false; }
+  bool apply_tail_policy_p () const override { return false; }
+  bool apply_mask_policy_p () const override { return false; }
+  bool use_mask_predication_p () const override { return false; }
+  bool has_merge_operand_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+    return e.use_exact_insn (code_for_pred_extract_first (e.vector_mode ()));
+  }
+};
+
+/* Implements vmv.s.x/vfmv.s.f.  */
+class vmv_s : public function_base
+{
+public:
+  rtx expand (function_expander &e) const override
+  {
+    return e.use_scalar_move_insn (code_for_pred_broadcast (e.vector_mode ()));
+  }
+};
+
 static CONSTEXPR const vsetvl vsetvl_obj;
 static CONSTEXPR const vsetvl vsetvlmax_obj;
 static CONSTEXPR const loadstore vle_obj;
@@ -1530,6 +1556,10 @@ static CONSTEXPR const reducop vfredmax_obj;
 static CONSTEXPR const reducop vfredmin_obj;
 static CONSTEXPR const widen_freducop vfwredusum_obj;
 static CONSTEXPR const widen_freducop vfwredosum_obj;
+static CONSTEXPR const vmv vmv_x_obj;
+static CONSTEXPR const vmv_s vmv_s_obj;
+static CONSTEXPR const vmv vfmv_f_obj;
+static CONSTEXPR const vmv_s vfmv_s_obj;

 /* Declare the function base NAME, pointing it to an instance
    of class _obj.
    */
@@ -1725,5 +1755,9 @@ BASE (vfredmax)
 BASE (vfredmin)
 BASE (vfwredosum)
 BASE (vfwredusum)
+BASE (vmv_x)
+BASE (vmv_s)
+BASE (vfmv_f)
+BASE (vfmv_s)

 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 9f0e4675f81..ad1ee207d2f 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -219,6 +219,10 @@ extern const function_base *const vfredmax;
 extern const function_base *const vfredmin;
 extern const function_base *const vfwredosum;
 extern const function_base *const vfwredusum;
+extern const function_base *const vmv_x;
+extern const function_base *const vmv_s;
+extern const function_base *const vfmv_f;
+extern const function_base *const vfmv_s;
 }

 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 230b76cd0f2..cad98f6230d 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -464,6 +464,22 @@ DEF_RVV_FUNCTION (viota, mask_alu, full_preds, u_vm_ops)
 // 15.9. Vector Element Index Instruction
 DEF_RVV_FUNCTION (vid, alu, full_preds, u_v_ops)

-/* TODO: 16. Vector Permutation Instructions.  */
+/* 16. Vector Permutation Instructions.  */
+
+// 16.1. Integer Scalar Move Instructions
+DEF_RVV_FUNCTION (vmv_x, scalar_move, none_preds, iu_x_s_ops)
+DEF_RVV_FUNCTION (vmv_s, move, none_tu_preds, iu_s_x_ops)
+
+// 16.2. Floating-Point Scalar Move Instructions
+DEF_RVV_FUNCTION (vfmv_f, scalar_move, none_preds, f_f_s_ops)
+DEF_RVV_FUNCTION (vfmv_s, move, none_tu_preds, f_s_f_ops)
+
+// 16.3. Vector Slide Instructions
+
+// 16.4. Vector Register Gather Instructions
+
+// 16.5. Vector Compress Instruction
+
+// 16.6. Whole Vector Register Move

 #undef DEF_RVV_FUNCTION
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index b3f5951087d..d08a96c0764 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -402,6 +402,22 @@ struct reduc_alu_def : public build_base
   }
 };

+/* scalar_move_def class.  */
+struct scalar_move_def : public build_base
+{
+  char *get_name (function_builder &b, const function_instance &instance,
+                  bool overloaded_p) const override
+  {
+    b.append_base_name (instance.base_name);
+    if (overloaded_p)
+      return b.finish_name ();
+    b.append_name (operand_suffixes[instance.op_info->op]);
+    b.append_name (type_suffixes[instance.type.index].vector);
+    b.append_name (type_suffixes[instance.type.index].scalar);
+    return b.finish_name ();
+  }
+};
+
 SHAPE(vsetvl, vsetvl)
 SHAPE(vsetvl, vsetvlmax)
 SHAPE(loadstore, loadstore)
@@ -414,5 +430,6 @@ SHAPE(narrow_alu, narrow_alu)
 SHAPE(move, move)
 SHAPE(mask_alu, mask_alu)
 SHAPE(reduc_alu, reduc_alu)
+SHAPE(scalar_move, scalar_move)

 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.h b/gcc/config/riscv/riscv-vector-builtins-shapes.h
index 85769ea024a..a192b941fd8 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.h
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.h
@@ -36,6 +36,7 @@ extern const function_shape *const narrow_alu;
 extern const function_shape *const move;
 extern const function_shape *const mask_alu;
 extern const function_shape *const reduc_alu;
+extern const function_shape *const scalar_move;
 }

 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 2e92ece3b64..a430104f1e7 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -1106,6 +1106,22 @@ static CONSTEXPR const rvv_op_info iu_v_ops
      rvv_arg_type_info (RVV_BASE_vector), /* Return type */
      v_args /* Args */};

+/* A static operand information for scalar_type func (vector_type)
+ * function registration. */
+static CONSTEXPR const rvv_op_info iu_x_s_ops
+  = {iu_ops, /* Types */
+     OP_TYPE_s, /* Suffix */
+     rvv_arg_type_info (RVV_BASE_scalar), /* Return type */
+     v_args /* Args */};
+
+/* A static operand information for scalar_type func (vector_type)
+ * function registration. */
+static CONSTEXPR const rvv_op_info f_f_s_ops
+  = {f_ops, /* Types */
+     OP_TYPE_s, /* Suffix */
+     rvv_arg_type_info (RVV_BASE_scalar), /* Return type */
+     v_args /* Args */};
+
 /* A static operand information for vector_type func (vector_type)
  * function registration. */
 static CONSTEXPR const rvv_op_info iu_vs_ops
@@ -1291,6 +1307,14 @@ static CONSTEXPR const rvv_op_info iu_x_ops
      rvv_arg_type_info (RVV_BASE_vector), /* Return type */
      x_args /* Args */};

+/* A static operand information for vector_type func (scalar_type)
+ * function registration. */
+static CONSTEXPR const rvv_op_info iu_s_x_ops
+  = {iu_ops, /* Types */
+     OP_TYPE_x, /* Suffix */
+     rvv_arg_type_info (RVV_BASE_vector), /* Return type */
+     x_args /* Args */};
+
 /* A static operand information for vector_type func (scalar_type)
  * function registration. */
 static CONSTEXPR const rvv_op_info f_f_ops
@@ -1299,6 +1323,14 @@ static CONSTEXPR const rvv_op_info f_f_ops
      rvv_arg_type_info (RVV_BASE_vector), /* Return type */
      x_args /* Args */};

+/* A static operand information for vector_type func (scalar_type)
+ * function registration. */
+static CONSTEXPR const rvv_op_info f_s_f_ops
+  = {f_ops, /* Types */
+     OP_TYPE_f, /* Suffix */
+     rvv_arg_type_info (RVV_BASE_vector), /* Return type */
+     x_args /* Args */};
+
 /* A static operand information for vector_type func (double demote type)
  * function registration.
  */
 static CONSTEXPR const rvv_op_info i_vf2_ops
@@ -2448,14 +2480,19 @@ function_expander::add_mem_operand (machine_mode mode, unsigned argno)
   add_fixed_operand (mem);
 }

+/* Return the machine_mode of the corresponding mask type.  */
+machine_mode
+function_expander::mask_mode (void) const
+{
+  return TYPE_MODE (builtin_types[mask_types[type.index]].vector);
+}
+
 /* Implement the call using instruction ICODE, with a 1:1 mapping between
    arguments and input operands.  */
 rtx
 function_expander::use_exact_insn (insn_code icode)
 {
   machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
-  tree mask_type = builtin_types[mask_types[type.index]].vector;
-  machine_mode mask_mode = TYPE_MODE (mask_type);

   /* Record the offset to get the argument.  */
   int arg_offset = 0;
@@ -2465,7 +2502,7 @@ function_expander::use_exact_insn (insn_code icode)
       if (use_real_mask_p (pred))
         add_input_operand (arg_offset++);
       else
-        add_all_one_mask_operand (mask_mode);
+        add_all_one_mask_operand (mask_mode ());
     }

   /* Store operation doesn't have merge operand.  */
@@ -2485,7 +2522,8 @@ function_expander::use_exact_insn (insn_code icode)
   if (base->apply_mask_policy_p ())
     add_input_operand (Pmode, get_mask_policy_for_pred (pred));

-  add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+  if (base->apply_vl_p ())
+    add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
   return generate_insn (icode);
 }

@@ -2495,8 +2533,6 @@ function_expander::use_contiguous_load_insn (insn_code icode)
 {
   gcc_assert (call_expr_nargs (exp) > 0);
   machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
-  tree mask_type = builtin_types[mask_types[type.index]].vector;
-  machine_mode mask_mode = TYPE_MODE (mask_type);

   /* Record the offset to get the argument.
      */
   int arg_offset = 0;
@@ -2504,7 +2540,7 @@ function_expander::use_contiguous_load_insn (insn_code icode)
   if (use_real_mask_p (pred))
     add_input_operand (arg_offset++);
   else
-    add_all_one_mask_operand (mask_mode);
+    add_all_one_mask_operand (mask_mode ());

   if (use_real_merge_p (pred))
     add_input_operand (arg_offset++);
@@ -2534,8 +2570,6 @@ function_expander::use_contiguous_store_insn (insn_code icode)
 {
   gcc_assert (call_expr_nargs (exp) > 0);
   machine_mode mode = TYPE_MODE (builtin_types[type.index].vector);
-  tree mask_type = builtin_types[mask_types[type.index]].vector;
-  machine_mode mask_mode = TYPE_MODE (mask_type);

   /* Record the offset to get the argument.  */
   int arg_offset = 0;
@@ -2545,7 +2579,7 @@ function_expander::use_contiguous_store_insn (insn_code icode)
   if (use_real_mask_p (pred))
     add_input_operand (arg_offset++);
   else
-    add_all_one_mask_operand (mask_mode);
+    add_all_one_mask_operand (mask_mode ());

   arg_offset++;
   for (int argno = arg_offset; argno < call_expr_nargs (exp); argno++)
@@ -2601,8 +2635,6 @@ rtx
 function_expander::use_ternop_insn (bool vd_accum_p, insn_code icode)
 {
   machine_mode mode = TYPE_MODE (builtin_types[type.index].vector);
-  tree mask_type = builtin_types[mask_types[type.index]].vector;
-  machine_mode mask_mode = TYPE_MODE (mask_type);

   /* Record the offset to get the argument.
      */
   int arg_offset = 0;
@@ -2610,7 +2642,7 @@ function_expander::use_ternop_insn (bool vd_accum_p, insn_code icode)
   if (use_real_mask_p (pred))
     add_input_operand (arg_offset++);
   else
-    add_all_one_mask_operand (mask_mode);
+    add_all_one_mask_operand (mask_mode ());

   rtx vd = expand_normal (CALL_EXPR_ARG (exp, arg_offset++));
   rtx vs1 = expand_normal (CALL_EXPR_ARG (exp, arg_offset++));
@@ -2668,8 +2700,6 @@ rtx
 function_expander::use_widen_ternop_insn (insn_code icode)
 {
   machine_mode mode = TYPE_MODE (builtin_types[type.index].vector);
-  tree mask_type = builtin_types[mask_types[type.index]].vector;
-  machine_mode mask_mode = TYPE_MODE (mask_type);

   /* Record the offset to get the argument.  */
   int arg_offset = 0;
@@ -2677,7 +2707,7 @@ function_expander::use_widen_ternop_insn (insn_code icode)
   if (use_real_mask_p (pred))
     add_input_operand (arg_offset++);
   else
-    add_all_one_mask_operand (mask_mode);
+    add_all_one_mask_operand (mask_mode ());

   rtx merge = RVV_VUNDEF (mode);
   if (use_real_merge_p (pred))
@@ -2706,6 +2736,31 @@ function_expander::use_widen_ternop_insn (insn_code icode)
   return m_ops[0].value;
 }

+/* Implement the call using instruction ICODE, with a 1:1 mapping between
+   arguments and input operands.  */
+rtx
+function_expander::use_scalar_move_insn (insn_code icode)
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+
+  /* Record the offset to get the argument.  */
+  int arg_offset = 0;
+  add_scalar_move_mask_operand (mask_mode ());
+
+  if (use_real_merge_p (pred))
+    add_input_operand (arg_offset++);
+  else
+    add_vundef_operand (mode);
+
+  for (int argno = arg_offset; argno < call_expr_nargs (exp); argno++)
+    add_input_operand (argno);
+
+  add_input_operand (Pmode, get_tail_policy_for_pred (pred));
+  add_input_operand (Pmode, get_mask_policy_for_pred (pred));
+  add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+  return generate_insn (icode);
+}
+
 /* Generate instruction ICODE, given that its operands have already
    been added to M_OPS.
    Return the value of the first operand.  */
 rtx
diff --git a/gcc/config/riscv/riscv-vector-builtins.def b/gcc/config/riscv/riscv-vector-builtins.def
index bb672f3b449..5094f041f66 100644
--- a/gcc/config/riscv/riscv-vector-builtins.def
+++ b/gcc/config/riscv/riscv-vector-builtins.def
@@ -293,6 +293,7 @@ DEF_RVV_OP_TYPE (f_v)
 DEF_RVV_OP_TYPE (xu_v)
 DEF_RVV_OP_TYPE (f_w)
 DEF_RVV_OP_TYPE (xu_w)
+DEF_RVV_OP_TYPE (s)

 DEF_RVV_PRED_TYPE (ta)
 DEF_RVV_PRED_TYPE (tu)
diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
index ede08c6a480..8707f7366d9 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -342,6 +342,7 @@ public:
   void add_input_operand (unsigned);
   void add_output_operand (machine_mode, rtx);
   void add_all_one_mask_operand (machine_mode);
+  void add_scalar_move_mask_operand (machine_mode);
   void add_vundef_operand (machine_mode);
   void add_fixed_operand (rtx);
   void add_integer_operand (rtx);
@@ -350,6 +351,7 @@ public:
   machine_mode vector_mode (void) const;
   machine_mode index_mode (void) const;
   machine_mode arg_mode (int) const;
+  machine_mode mask_mode (void) const;

   rtx use_exact_insn (insn_code);
   rtx use_contiguous_load_insn (insn_code);
@@ -357,6 +359,7 @@ public:
   rtx use_compare_insn (rtx_code, insn_code);
   rtx use_ternop_insn (bool, insn_code);
   rtx use_widen_ternop_insn (insn_code);
+  rtx use_scalar_move_insn (insn_code);
   rtx generate_insn (insn_code);

   /* The function call expression.  */
@@ -470,6 +473,13 @@ function_expander::add_all_one_mask_operand (machine_mode mode)
   add_input_operand (mode, CONSTM1_RTX (mode));
 }

+/* Add mask operand for scalar move instruction.  */
+inline void
+function_expander::add_scalar_move_mask_operand (machine_mode mode)
+{
+  add_input_operand (mode, gen_scalar_move_mask (mode));
+}
+
 /* Add an operand that must be X.  The only way of legitimizing an
    invalid X is to reload the address of a MEM.
    */
 inline void
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3fbdd862242..9e25102a4f2 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -63,7 +63,16 @@ along with GCC; see the file COPYING3. If not see

     -  The subroutine of optimize > 0 is lazy_vsetvl.
        This function optimize vsetvl insertion process by
-       lazy code motion (LCM) layering on RTL_SSA.  */
+       lazy code motion (LCM) layering on RTL_SSA.
+
+    -  get_avl (), get_insn (), get_avl_source ():
+
+       1. get_insn () is the current instruction, find_access (get_insn
+   ())->def is the same as get_avl_source () if get_insn () demand VL.
+       2. If get_avl () is non-VLMAX REG, get_avl () == get_avl_source
+   ()->regno ().
+       3. get_avl_source ()->regno () is the REGNO that we backward propagate.
+ */

 #define IN_TARGET_CODE 1
 #define INCLUDE_ALGORITHM
@@ -94,6 +103,12 @@ along with GCC; see the file COPYING3. If not see
 using namespace rtl_ssa;
 using namespace riscv_vector;

+static CONSTEXPR const unsigned ALL_SEW[] = {8, 16, 32, 64};
+static CONSTEXPR const vlmul_type ALL_LMUL[]
+  = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2};
+static CONSTEXPR const demand_type SEW_LMUL_RELATED_DEMAND[]
+  = {DEMAND_SEW, DEMAND_LMUL, DEMAND_RATIO, DEMAND_GE_SEW};
+
 DEBUG_FUNCTION void
 debug (const vector_insn_info *info)
 {
@@ -165,6 +180,24 @@ valid_sew_p (size_t sew)
   return exact_log2 (sew) && sew >= 8 && sew <= 64;
 }

+/* Return true if the instruction ignores VLMUL field of VTYPE.  */
+static bool
+ignore_vlmul_insn_p (rtx_insn *rinsn)
+{
+  return get_attr_type (rinsn) == TYPE_VIMOVVX
+         || get_attr_type (rinsn) == TYPE_VFMOVVF
+         || get_attr_type (rinsn) == TYPE_VIMOVXV
+         || get_attr_type (rinsn) == TYPE_VFMOVFV;
+}
+
+/* Return true if the instruction is scalar move instruction.
+   */
+static bool
+scalar_move_insn_p (rtx_insn *rinsn)
+{
+  return get_attr_type (rinsn) == TYPE_VIMOVXV
+         || get_attr_type (rinsn) == TYPE_VFMOVFV;
+}
+
 /* Return true if it is a vsetvl instruction.  */
 static bool
 vector_config_insn_p (rtx_insn *rinsn)
@@ -234,8 +267,7 @@ has_vsetvl_killed_avl_p (const bb_info *bb, const vector_insn_info &info)
     {
       rtx avl = info.get_avl ();
       if (vlmax_avl_p (avl))
-        return find_reg_killed_by (bb, get_vl (info.get_insn ()->rtl ()))
-               != nullptr;
+        return find_reg_killed_by (bb, info.get_avl_reg_rtx ()) != nullptr;
       for (const insn_info *insn : bb->reverse_real_nondebug_insns ())
         {
           def_info *def = find_access (insn->defs (), REGNO (avl));
@@ -288,8 +320,7 @@ anticipatable_occurrence_p (const bb_info *bb, const vector_insn_info dem)
   /* rs1 (avl) are not modified in the basic block prior to the VSETVL.  */
   if (!vlmax_avl_p (dem.get_avl ()))
     {
-      set_info *set
-        = find_access (insn->uses (), REGNO (dem.get_avl ()))->def ();
+      set_info *set = dem.get_avl_source ();
       /* If it's undefined, it's not anticipatable conservatively.
          */
       if (!set)
         return false;
@@ -748,7 +779,7 @@ insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn,
   if (vlmax_avl_p (info.get_avl ()))
     {
       gcc_assert (has_vtype_op (rinsn) || vsetvl_insn_p (rinsn));
-      rtx vl_op = get_vl (rinsn);
+      rtx vl_op = info.get_avl_reg_rtx ();
       gcc_assert (!vlmax_avl_p (vl_op));
       emit_vsetvl_insn (VSETVL_NORMAL, emit_type, info, vl_op, rinsn);
       return;
@@ -890,8 +921,16 @@ get_vl_vtype_info (const insn_info *insn)
 {
   set_info *set = nullptr;
   rtx avl = ::get_avl (insn->rtl ());
-  if (avl && REG_P (avl) && !vlmax_avl_p (avl))
-    set = find_access (insn->uses (), REGNO (avl))->def ();
+  if (avl && REG_P (avl))
+    {
+      if (vlmax_avl_p (avl) && has_vl_op (insn->rtl ()))
+        set
+          = find_access (insn->uses (), REGNO (get_vl (insn->rtl ())))->def ();
+      else if (!vlmax_avl_p (avl))
+        set = find_access (insn->uses (), REGNO (avl))->def ();
+      else
+        set = nullptr;
+    }

   uint8_t sew = get_sew (insn->rtl ());
   enum vlmul_type vlmul = get_vlmul (insn->rtl ());
@@ -1113,6 +1152,391 @@ extract_single_source (set_info *set)
   return first_insn;
 }

+static unsigned
+calculate_sew (vlmul_type vlmul, unsigned int ratio)
+{
+  for (const unsigned sew : ALL_SEW)
+    if (calculate_ratio (sew, vlmul) == ratio)
+      return sew;
+  return 0;
+}
+
+static vlmul_type
+calculate_vlmul (unsigned int sew, unsigned int ratio)
+{
+  for (const vlmul_type vlmul : ALL_LMUL)
+    if (calculate_ratio (sew, vlmul) == ratio)
+      return vlmul;
+  return LMUL_RESERVED;
+}
+
+static bool
+incompatible_avl_p (const vector_insn_info &info1,
+                    const vector_insn_info &info2)
+{
+  return !info1.compatible_avl_p (info2) && !info2.compatible_avl_p (info1);
+}
+
+static bool
+different_sew_p (const vector_insn_info &info1, const vector_insn_info &info2)
+{
+  return info1.get_sew () != info2.get_sew ();
+}
+
+static bool
+different_lmul_p (const vector_insn_info &info1, const vector_insn_info &info2)
+{
+  return info1.get_vlmul () != info2.get_vlmul ();
+}
+
+static bool
+different_ratio_p (const vector_insn_info &info1, const vector_insn_info &info2)
+{
+  return info1.get_ratio () != info2.get_ratio ();
+}
+
+static bool
+different_tail_policy_p (const vector_insn_info &info1,
+                         const vector_insn_info &info2)
+{
+  return info1.get_ta () != info2.get_ta ();
+}
+
+static bool
+different_mask_policy_p (const vector_insn_info &info1,
+                         const vector_insn_info &info2)
+{
+  return info1.get_ma () != info2.get_ma ();
+}
+
+static bool
+possible_zero_avl_p (const vector_insn_info &info1,
+                     const vector_insn_info &info2)
+{
+  return !info1.has_non_zero_avl () || !info2.has_non_zero_avl ();
+}
+
+static bool
+first_ratio_invalid_for_second_sew_p (const vector_insn_info &info1,
+                                      const vector_insn_info &info2)
+{
+  return calculate_vlmul (info2.get_sew (), info1.get_ratio ())
+         == LMUL_RESERVED;
+}
+
+static bool
+first_ratio_invalid_for_second_lmul_p (const vector_insn_info &info1,
+                                       const vector_insn_info &info2)
+{
+  return calculate_sew (info2.get_vlmul (), info1.get_ratio ()) == 0;
+}
+
+static bool
+second_ratio_invalid_for_first_sew_p (const vector_insn_info &info1,
+                                      const vector_insn_info &info2)
+{
+  return calculate_vlmul (info1.get_sew (), info2.get_ratio ())
+         == LMUL_RESERVED;
+}
+
+static bool
+second_ratio_invalid_for_first_lmul_p (const vector_insn_info &info1,
+                                       const vector_insn_info &info2)
+{
+  return calculate_sew (info1.get_vlmul (), info2.get_ratio ()) == 0;
+}
+
+static bool
+second_sew_less_than_first_sew_p (const vector_insn_info &info1,
+                                  const vector_insn_info &info2)
+{
+  return info2.get_sew () < info1.get_sew ();
+}
+
+static bool
+first_sew_less_than_second_sew_p (const vector_insn_info &info1,
+                                  const vector_insn_info &info2)
+{
+  return info1.get_sew () < info2.get_sew ();
+}
+
+/* return 0 if LMUL1 == LMUL2.
+   return -1 if LMUL1 < LMUL2.
+   return 1 if LMUL1 > LMUL2.
*/ +static int +compare_lmul (vlmul_type vlmul1, vlmul_type vlmul2) +{ + if (vlmul1 == vlmul2) + return 0; + + switch (vlmul1) + { + case LMUL_1: + if (vlmul2 == LMUL_2 || vlmul2 == LMUL_4 || vlmul2 == LMUL_8) + return 1; + else + return -1; + case LMUL_2: + if (vlmul2 == LMUL_4 || vlmul2 == LMUL_8) + return 1; + else + return -1; + case LMUL_4: + if (vlmul2 == LMUL_8) + return 1; + else + return -1; + case LMUL_8: + return -1; + case LMUL_F2: + if (vlmul2 == LMUL_1 || vlmul2 == LMUL_2 || vlmul2 == LMUL_4 + || vlmul2 == LMUL_8) + return 1; + else + return -1; + case LMUL_F4: + if (vlmul2 == LMUL_F2 || vlmul2 == LMUL_1 || vlmul2 == LMUL_2 + || vlmul2 == LMUL_4 || vlmul2 == LMUL_8) + return 1; + else + return -1; + case LMUL_F8: + return 0; + default: + gcc_unreachable (); + } +} + +static bool +second_lmul_less_than_first_lmul_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + return compare_lmul (info2.get_vlmul (), info1.get_vlmul ()) == -1; +} + +static bool +first_lmul_less_than_second_lmul_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + return compare_lmul (info1.get_vlmul (), info2.get_vlmul ()) == -1; +} + +static bool +first_ratio_less_than_second_ratio_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + return info1.get_ratio () < info2.get_ratio (); +} + +static bool +second_ratio_less_than_first_ratio_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + return info2.get_ratio () < info1.get_ratio (); +} + +static CONSTEXPR const demands_cond incompatible_conds[] = { +#define DEF_INCOMPATIBLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, \ + GE_SEW1, TAIL_POLICTY1, MASK_POLICY1, AVL2, \ + SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, \ + TAIL_POLICTY2, MASK_POLICY2, COND) \ + {{{AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, TAIL_POLICTY1, \ + MASK_POLICY1}, \ + {AVL2, SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ + MASK_POLICY2}}, \ + COND}, +#include 
"riscv-vsetvl.def" +}; + +static unsigned +greatest_sew (const vector_insn_info &info1, const vector_insn_info &info2) +{ + return std::max (info1.get_sew (), info2.get_sew ()); +} + +static unsigned +first_sew (const vector_insn_info &info1, const vector_insn_info &) +{ + return info1.get_sew (); +} + +static unsigned +second_sew (const vector_insn_info &, const vector_insn_info &info2) +{ + return info2.get_sew (); +} + +static vlmul_type +first_vlmul (const vector_insn_info &info1, const vector_insn_info &) +{ + return info1.get_vlmul (); +} + +static vlmul_type +second_vlmul (const vector_insn_info &, const vector_insn_info &info2) +{ + return info2.get_vlmul (); +} + +static unsigned +first_ratio (const vector_insn_info &info1, const vector_insn_info &) +{ + return info1.get_ratio (); +} + +static unsigned +second_ratio (const vector_insn_info &, const vector_insn_info &info2) +{ + return info2.get_ratio (); +} + +static vlmul_type +vlmul_for_first_sew_second_ratio (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + return calculate_vlmul (info1.get_sew (), info2.get_ratio ()); +} + +static unsigned +ratio_for_second_sew_first_vlmul (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + return calculate_ratio (info2.get_sew (), info1.get_vlmul ()); +} + +static CONSTEXPR const demands_fuse_rule fuse_rules[] = { +#define DEF_SEW_LMUL_FUSE_RULE(DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1, \ + DEMAND_GE_SEW1, DEMAND_SEW2, DEMAND_LMUL2, \ + DEMAND_RATIO2, DEMAND_GE_SEW2, NEW_DEMAND_SEW, \ + NEW_DEMAND_LMUL, NEW_DEMAND_RATIO, \ + NEW_DEMAND_GE_SEW, NEW_SEW, NEW_VLMUL, \ + NEW_RATIO) \ + {{{DEMAND_ANY, DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1, DEMAND_ANY, \ + DEMAND_GE_SEW1, DEMAND_ANY, DEMAND_ANY}, \ + {DEMAND_ANY, DEMAND_SEW2, DEMAND_LMUL2, DEMAND_RATIO2, DEMAND_ANY, \ + DEMAND_GE_SEW2, DEMAND_ANY, DEMAND_ANY}}, \ + NEW_DEMAND_SEW, \ + NEW_DEMAND_LMUL, \ + NEW_DEMAND_RATIO, \ + NEW_DEMAND_GE_SEW, \ + NEW_SEW, \ + NEW_VLMUL, \ + 
NEW_RATIO}, +#include "riscv-vsetvl.def" +}; + +static bool +always_unavailable (const vector_insn_info &, const vector_insn_info &) +{ + return true; +} + +static bool +avl_unavailable_p (const vector_insn_info &info1, const vector_insn_info &info2) +{ + return !info2.compatible_avl_p (info1.get_avl_info ()); +} + +static bool +sew_unavailable_p (const vector_insn_info &info1, const vector_insn_info &info2) +{ + if (!info2.demand_p (DEMAND_LMUL) && !info2.demand_p (DEMAND_RATIO)) + { + if (info2.demand_p (DEMAND_GE_SEW)) + return info1.get_sew () < info2.get_sew (); + return info1.get_sew () != info2.get_sew (); + } + return true; +} + +static bool +lmul_unavailable_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + if (info1.get_vlmul () == info2.get_vlmul () && !info2.demand_p (DEMAND_SEW) + && !info2.demand_p (DEMAND_RATIO)) + return false; + return true; +} + +static bool +ge_sew_unavailable_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + if (!info2.demand_p (DEMAND_LMUL) && !info2.demand_p (DEMAND_RATIO) + && info2.demand_p (DEMAND_GE_SEW)) + return info1.get_sew () < info2.get_sew (); + return true; +} + +static bool +ge_sew_lmul_unavailable_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + if (!info2.demand_p (DEMAND_RATIO) && info2.demand_p (DEMAND_GE_SEW)) + return info1.get_sew () < info2.get_sew (); + return true; +} + +static bool +ge_sew_ratio_unavailable_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + if (!info2.demand_p (DEMAND_LMUL) && info2.demand_p (DEMAND_GE_SEW)) + return info1.get_sew () < info2.get_sew (); + return true; +} + +static CONSTEXPR const demands_cond unavailable_conds[] = { +#define DEF_UNAVAILABLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, \ + TAIL_POLICTY1, MASK_POLICY1, AVL2, SEW2, LMUL2, \ + RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ + MASK_POLICY2, COND) \ + {{{AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, 
TAIL_POLICTY1, \ + MASK_POLICY1}, \ + {AVL2, SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ + MASK_POLICY2}}, \ + COND}, +#include "riscv-vsetvl.def" +}; + +static bool +same_sew_lmul_demand_p (const bool *dems1, const bool *dems2) +{ + return dems1[DEMAND_SEW] == dems2[DEMAND_SEW] + && dems1[DEMAND_LMUL] == dems2[DEMAND_LMUL] + && dems1[DEMAND_RATIO] == dems2[DEMAND_RATIO] && !dems1[DEMAND_GE_SEW] + && !dems2[DEMAND_GE_SEW]; +} + +static bool +propagate_avl_across_demands_p (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + if (info2.demand_p (DEMAND_AVL)) + { + if (info2.demand_p (DEMAND_NONZERO_AVL)) + return info1.demand_p (DEMAND_AVL) + && !info1.demand_p (DEMAND_NONZERO_AVL) && info1.has_avl_reg (); + } + else + return info1.demand_p (DEMAND_AVL) && info1.has_avl_reg (); + return false; +} + +static bool +reg_available_p (const bb_info *bb, const vector_insn_info &info) +{ + if (!info.get_avl_source ()) + return true; + insn_info *insn = info.get_avl_source ()->insn (); + if (insn->bb () == bb) + return before_p (insn, info.get_insn ()); + else + return dominated_by_p (CDI_DOMINATORS, bb->cfg_bb (), + insn->bb ()->cfg_bb ()); +} + avl_info::avl_info (const avl_info &other) { m_value = other.get_value (); @@ -1251,6 +1675,16 @@ avl_info::operator!= (const avl_info &other) const return !(*this == other); } +bool +avl_info::has_non_zero_avl () const +{ + if (has_avl_imm ()) + return INTVAL (get_value ()) > 0; + if (has_avl_reg ()) + return vlmax_avl_p (get_value ()); + return false; +} + /* Initialize VL/VTYPE information. 
*/ vl_vtype_info::vl_vtype_info (avl_info avl_in, uint8_t sew_in, enum vlmul_type vlmul_in, uint8_t ratio_in, @@ -1275,16 +1709,6 @@ vl_vtype_info::operator!= (const vl_vtype_info &other) const return !(*this == other); } -bool -vl_vtype_info::has_non_zero_avl () const -{ - if (has_avl_imm ()) - return INTVAL (get_avl ()) > 0; - if (has_avl_reg ()) - return vlmax_avl_p (get_avl ()); - return false; -} - bool vl_vtype_info::same_avl_p (const vl_vtype_info &other) const { @@ -1335,28 +1759,10 @@ vector_insn_info::operator>= (const vector_insn_info &other) const if (!compatible_p (other)) return false; - if (!demand_p (DEMAND_AVL) && other.demand_p (DEMAND_AVL)) - return false; - - if (same_vlmax_p (other)) - { - if (demand_p (DEMAND_RATIO) && !other.demand_p (DEMAND_RATIO) - && (get_sew () != other.get_sew () - || get_vlmul () != other.get_vlmul ())) - return false; - - if (get_sew () == other.get_sew () && get_vlmul () == other.get_vlmul ()) - { - if (demand_p (DEMAND_RATIO) && !other.demand_p (DEMAND_RATIO)) - return false; - } - } - - if (!demand_p (DEMAND_TAIL_POLICY) && other.demand_p (DEMAND_TAIL_POLICY)) - return false; - - if (!demand_p (DEMAND_MASK_POLICY) && other.demand_p (DEMAND_MASK_POLICY)) - return false; + for (const auto &cond : unavailable_conds) + if (cond.pair.match_cond_p (this->get_demands (), other.get_demands ()) + && cond.incompatible_p (*this, other)) + return false; return true; } @@ -1467,7 +1873,8 @@ vector_insn_info::parse_insn (insn_info *insn) demand SEW && LMUL both. Some instructions may demand SEW only and ignore LMUL, will fix it later. 
*/ m_demands[DEMAND_SEW] = true; - m_demands[DEMAND_LMUL] = true; + if (!ignore_vlmul_insn_p (insn->rtl ())) + m_demands[DEMAND_LMUL] = true; } if (get_attr_ta (insn->rtl ()) != INVALID_ATTRIBUTE) @@ -1478,32 +1885,34 @@ vector_insn_info::parse_insn (insn_info *insn) if (vector_config_insn_p (insn->rtl ())) return; - if (!has_avl_reg () || !m_avl.get_source () - || !m_avl.get_source ()->insn ()->is_phi ()) + if (scalar_move_insn_p (insn->rtl ())) + { + if (m_avl.has_non_zero_avl ()) + m_demands[DEMAND_NONZERO_AVL] = true; + if (m_ta) + m_demands[DEMAND_GE_SEW] = true; + } + + if (!m_avl.has_avl_reg () || vlmax_avl_p (get_avl ()) || !m_avl.get_source ()) + return; + if (!m_avl.get_source ()->insn ()->is_real () + && !m_avl.get_source ()->insn ()->is_phi ()) return; insn_info *def_insn = extract_single_source (m_avl.get_source ()); - if (def_insn) - { - vector_insn_info new_info; - new_info.parse_insn (def_insn); - if (!same_vlmax_p (new_info)) - return; - /* TODO: Currently, we don't forward AVL for non-VLMAX vsetvl. */ - if (vlmax_avl_p (new_info.get_avl ())) - set_avl_info (new_info.get_avl_info ()); - } -} + if (!def_insn || !vsetvl_insn_p (def_insn->rtl ())) + return; -void -vector_insn_info::demand_vl_vtype () -{ - m_state = VALID; - m_demands[DEMAND_AVL] = true; - m_demands[DEMAND_SEW] = true; - m_demands[DEMAND_LMUL] = true; - m_demands[DEMAND_TAIL_POLICY] = true; - m_demands[DEMAND_MASK_POLICY] = true; + vector_insn_info new_info; + new_info.parse_insn (def_insn); + if (!same_vlmax_p (new_info) && !scalar_move_insn_p (insn->rtl ())) + return; + /* TODO: Currently, we don't forward AVL for non-VLMAX vsetvl. 
*/ + if (vlmax_avl_p (new_info.get_avl ())) + set_avl_info (avl_info (new_info.get_avl (), get_avl_source ())); + + if (scalar_move_insn_p (insn->rtl ()) && m_avl.has_non_zero_avl ()) + m_demands[DEMAND_NONZERO_AVL] = true; } bool @@ -1512,37 +1921,10 @@ vector_insn_info::compatible_p (const vector_insn_info &other) const gcc_assert (valid_or_dirty_p () && other.valid_or_dirty_p () && "Can't compare invalid demanded infos"); - /* Check SEW. */ - if (demand_p (DEMAND_SEW) && other.demand_p (DEMAND_SEW) - && get_sew () != other.get_sew ()) - return false; - - /* Check LMUL. */ - if (demand_p (DEMAND_LMUL) && other.demand_p (DEMAND_LMUL) - && get_vlmul () != other.get_vlmul ()) - return false; - - /* Check RATIO. */ - if (demand_p (DEMAND_RATIO) && other.demand_p (DEMAND_RATIO) - && get_ratio () != other.get_ratio ()) - return false; - if (demand_p (DEMAND_RATIO) && (other.get_sew () || other.get_vlmul ()) - && get_ratio () != other.get_ratio ()) - return false; - if (other.demand_p (DEMAND_RATIO) && (get_sew () || get_vlmul ()) - && get_ratio () != other.get_ratio ()) - return false; - - if (demand_p (DEMAND_TAIL_POLICY) && other.demand_p (DEMAND_TAIL_POLICY) - && get_ta () != other.get_ta ()) - return false; - if (demand_p (DEMAND_MASK_POLICY) && other.demand_p (DEMAND_MASK_POLICY) - && get_ma () != other.get_ma ()) - return false; - - if (demand_p (DEMAND_AVL) && other.demand_p (DEMAND_AVL)) - return compatible_avl_p (other); - + for (const auto &cond : incompatible_conds) + if (cond.pair.match_cond_p (this->get_demands (), other.get_demands ()) + && cond.incompatible_p (*this, other)) + return false; return true; } @@ -1553,6 +1935,8 @@ vector_insn_info::compatible_avl_p (const vl_vtype_info &other) const gcc_assert (!unknown_p () && "Can't compare AVL in unknown state"); if (!demand_p (DEMAND_AVL)) return true; + if (demand_p (DEMAND_NONZERO_AVL) && other.has_non_zero_avl ()) + return true; return get_avl_info () == other.get_avl_info (); } @@ -1562,6 +1946,10 @@ 
vector_insn_info::compatible_avl_p (const avl_info &other) const gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); gcc_assert (!unknown_p () && "Can't compare AVL in unknown state"); gcc_assert (demand_p (DEMAND_AVL) && "Can't compare AVL undemand state"); + if (!demand_p (DEMAND_AVL)) + return true; + if (demand_p (DEMAND_NONZERO_AVL) && other.has_non_zero_avl ()) + return true; return get_avl_info () == other; } @@ -1570,8 +1958,13 @@ vector_insn_info::compatible_vtype_p (const vl_vtype_info &other) const { gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); gcc_assert (!unknown_p () && "Can't compare VTYPE in unknown state"); - if (demand_p (DEMAND_SEW) && m_sew != other.get_sew ()) - return false; + if (demand_p (DEMAND_SEW)) + { + if (!demand_p (DEMAND_GE_SEW) && m_sew != other.get_sew ()) + return false; + if (demand_p (DEMAND_GE_SEW) && m_sew > other.get_sew ()) + return false; + } if (demand_p (DEMAND_LMUL) && m_vlmul != other.get_vlmul ()) return false; if (demand_p (DEMAND_RATIO) && m_ratio != other.get_ratio ()) @@ -1609,114 +2002,155 @@ vector_insn_info::compatible_p (const vl_vtype_info &curr_info) const bool vector_insn_info::available_p (const vector_insn_info &other) const { - if (*this >= other) - return true; - return false; + return *this >= other; } -vector_insn_info -vector_insn_info::merge (const vector_insn_info &merge_info, - enum merge_type type = LOCAL_MERGE) const +void +vector_insn_info::fuse_avl (const vector_insn_info &info1, + const vector_insn_info &info2) { - if (!vsetvl_insn_p (get_insn ()->rtl ())) - gcc_assert (this->compatible_p (merge_info) - && "Can't merge incompatible demanded infos"); + set_insn (info1.get_insn ()); + if (info1.demand_p (DEMAND_AVL)) + { + if (info1.demand_p (DEMAND_NONZERO_AVL)) + { + if (info2.demand_p (DEMAND_AVL) + && !info2.demand_p (DEMAND_NONZERO_AVL)) + { + set_avl_info (info2.get_avl_info ()); + set_demand (DEMAND_AVL, true); + set_demand 
(DEMAND_NONZERO_AVL, false); + return; + } + } + set_avl_info (info1.get_avl_info ()); + set_demand (DEMAND_NONZERO_AVL, info1.demand_p (DEMAND_NONZERO_AVL)); + } + else + { + set_avl_info (info2.get_avl_info ()); + set_demand (DEMAND_NONZERO_AVL, info2.demand_p (DEMAND_NONZERO_AVL)); + } + set_demand (DEMAND_AVL, + info1.demand_p (DEMAND_AVL) || info2.demand_p (DEMAND_AVL)); +} - vector_insn_info new_info; - new_info.demand_vl_vtype (); +void +vector_insn_info::fuse_sew_lmul (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + /* We need to fuse sew && lmul according to demand info: - if (type == LOCAL_MERGE) + 1. GE_SEW. + 2. SEW. + 3. LMUL. + 4. RATIO. */ + if (same_sew_lmul_demand_p (info1.get_demands (), info2.get_demands ())) { - /* For local backward data flow, we always update INSN && AVL as the - latest INSN and AVL so that we can keep track status of each INSN.*/ - new_info.set_insn (merge_info.get_insn ()); - if (merge_info.demand_p (DEMAND_AVL)) - new_info.set_avl_info (merge_info.get_avl_info ()); - else if (demand_p (DEMAND_AVL)) - new_info.set_avl_info (get_avl_info ()); + set_demand (DEMAND_SEW, info2.demand_p (DEMAND_SEW)); + set_demand (DEMAND_LMUL, info2.demand_p (DEMAND_LMUL)); + set_demand (DEMAND_RATIO, info2.demand_p (DEMAND_RATIO)); + set_demand (DEMAND_GE_SEW, info2.demand_p (DEMAND_GE_SEW)); + set_sew (info2.get_sew ()); + set_vlmul (info2.get_vlmul ()); + set_ratio (info2.get_ratio ()); + return; } - else + for (const auto &rule : fuse_rules) { - /* For global data flow, we should keep original INSN and AVL if they - valid since we should keep the life information of each block. - - For example: - bb 0 -> bb 1. - We should keep INSN && AVL of bb 1 since we will eventually emit - vsetvl instruction according to INSN and AVL of bb 1. 
*/ - new_info.set_insn (get_insn ()); - if (demand_p (DEMAND_AVL)) - new_info.set_avl_info (get_avl_info ()); - else if (merge_info.demand_p (DEMAND_AVL)) - new_info.set_avl_info (merge_info.get_avl_info ()); - } - - if (!demand_p (DEMAND_AVL) && !merge_info.demand_p (DEMAND_AVL)) - new_info.undemand (DEMAND_AVL); - if (!demand_p (DEMAND_SEW) && !merge_info.demand_p (DEMAND_SEW)) - new_info.undemand (DEMAND_SEW); - if (!demand_p (DEMAND_LMUL) && !merge_info.demand_p (DEMAND_LMUL)) - new_info.undemand (DEMAND_LMUL); - - if (!demand_p (DEMAND_TAIL_POLICY) - && !merge_info.demand_p (DEMAND_TAIL_POLICY)) - new_info.undemand (DEMAND_TAIL_POLICY); - if (!demand_p (DEMAND_MASK_POLICY) - && !merge_info.demand_p (DEMAND_MASK_POLICY)) - new_info.undemand (DEMAND_MASK_POLICY); - - if (merge_info.demand_p (DEMAND_SEW)) - new_info.set_sew (merge_info.get_sew ()); - else if (demand_p (DEMAND_SEW)) - new_info.set_sew (get_sew ()); - - if (merge_info.demand_p (DEMAND_LMUL)) - new_info.set_vlmul (merge_info.get_vlmul ()); - else if (demand_p (DEMAND_LMUL)) - new_info.set_vlmul (get_vlmul ()); - - if (!new_info.demand_p (DEMAND_SEW) && !new_info.demand_p (DEMAND_LMUL)) - { - if (demand_p (DEMAND_RATIO) || merge_info.demand_p (DEMAND_RATIO)) - new_info.demand (DEMAND_RATIO); - /* Even though we don't demand_p SEW && VLMUL in this case, we still - * need them. 
*/ - if (merge_info.demand_p (DEMAND_RATIO)) + if (rule.pair.match_cond_p (info1.get_demands (), info2.get_demands ())) { - new_info.set_sew (merge_info.get_sew ()); - new_info.set_vlmul (merge_info.get_vlmul ()); - new_info.set_ratio (merge_info.get_ratio ()); + set_demand (DEMAND_SEW, rule.demand_sew_p); + set_demand (DEMAND_LMUL, rule.demand_lmul_p); + set_demand (DEMAND_RATIO, rule.demand_ratio_p); + set_demand (DEMAND_GE_SEW, rule.demand_ge_sew_p); + set_sew (rule.new_sew (info1, info2)); + set_vlmul (rule.new_vlmul (info1, info2)); + set_ratio (rule.new_ratio (info1, info2)); + return; } - else if (demand_p (DEMAND_RATIO)) + if (rule.pair.match_cond_p (info2.get_demands (), info1.get_demands ())) { - new_info.set_sew (get_sew ()); - new_info.set_vlmul (get_vlmul ()); - new_info.set_ratio (get_ratio ()); + set_demand (DEMAND_SEW, rule.demand_sew_p); + set_demand (DEMAND_LMUL, rule.demand_lmul_p); + set_demand (DEMAND_RATIO, rule.demand_ratio_p); + set_demand (DEMAND_GE_SEW, rule.demand_ge_sew_p); + set_sew (rule.new_sew (info2, info1)); + set_vlmul (rule.new_vlmul (info2, info1)); + set_ratio (rule.new_ratio (info2, info1)); + return; } } - else + gcc_unreachable (); +} + +void +vector_insn_info::fuse_tail_policy (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + if (info1.demand_p (DEMAND_TAIL_POLICY)) + { + set_ta (info1.get_ta ()); + demand (DEMAND_TAIL_POLICY); + } + else if (info2.demand_p (DEMAND_TAIL_POLICY)) { - /* when get_attr_ratio is invalid, this kind of instructions - doesn't care about ratio. However, we still need this value - in demand_p info backward analysis. 
*/ - new_info.set_ratio ( - calculate_ratio (new_info.get_sew (), new_info.get_vlmul ())); + set_ta (info2.get_ta ()); + demand (DEMAND_TAIL_POLICY); } + else + set_ta (get_default_ta ()); +} - if (merge_info.demand_p (DEMAND_TAIL_POLICY)) - new_info.set_ta (merge_info.get_ta ()); - else if (demand_p (DEMAND_TAIL_POLICY)) - new_info.set_ta (get_ta ()); +void +vector_insn_info::fuse_mask_policy (const vector_insn_info &info1, + const vector_insn_info &info2) +{ + if (info1.demand_p (DEMAND_MASK_POLICY)) + { + set_ma (info1.get_ma ()); + demand (DEMAND_MASK_POLICY); + } + else if (info2.demand_p (DEMAND_MASK_POLICY)) + { + set_ma (info2.get_ma ()); + demand (DEMAND_MASK_POLICY); + } else - new_info.set_ta (get_default_ta ()); + set_ma (get_default_ma ()); +} - if (merge_info.demand_p (DEMAND_MASK_POLICY)) - new_info.set_ma (merge_info.get_ma ()); - else if (demand_p (DEMAND_MASK_POLICY)) - new_info.set_ma (get_ma ()); +vector_insn_info +vector_insn_info::merge (const vector_insn_info &merge_info, + enum merge_type type = LOCAL_MERGE) const +{ + if (!vsetvl_insn_p (get_insn ()->rtl ())) + gcc_assert (this->compatible_p (merge_info) + && "Can't merge incompatible demanded infos"); + + vector_insn_info new_info; + new_info.set_valid (); + if (type == LOCAL_MERGE) + { + /* For local backward data flow, we always update INSN && AVL as the + latest INSN and AVL so that we can keep track status of each INSN. */ + new_info.fuse_avl (merge_info, *this); + } else - new_info.set_ma (get_default_ma ()); + { + /* For global data flow, we should keep original INSN and AVL if they + valid since we should keep the life information of each block. + For example: + bb 0 -> bb 1. + We should keep INSN && AVL of bb 1 since we will eventually emit + vsetvl instruction according to INSN and AVL of bb 1. 
*/ + new_info.fuse_avl (*this, merge_info); + } + + new_info.fuse_sew_lmul (*this, merge_info); + new_info.fuse_tail_policy (*this, merge_info); + new_info.fuse_mask_policy (*this, merge_info); return new_info; } @@ -1740,7 +2174,9 @@ vector_insn_info::dump (FILE *file) const fprintf (file, "DIRTY,"); fprintf (file, "Demand field={%d(VL),", demand_p (DEMAND_AVL)); + fprintf (file, "%d(DEMAND_NONZERO_AVL),", demand_p (DEMAND_NONZERO_AVL)); fprintf (file, "%d(SEW),", demand_p (DEMAND_SEW)); + fprintf (file, "%d(DEMAND_GE_SEW),", demand_p (DEMAND_GE_SEW)); fprintf (file, "%d(LMUL),", demand_p (DEMAND_LMUL)); fprintf (file, "%d(RATIO),", demand_p (DEMAND_RATIO)); fprintf (file, "%d(TAIL_POLICY),", demand_p (DEMAND_TAIL_POLICY)); @@ -1895,6 +2331,8 @@ vector_infos_manager::release (void) if (!vector_exprs.is_empty ()) vector_exprs.release (); + gcc_assert (to_refine_vsetvls.is_empty ()); + gcc_assert (to_delete_vsetvls.is_empty ()); if (optimize > 0) free_bitmap_vectors (); } @@ -2167,8 +2605,13 @@ pass_vsetvl::compute_local_backward_infos (const bb_info *bb) else { gcc_assert (info.valid_p () && "Unexpected Invalid demanded info"); - if (change.valid_p () && change.compatible_p (info)) - info = change.merge (info); + if (change.valid_p ()) + { + if (!(propagate_avl_across_demands_p (info, change) + && !reg_available_p (bb, info)) + && change.compatible_p (info)) + info = change.merge (info); + } change = info; } } @@ -2282,7 +2725,7 @@ pass_vsetvl::get_backward_fusion_type (const bb_info *bb, rtx reg = NULL_RTX; /* Case 1: Don't need VL. Just let it backward propagate. */ - if (!has_vl_op (insn->rtl ())) + if (!prop.demand_p (DEMAND_AVL)) return VALID_AVL_FUSION; else { @@ -2296,16 +2739,16 @@ pass_vsetvl::get_backward_fusion_type (const bb_info *bb, gcc_assert (prop.has_avl_reg ()); if (vlmax_avl_p (prop.get_avl ())) /* Check VL operand for vsetvl vl,zero. */ - reg = get_vl (insn->rtl ()); + reg = prop.get_avl_reg_rtx (); else /* Check AVL operand for vsetvl zero,avl. 
*/ - reg = get_avl (insn->rtl ()); + reg = prop.get_avl (); } } gcc_assert (reg); - def_info *def = find_access (insn->uses (), REGNO (reg))->def (); - if (!def->insn ()->is_phi () && def->insn ()->bb () == insn->bb ()) + if (!prop.get_avl_source ()->insn ()->is_phi () + && prop.get_avl_source ()->insn ()->bb () == insn->bb ()) return INVALID_FUSION; hash_set<set_info *> sets = get_all_sets (prop.get_avl_source (), true, true, true); @@ -2340,10 +2783,8 @@ pass_vsetvl::hard_empty_block_p (const bb_info *bb, basic_block cfg_bb = bb->cfg_bb (); sbitmap avin = m_vector_manager->vector_avin[cfg_bb->index]; - rtx avl = vlmax_avl_p (info.get_avl ()) ? get_vl (info.get_insn ()->rtl ()) - : get_avl (info.get_insn ()->rtl ()); - insn_info *insn = info.get_insn (); - set_info *set = find_access (insn->uses (), REGNO (avl))->def (); + set_info *set = info.get_avl_source (); + rtx avl = gen_rtx_REG (Pmode, set->regno ()); hash_set<set_info *> sets = get_all_sets (set, true, false, false); hash_set<basic_block> pred_cfg_bbs = get_all_predecessors (cfg_bb); @@ -2635,7 +3076,7 @@ pass_vsetvl::backward_demand_fusion (void) if (block_info.reaching_out.compatible_p (prop)) { - if (block_info.reaching_out >= prop) + if (block_info.reaching_out.available_p (prop)) continue; new_info = block_info.reaching_out.merge (prop, GLOBAL_MERGE); new_info.set_dirty ( @@ -2659,6 +3100,14 @@ pass_vsetvl::backward_demand_fusion (void) continue; } + if (propagate_avl_across_demands_p (prop, + block_info.reaching_out)) + { + rtx reg = new_info.get_avl_reg_rtx (); + if (find_reg_killed_by (crtl->ssa->bb (e->src), reg)) + new_info.set_dirty (true); + } + block_info.local_dem = new_info; block_info.reaching_out = new_info; changed_p = true; @@ -2683,13 +3132,11 @@ pass_vsetvl::backward_demand_fusion (void) if (set->insn () != block_info.reaching_out.get_insn ()) continue; } - else - { - if (!block_info.reaching_out.compatible_p (prop)) - continue; - if (block_info.reaching_out >= prop) - continue; - } + + if
(!block_info.reaching_out.compatible_p (prop)) + continue; + if (block_info.reaching_out.available_p (prop)) + continue; vector_insn_info be_merged = block_info.reaching_out; if (block_info.local_dem == block_info.reaching_out) @@ -2699,6 +3146,10 @@ pass_vsetvl::backward_demand_fusion (void) if (curr_block_info.probability > block_info.probability) block_info.probability = curr_block_info.probability; + if (propagate_avl_across_demands_p (prop, block_info.reaching_out) + && !reg_available_p (crtl->ssa->bb (e->src), new_info)) + continue; + change_vsetvl_insn (new_info.get_insn (), new_info); if (block_info.local_dem == block_info.reaching_out) block_info.local_dem = new_info; @@ -2745,6 +3196,9 @@ pass_vsetvl::forward_demand_fusion (void) if (cfg_bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)) continue; + if (vsetvl_insn_p (prop.get_insn ()->rtl ())) + continue; + edge e; edge_iterator ei; /* Forward propagate to each successor. */ @@ -2767,10 +3221,12 @@ pass_vsetvl::forward_demand_fusion (void) /* If there is nothing to propagate, just skip it. */ if (!local_dem.valid_or_dirty_p ()) continue; - if (local_dem >= prop) + if (local_dem.available_p (prop)) continue; if (!local_dem.compatible_p (prop)) continue; + if (propagate_avl_across_demands_p (prop, local_dem)) + continue; vector_insn_info new_info = local_dem.merge (prop, GLOBAL_MERGE); new_info.set_insn (local_dem.get_insn ()); @@ -3156,8 +3612,14 @@ pass_vsetvl::refine_vsetvls (void) const if (!can_refine_vsetvl_p (cfg_bb, info)) continue; - if (!vector_config_insn_p (rinsn)) - rinsn = PREV_INSN (rinsn); + /* We can't refine user vsetvl into vsetvl zero,zero since the dest + will be used by the following instructions. 
*/ + if (vector_config_insn_p (rinsn)) + { + m_vector_manager->to_refine_vsetvls.add (rinsn); + continue; + } + rinsn = PREV_INSN (rinsn); rtx new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, info, NULL_RTX); change_insn (rinsn, new_pat); } @@ -3189,15 +3651,17 @@ pass_vsetvl::cleanup_vsetvls () insn_info *insn = dem.get_insn (); gcc_assert (insn && insn->rtl ()); rtx_insn *rinsn; + /* We can't eliminate user vsetvl since the dest will be used + * by the following instructions. */ if (vector_config_insn_p (insn->rtl ())) - rinsn = insn->rtl (); - else { - gcc_assert (has_vtype_op (insn->rtl ())); - rinsn = PREV_INSN (insn->rtl ()); - gcc_assert ( - vector_config_insn_p (PREV_INSN (insn->rtl ()))); + m_vector_manager->to_delete_vsetvls.add (insn->rtl ()); + continue; } + + gcc_assert (has_vtype_op (insn->rtl ())); + rinsn = PREV_INSN (insn->rtl ()); + gcc_assert (vector_config_insn_p (PREV_INSN (insn->rtl ()))); eliminate_insn (rinsn); } } @@ -3265,7 +3729,8 @@ pass_vsetvl::commit_vsetvls (void) bool available_p = false; EXECUTE_IF_SET_IN_BITMAP (avin, 0, bb_index, sbi) { - if (*m_vector_manager->vector_exprs[bb_index] >= reaching_out) + if (m_vector_manager->vector_exprs[bb_index]->available_p ( + reaching_out)) { available_p = true; break; @@ -3276,12 +3741,18 @@ pass_vsetvl::commit_vsetvls (void) } rtx new_pat; - if (can_refine_vsetvl_p (cfg_bb, reaching_out)) + if (!reaching_out.demand_p (DEMAND_AVL)) + { + vl_vtype_info new_info = reaching_out; + new_info.set_avl_info (avl_info (const0_rtx, nullptr)); + new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, new_info, NULL_RTX); + } + else if (can_refine_vsetvl_p (cfg_bb, reaching_out)) new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, reaching_out, NULL_RTX); else if (vlmax_avl_p (reaching_out.get_avl ())) new_pat = gen_vsetvl_pat (VSETVL_NORMAL, reaching_out, - get_vl (reaching_out.get_insn ()->rtl ())); + reaching_out.get_avl_reg_rtx ()); else new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, reaching_out, 
NULL_RTX); @@ -3459,12 +3930,36 @@ pass_vsetvl::propagate_avl (void) const { rtx vl = get_vl (insn->rtl ()); rtx avl = get_avl (insn->rtl ()); - if (vlmax_avl_p (avl)) - continue; def_info *def = find_access (insn->defs (), REGNO (vl)); set_info *set = safe_dyn_cast<set_info *> (def); + vector_insn_info info; + info.parse_insn (insn); gcc_assert (set); - const vl_vtype_info info = get_vl_vtype_info (insn); + if (m_vector_manager->to_delete_vsetvls.contains (insn->rtl ())) + { + m_vector_manager->to_delete_vsetvls.remove (insn->rtl ()); + if (m_vector_manager->to_refine_vsetvls.contains ( + insn->rtl ())) + m_vector_manager->to_refine_vsetvls.remove (insn->rtl ()); + if (!set->has_nondebug_insn_uses ()) + { + to_delete.add (insn->rtl ()); + continue; + } + } + if (m_vector_manager->to_refine_vsetvls.contains (insn->rtl ())) + { + m_vector_manager->to_refine_vsetvls.remove (insn->rtl ()); + if (!set->has_nondebug_insn_uses ()) + { + rtx new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, + info, NULL_RTX); + change_insn (insn->rtl (), new_pat); + continue; + } + } + if (vlmax_avl_p (avl)) + continue; rtx new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, info, NULL_RTX); if (!set->has_nondebug_insn_uses ()) diff --git a/gcc/config/riscv/riscv-vsetvl.def b/gcc/config/riscv/riscv-vsetvl.def new file mode 100644 index 00000000000..e3b494f99be --- /dev/null +++ b/gcc/config/riscv/riscv-vsetvl.def @@ -0,0 +1,684 @@ +/* VSETVL pass def for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2023-2023 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or(at your option) +any later version.
+ +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY, WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#ifndef DEF_INCOMPATIBLE_COND +#define DEF_INCOMPATIBLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, \ + GE_SEW1, TAIL_POLICTY1, MASK_POLICY1, AVL2, \ + SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, \ + TAIL_POLICTY2, MASK_POLICY2, COND) +#endif + +#ifndef DEF_SEW_LMUL_FUSE_RULE +#define DEF_SEW_LMUL_FUSE_RULE(DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1, \ + DEMAND_GE_SEW1, DEMAND_SEW2, DEMAND_LMUL2, \ + DEMAND_RATIO2, DEMAND_GE_SEW2, NEW_DEMAND_SEW, \ + NEW_DEMAND_LMUL, NEW_DEMAND_RATIO, \ + NEW_DEMAND_GE_SEW, NEW_SEW, NEW_VLMUL, \ + NEW_RATIO) +#endif + +#ifndef DEF_UNAVAILABLE_COND +#define DEF_UNAVAILABLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, \ + TAIL_POLICTY1, MASK_POLICY1, AVL2, SEW2, LMUL2, \ + RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ + MASK_POLICY2, COND) +#endif + +/* Case 1: Demand compatible AVL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ incompatible_avl_p) + +/* Case 2: Demand same SEW.
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_sew_p) + +/* Case 3: Demand same LMUL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_lmul_p) + +/* Case 4: Demand same RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_ratio_p) + +/* Case 5: Demand same TAIL_POLICY. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_TRUE, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_TRUE, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_tail_policy_p) + +/* Case 6: Demand same MASK_POLICY. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_TRUE, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_TRUE, + /*COND*/ different_mask_policy_p) + +/* Case 7: Demand non zero AVL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_TRUE, /*GE_SEW*/ DEMAND_ANY, + DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_ANY, + DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ possible_zero_avl_p) +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_TRUE, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ possible_zero_avl_p) + +/* Case 8: First SEW/LMUL/GE_SEW <-> Second RATIO/SEW. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_ratio_invalid_for_first_sew_p) +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_ratio_invalid_for_first_lmul_p) +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_sew_less_than_first_sew_p) + +/* Case 9: Second SEW/LMUL/GE_SEW <-> First RATIO/SEW. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_ratio_invalid_for_second_sew_p) +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_ratio_invalid_for_second_lmul_p) +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_sew_less_than_second_sew_p) + +/* Case 10: First (GE_SEW + LMUL) <-> Second RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_ratio_less_than_first_ratio_p) +/* Case 11: First (SEW + LMUL) <-> Second RATIO. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_ratio_p) +/* Case 13: First (GE_SEW/SEW + RATIO) <-> Second LMUL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_lmul_p) +/* Case 14: First (LMUL + RATIO) <-> Second SEW. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_sew_p) +/* Case 15: First (LMUL + RATIO) <-> Second GE_SEW. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_sew_less_than_second_sew_p) + +/* Case 16: Second (GE_SEW + LMUL) <-> First RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_ratio_less_than_second_ratio_p) +/* Case 17: Second (SEW + LMUL) <-> First RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_ratio_p) +/* Case 18: Second (GE_SEW/SEW + RATIO) <-> First LMUL. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_lmul_p) +/* Case 19: Second (LMUL + RATIO) <-> First SEW. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_sew_p) +/* Case 20: Second (LMUL + RATIO) <-> First GE_SEW. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_sew_less_than_first_sew_p) + +/* Case 18: First SEW + Second LMUL <-> First RATIO. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_lmul_p) +/* Case 19: First SEW + Second LMUL <-> Second RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_sew_p) +/* Case 20: Second SEW + First LMUL <-> First RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_sew_p) +/* Case 21: Second SEW + First LMUL <-> Second RATIO. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_lmul_p) + +/* Case 22: First SEW + Second RATIO <-> First LMUL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_ratio_p) +/* Case 23: Second SEW + First RATIO <-> Second LMUL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_ratio_p) + +/* Case 24: First GE_SEW + Second LMUL <-> First RATIO. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_lmul_less_than_first_lmul_p) +/* Case 25: First GE_SEW + Second LMUL <-> Second RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_sew_less_than_first_sew_p) +/* Case 26: Second GE_SEW + First LMUL <-> First RATIO. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_sew_less_than_second_sew_p) +/* Case 27: Second GE_SEW + First LMUL <-> Second RATIO. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_lmul_less_than_second_lmul_p) + +/* Case 28: First GE_SEW + Second RATIO <-> First LMUL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ second_ratio_less_than_first_ratio_p) +/* Case 29: Second GE_SEW + First RATIO <-> Second LMUL. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ first_ratio_less_than_second_ratio_p) + +/* Case 31: First GE_SEW + Second SEW + First LMUL + Second ratio. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_lmul_p) + +/* Case 32: First GE_SEW + Second SEW + Second LMUL + First ratio. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_ratio_p) + +/* Case 33: Second GE_SEW + First SEW + First LMUL + Second ratio. */ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_ratio_p) + +/* Case 34: Second GE_SEW + First SEW + Second LMUL + First ratio. 
*/ +DEF_INCOMPATIBLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ different_lmul_p) + +/* Merge rules. */ +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_TRUE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ false, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ true, greatest_sew, first_vlmul, + first_ratio) + +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_ANY, + /*RATIO*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*SEW*/ DEMAND_ANY, /*LMUL*/ DEMAND_ANY, + /*RATIO*/ DEMAND_TRUE, /*GE_SEW*/ DEMAND_ANY, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ true, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ false, first_sew, + vlmul_for_first_sew_second_ratio, second_ratio) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_ANY, /*LMUL*/ DEMAND_TRUE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_ANY, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_ANY, + /*RATIO*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ true, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ false, second_sew, first_vlmul, + ratio_for_second_sew_first_vlmul) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_FALSE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_TRUE, /*GE_SEW*/ DEMAND_FALSE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ false, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ true, first_sew, + 
vlmul_for_first_sew_second_ratio, second_ratio) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_TRUE, /*GE_SEW*/ DEMAND_TRUE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ false, + /*NEW_DEMAND_RATIO*/ true, + /*NEW_DEMAND_GE_SEW*/ true, greatest_sew, first_vlmul, + second_ratio) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_FALSE, /*LMUL*/ DEMAND_TRUE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_FALSE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ true, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ true, first_sew, second_vlmul, + second_ratio) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_TRUE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_FALSE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ true, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ false, second_sew, second_vlmul, + second_ratio) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_TRUE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_TRUE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ true, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ false, greatest_sew, second_vlmul, + second_ratio) + +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_FALSE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ false, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ false, second_sew, second_vlmul, + second_ratio) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_TRUE, + /*RATIO*/ 
DEMAND_FALSE, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_FALSE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ true, + /*NEW_DEMAND_RATIO*/ false, + /*NEW_DEMAND_GE_SEW*/ false, second_sew, first_vlmul, + second_ratio) +DEF_SEW_LMUL_FUSE_RULE (/*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_TRUE, /*GE_SEW*/ DEMAND_TRUE, + /*SEW*/ DEMAND_TRUE, /*LMUL*/ DEMAND_FALSE, + /*RATIO*/ DEMAND_FALSE, /*GE_SEW*/ DEMAND_FALSE, + /*NEW_DEMAND_SEW*/ true, + /*NEW_DEMAND_LMUL*/ false, + /*NEW_DEMAND_RATIO*/ true, + /*NEW_DEMAND_GE_SEW*/ false, second_sew, first_vlmul, + first_ratio) + +/* Define the unavailable cases for LCM. */ + +/* Case 1: Dem1 (Not demand AVL) is unavailable to Dem2 (Demand AVL). */ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_FALSE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ always_unavailable) +/* Case 2: Dem1 (Demand AVL) is unavailable to Dem2 (Demand normal AVL). */ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_TRUE, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ avl_unavailable_p) + +/* Case 3: Dem1 (Not demand TAIL) is unavailable to Dem2 (Demand TAIL). 
*/ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_FALSE, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_TRUE, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ always_unavailable) + +/* Case 4: Dem1 (Not demand MASK) is unavailable to Dem2 (Demand MASK). */ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_FALSE, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_TRUE, + /*COND*/ always_unavailable) + +/* Case 5: Dem1 (Demand RATIO) is unavailable to Dem2 (Demand SEW/GE_SEW/LMUL). 
+ */ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_FALSE, + /*LMUL*/ DEMAND_FALSE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ always_unavailable) +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_FALSE, + /*LMUL*/ DEMAND_FALSE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ always_unavailable) + +/* Case 6: Dem1 (Demand SEW). */ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_FALSE, /*RATIO*/ DEMAND_FALSE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ sew_unavailable_p) + +/* Case 7: Dem1 (Demand LMUL). */ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_FALSE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_FALSE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_FALSE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ lmul_unavailable_p) + +/* Case 8: Dem1 (Demand GE_SEW). 
*/ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_FALSE, /*RATIO*/ DEMAND_FALSE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ ge_sew_unavailable_p) + +/* Case 9: Dem1 (Demand GE_SEW + LMUL). */ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_TRUE, /*RATIO*/ DEMAND_FALSE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ ge_sew_lmul_unavailable_p) + +/* Case 10: Dem1 (Demand GE_SEW + RATIO). 
*/ +DEF_UNAVAILABLE_COND (/*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_TRUE, + /*LMUL*/ DEMAND_FALSE, /*RATIO*/ DEMAND_TRUE, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_TRUE, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*AVL*/ DEMAND_ANY, /*SEW*/ DEMAND_ANY, + /*LMUL*/ DEMAND_ANY, /*RATIO*/ DEMAND_ANY, + /*NONZERO_AVL*/ DEMAND_ANY, /*GE_SEW*/ DEMAND_ANY, + /*TAIL_POLICTY*/ DEMAND_ANY, /*MASK_POLICY*/ DEMAND_ANY, + /*COND*/ ge_sew_ratio_unavailable_p) + +#undef DEF_INCOMPATIBLE_COND +#undef DEF_SEW_LMUL_FUSE_RULE +#undef DEF_UNAVAILABLE_COND diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h index 09df4695518..7b6fadf6269 100644 --- a/gcc/config/riscv/riscv-vsetvl.h +++ b/gcc/config/riscv/riscv-vsetvl.h @@ -47,11 +47,20 @@ enum demand_type DEMAND_SEW, DEMAND_LMUL, DEMAND_RATIO, + DEMAND_NONZERO_AVL, + DEMAND_GE_SEW, DEMAND_TAIL_POLICY, DEMAND_MASK_POLICY, NUM_DEMAND }; +enum demand_status +{ + DEMAND_FALSE, + DEMAND_TRUE, + DEMAND_ANY, +}; + enum fusion_type { INVALID_FUSION, @@ -162,6 +171,14 @@ public: avl_info &operator= (const avl_info &); bool operator== (const avl_info &) const; bool operator!= (const avl_info &) const; + + bool has_avl_imm () const + { + return get_value () && CONST_INT_P (get_value ()); + } + bool has_avl_reg () const { return get_value () && REG_P (get_value ()); } + bool has_avl_no_reg () const { return !get_value (); } + bool has_non_zero_avl () const; }; /* Basic structure to save VL/VTYPE information. 
*/ @@ -197,10 +214,10 @@ public: bool operator== (const vl_vtype_info &) const; bool operator!= (const vl_vtype_info &) const; - bool has_avl_imm () const { return get_avl () && CONST_INT_P (get_avl ()); } - bool has_avl_reg () const { return get_avl () && REG_P (get_avl ()); } - bool has_avl_no_reg () const { return !get_avl (); } - bool has_non_zero_avl () const; + bool has_avl_imm () const { return m_avl.has_avl_imm (); } + bool has_avl_reg () const { return m_avl.has_avl_reg (); } + bool has_avl_no_reg () const { return m_avl.has_avl_no_reg (); } + bool has_non_zero_avl () const { return m_avl.has_non_zero_avl (); }; rtx get_avl () const { return m_avl.get_value (); } const avl_info &get_avl_info () const { return m_avl; } @@ -353,8 +370,14 @@ public: bool demand_p (enum demand_type type) const { return m_demands[type]; } void demand (enum demand_type type) { m_demands[type] = true; } - void demand_vl_vtype (); - void undemand (enum demand_type type) { m_demands[type] = false; } + void set_demand (enum demand_type type, bool value) + { + m_demands[type] = value; + } + void fuse_avl (const vector_insn_info &, const vector_insn_info &); + void fuse_sew_lmul (const vector_insn_info &, const vector_insn_info &); + void fuse_tail_policy (const vector_insn_info &, const vector_insn_info &); + void fuse_mask_policy (const vector_insn_info &, const vector_insn_info &); bool compatible_p (const vector_insn_info &) const; bool compatible_avl_p (const vl_vtype_info &) const; @@ -364,6 +387,11 @@ public: vector_insn_info merge (const vector_insn_info &, enum merge_type) const; rtl_ssa::insn_info *get_insn () const { return m_insn; } + const bool *get_demands (void) const { return m_demands; } + rtx get_avl_reg_rtx (void) const + { + return gen_rtx_REG (Pmode, get_avl_source ()->regno ()); + } void dump (FILE *) const; }; @@ -388,6 +416,8 @@ public: auto_vec vector_insn_infos; auto_vec vector_block_infos; auto_vec vector_exprs; + hash_set to_refine_vsetvls; + hash_set 
to_delete_vsetvls; struct edge_list *vector_edge_list; sbitmap *vector_kill; @@ -426,5 +456,49 @@ public: void dump (FILE *) const; }; +struct demands_pair +{ + demand_status first[NUM_DEMAND]; + demand_status second[NUM_DEMAND]; + bool match_cond_p (const bool *dems1, const bool *dems2) const + { + for (unsigned i = 0; i < NUM_DEMAND; i++) + { + if (first[i] != DEMAND_ANY && first[i] != dems1[i]) + return false; + if (second[i] != DEMAND_ANY && second[i] != dems2[i]) + return false; + } + return true; + } +}; + +struct demands_cond +{ + demands_pair pair; + using CONDITION_TYPE + = bool (*) (const vector_insn_info &, const vector_insn_info &); + CONDITION_TYPE incompatible_p; +}; + +struct demands_fuse_rule +{ + demands_pair pair; + bool demand_sew_p; + bool demand_lmul_p; + bool demand_ratio_p; + bool demand_ge_sew_p; + + using NEW_SEW + = unsigned (*) (const vector_insn_info &, const vector_insn_info &); + using NEW_VLMUL + = vlmul_type (*) (const vector_insn_info &, const vector_insn_info &); + using NEW_RATIO + = unsigned (*) (const vector_insn_info &, const vector_insn_info &); + NEW_SEW new_sew; + NEW_VLMUL new_vlmul; + NEW_RATIO new_ratio; +}; + } // namespace riscv_vector #endif diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index cb817abcfde..a4211c70e51 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -367,11 +367,11 @@ ]) (define_mode_attr VLMUL1 [ - (VNx1QI "VNx8QI") (VNx2QI "VNx8QI") (VNx4QI "VNx8QI") + (VNx1QI "VNx8QI") (VNx2QI "VNx8QI") (VNx4QI "VNx8QI") (VNx8QI "VNx8QI") (VNx16QI "VNx8QI") (VNx32QI "VNx8QI") (VNx64QI "VNx8QI") - (VNx1HI "VNx4HI") (VNx2HI "VNx4HI") (VNx4HI "VNx4HI") + (VNx1HI "VNx4HI") (VNx2HI "VNx4HI") (VNx4HI "VNx4HI") (VNx8HI "VNx4HI") (VNx16HI "VNx4HI") (VNx32HI "VNx4HI") - (VNx1SI "VNx2SI") (VNx2SI "VNx2SI") (VNx4SI "VNx2SI") + (VNx1SI "VNx2SI") (VNx2SI "VNx2SI") (VNx4SI "VNx2SI") (VNx8SI "VNx2SI") (VNx16SI "VNx2SI") (VNx1DI "VNx1DI") 
(VNx2DI "VNx1DI") (VNx4DI "VNx1DI") (VNx8DI "VNx1DI") @@ -382,40 +382,40 @@ ]) (define_mode_attr VLMUL1_ZVE32 [ - (VNx1QI "VNx4QI") (VNx2QI "VNx4QI") (VNx4QI "VNx4QI") + (VNx1QI "VNx4QI") (VNx2QI "VNx4QI") (VNx4QI "VNx4QI") (VNx8QI "VNx4QI") (VNx16QI "VNx4QI") (VNx32QI "VNx4QI") - (VNx1HI "VNx2HI") (VNx2HI "VNx2HI") (VNx4HI "VNx2HI") + (VNx1HI "VNx2HI") (VNx2HI "VNx2HI") (VNx4HI "VNx2HI") (VNx8HI "VNx2HI") (VNx16HI "VNx2HI") - (VNx1SI "VNx1SI") (VNx2SI "VNx1SI") (VNx4SI "VNx1SI") + (VNx1SI "VNx1SI") (VNx2SI "VNx1SI") (VNx4SI "VNx1SI") (VNx8SI "VNx1SI") (VNx1SF "VNx2SF") (VNx2SF "VNx2SF") (VNx4SF "VNx2SF") (VNx8SF "VNx2SF") ]) (define_mode_attr VWLMUL1 [ - (VNx1QI "VNx4HI") (VNx2QI "VNx4HI") (VNx4QI "VNx4HI") + (VNx1QI "VNx4HI") (VNx2QI "VNx4HI") (VNx4QI "VNx4HI") (VNx8QI "VNx4HI") (VNx16QI "VNx4HI") (VNx32QI "VNx4HI") (VNx64QI "VNx4HI") - (VNx1HI "VNx2SI") (VNx2HI "VNx2SI") (VNx4HI "VNx2SI") + (VNx1HI "VNx2SI") (VNx2HI "VNx2SI") (VNx4HI "VNx2SI") (VNx8HI "VNx2SI") (VNx16HI "VNx2SI") (VNx32HI "VNx2SI") - (VNx1SI "VNx1DI") (VNx2SI "VNx1DI") (VNx4SI "VNx1DI") + (VNx1SI "VNx1DI") (VNx2SI "VNx1DI") (VNx4SI "VNx1DI") (VNx8SI "VNx1DI") (VNx16SI "VNx1DI") (VNx1SF "VNx1DF") (VNx2SF "VNx1DF") (VNx4SF "VNx1DF") (VNx8SF "VNx1DF") (VNx16SF "VNx1DF") ]) (define_mode_attr VWLMUL1_ZVE32 [ - (VNx1QI "VNx2HI") (VNx2QI "VNx2HI") (VNx4QI "VNx2HI") + (VNx1QI "VNx2HI") (VNx2QI "VNx2HI") (VNx4QI "VNx2HI") (VNx8QI "VNx2HI") (VNx16QI "VNx2HI") (VNx32QI "VNx2HI") - (VNx1HI "VNx1SI") (VNx2HI "VNx1SI") (VNx4HI "VNx1SI") + (VNx1HI "VNx1SI") (VNx2HI "VNx1SI") (VNx4HI "VNx1SI") (VNx8HI "VNx1SI") (VNx16HI "VNx1SI") ]) (define_mode_attr vlmul1 [ - (VNx1QI "vnx8qi") (VNx2QI "vnx8qi") (VNx4QI "vnx8qi") + (VNx1QI "vnx8qi") (VNx2QI "vnx8qi") (VNx4QI "vnx8qi") (VNx8QI "vnx8qi") (VNx16QI "vnx8qi") (VNx32QI "vnx8qi") (VNx64QI "vnx8qi") - (VNx1HI "vnx4hi") (VNx2HI "vnx4hi") (VNx4HI "vnx4hi") + (VNx1HI "vnx4hi") (VNx2HI "vnx4hi") (VNx4HI "vnx4hi") (VNx8HI "vnx4hi") (VNx16HI "vnx4hi") (VNx32HI "vnx4hi") - 
(VNx1SI "vnx2si") (VNx2SI "vnx2si") (VNx4SI "vnx2si") + (VNx1SI "vnx2si") (VNx2SI "vnx2si") (VNx4SI "vnx2si") (VNx8SI "vnx2si") (VNx16SI "vnx2si") (VNx1DI "vnx1DI") (VNx2DI "vnx1DI") (VNx4DI "vnx1DI") (VNx8DI "vnx1DI") @@ -426,31 +426,31 @@ ]) (define_mode_attr vlmul1_zve32 [ - (VNx1QI "vnx4qi") (VNx2QI "vnx4qi") (VNx4QI "vnx4qi") + (VNx1QI "vnx4qi") (VNx2QI "vnx4qi") (VNx4QI "vnx4qi") (VNx8QI "vnx4qi") (VNx16QI "vnx4qi") (VNx32QI "vnx4qi") - (VNx1HI "vnx2hi") (VNx2HI "vnx2hi") (VNx4HI "vnx2hi") + (VNx1HI "vnx2hi") (VNx2HI "vnx2hi") (VNx4HI "vnx2hi") (VNx8HI "vnx2hi") (VNx16HI "vnx2hi") - (VNx1SI "vnx1si") (VNx2SI "vnx1si") (VNx4SI "vnx1si") + (VNx1SI "vnx1si") (VNx2SI "vnx1si") (VNx4SI "vnx1si") (VNx8SI "vnx1si") (VNx1SF "vnx1sf") (VNx2SF "vnx1sf") (VNx4SF "vnx1sf") (VNx8SF "vnx1sf") ]) (define_mode_attr vwlmul1 [ - (VNx1QI "vnx4hi") (VNx2QI "vnx4hi") (VNx4QI "vnx4hi") + (VNx1QI "vnx4hi") (VNx2QI "vnx4hi") (VNx4QI "vnx4hi") (VNx8QI "vnx4hi") (VNx16QI "vnx4hi") (VNx32QI "vnx4hi") (VNx64QI "vnx4hi") - (VNx1HI "vnx2si") (VNx2HI "vnx2si") (VNx4HI "vnx2si") + (VNx1HI "vnx2si") (VNx2HI "vnx2si") (VNx4HI "vnx2si") (VNx8HI "vnx2si") (VNx16HI "vnx2si") (VNx32HI "vnx2SI") - (VNx1SI "vnx2di") (VNx2SI "vnx2di") (VNx4SI "vnx2di") + (VNx1SI "vnx2di") (VNx2SI "vnx2di") (VNx4SI "vnx2di") (VNx8SI "vnx2di") (VNx16SI "vnx2di") (VNx1SF "vnx1df") (VNx2SF "vnx1df") (VNx4SF "vnx1df") (VNx8SF "vnx1df") (VNx16SF "vnx1df") ]) (define_mode_attr vwlmul1_zve32 [ - (VNx1QI "vnx2hi") (VNx2QI "vnx2hi") (VNx4QI "vnx2hi") + (VNx1QI "vnx2hi") (VNx2QI "vnx2hi") (VNx4QI "vnx2hi") (VNx8QI "vnx2hi") (VNx16QI "vnx2hi") (VNx32QI "vnx2hi") - (VNx1HI "vnx1si") (VNx2HI "vnx1si") (VNx4HI "vnx1si") + (VNx1HI "vnx1si") (VNx2HI "vnx1si") (VNx4HI "vnx1si") (VNx8HI "vnx1si") (VNx16HI "vnx1SI") ]) diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 69b7cafbf17..4d5b7c6e8f2 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -151,8 +151,9 @@ 
vfwalu,vfwmul,vfsqrt,vfrecp,vfsgnj,vfcmp,\ vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\ vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,\ - vfncvtftof,vfmuladd,vfwmuladd,vfclass,vired, - viwred,vfredu,vfredo,vfwredu,vfwredo") + vfncvtftof,vfmuladd,vfwmuladd,vfclass,vired,\ + viwred,vfredu,vfredo,vfwredu,vfwredo,vimovvx,\ + vimovxv,vfmovvf,vfmovfv") (const_int INVALID_ATTRIBUTE) (eq_attr "mode" "VNx1QI,VNx1BI") (symbol_ref "riscv_vector::get_ratio(E_VNx1QImode)") @@ -208,7 +209,7 @@ vmiota,vmidx,vfalu,vfmul,vfminmax,vfdiv,vfwalu,vfwmul,\ vfsqrt,vfrecp,vfsgnj,vfcmp,vfcvtitof,vfcvtftoi,vfwcvtitof,\ vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,vfclass,\ - vired,viwred,vfredu,vfredo,vfwredu,vfwredo") + vired,viwred,vfredu,vfredo,vfwredu,vfwredo,vimovxv,vfmovfv") (const_int 2) (eq_attr "type" "vimerge,vfmerge") @@ -223,7 +224,7 @@ (cond [(eq_attr "type" "vlde,vste,vimov,vfmov,vldm,vstm,vmalu,vsts,vstux,\ vstox,vext,vmsfs,vmiota,vfsqrt,vfrecp,vfcvtitof,\ vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,\ - vfncvtftoi,vfncvtftof,vfclass") + vfncvtftoi,vfncvtftof,vfclass,vimovxv,vfmovfv") (const_int 4) ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast. @@ -250,7 +251,7 @@ (define_attr "ta" "" (cond [(eq_attr "type" "vlde,vimov,vfmov,vext,vmiota,vfsqrt,vfrecp,\ vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,\ - vfncvtitof,vfncvtftoi,vfncvtftof,vfclass") + vfncvtitof,vfncvtftoi,vfncvtftof,vfclass,vimovxv,vfmovfv") (symbol_ref "riscv_vector::get_ta(operands[5])") ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast. 
@@ -306,7 +307,8 @@ (cond [(eq_attr "type" "vlde,vlde,vste,vimov,vimov,vimov,vfmov,vext,vimerge,\ vfsqrt,vfrecp,vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\ vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\ - vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo") + vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\ + vimovxv,vfmovfv") (symbol_ref "INTVAL (operands[7])") (eq_attr "type" "vldm,vstm,vimov,vmalu,vmalu") (symbol_ref "INTVAL (operands[5])") @@ -985,6 +987,8 @@ ;; - 7.5. Vector Strided Instructions (zero stride) ;; - 11.16 Vector Integer Move Instructions (vmv.v.x) ;; - 13.16 Vector Floating-Point Move Instruction (vfmv.v.f) +;; - 16.1 Integer Scalar Move Instructions (vmv.s.x) +;; - 16.2 Floating-Point Scalar Move Instructions (vfmv.s.f) ;; ------------------------------------------------------------------------------- ;; According to RVV ISA, vector-scalar instruction doesn't support @@ -995,25 +999,70 @@ ;; expand --> LICM (Loop invariant) --> split. ;; To use LICM optimization, we postpone generation of vlse.v to split stage since ;; a memory access instruction can not be optimized by LICM (Loop invariant). -(define_insn_and_split "@pred_broadcast" - [(set (match_operand:VI 0 "register_operand" "=vr, vr, vr") +(define_expand "@pred_broadcast" + [(set (match_operand:V 0 "register_operand") + (if_then_else:V + (unspec: + [(match_operand: 1 "vector_broadcast_mask_operand") + (match_operand 4 "vector_length_operand") + (match_operand 5 "const_int_operand") + (match_operand 6 "const_int_operand") + (match_operand 7 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (vec_duplicate:V + (match_operand: 3 "direct_broadcast_operand")) + (match_operand:V 2 "vector_merge_operand")))] + "TARGET_VECTOR" +{ + /* Handle vmv.s.x instruction which has memory scalar. 
*/ + if (satisfies_constraint_Wdm (operands[3]) || riscv_vector::simm5_p (operands[3]) + || rtx_equal_p (operands[3], CONST0_RTX (mode))) + { + if (satisfies_constraint_Wb1 (operands[1])) + { + // Case 1: vmv.s.x (TA) ==> vlse.v (TA) + if (satisfies_constraint_vu (operands[2])) + operands[1] = CONSTM1_RTX (mode); + else if (GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (Pmode)) + { + // Case 2: vmv.s.x (TU) ==> andi vl + vlse.v (TU) in RV32 system. + rtx tmp = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (Pmode, operands[4], const1_rtx))); + operands[4] = tmp; + operands[1] = CONSTM1_RTX (mode); + } + else + operands[3] = force_reg (mode, operands[3]); + } + } + else if (GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (Pmode) + && immediate_operand (operands[3], Pmode)) + operands[3] = gen_rtx_SIGN_EXTEND (mode, force_reg (Pmode, operands[3])); + else + operands[3] = force_reg (mode, operands[3]); +}) + +(define_insn_and_split "*pred_broadcast" + [(set (match_operand:VI 0 "register_operand" "=vr, vd, vr, vr") (if_then_else:VI (unspec: - [(match_operand: 1 "vector_mask_operand" " Wc1, vm, Wc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_broadcast_mask_operand" " Wc1, vm, Wc1, Wb1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (vec_duplicate:VI - (match_operand: 3 "direct_broadcast_operand" " r, Wdm, Wdm")) - (match_operand:VI 2 "vector_merge_operand" "0vu, 0vu, 0vu")))] + (match_operand: 3 "direct_broadcast_operand" " r, Wdm, Wdm, r")) + (match_operand:VI 2 "vector_merge_operand" "0vu, 0vu, 0vu, 0vu")))] "TARGET_VECTOR" "@ 
vmv.v.x\t%0,%3 vlse.v\t%0,%3,zero,%1.t - vlse.v\t%0,%3,zero" + vlse.v\t%0,%3,zero + vmv.s.x\t%0,%3" "register_operand (operands[3], mode) && GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (Pmode)" [(set (match_dup 0) @@ -1030,30 +1079,65 @@ emit_move_insn (m, operands[3]); m = gen_rtx_MEM (mode, force_reg (Pmode, XEXP (m, 0))); operands[3] = m; + + /* For SEW = 64 in RV32 system, we expand vmv.s.x: + andi a2,a2,1 + vsetvl zero,a2,e64 + vlse64.v */ + if (satisfies_constraint_Wb1 (operands[1])) + { + rtx tmp = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (Pmode, operands[4], const1_rtx))); + operands[4] = tmp; + operands[1] = CONSTM1_RTX (mode); + } } - [(set_attr "type" "vimov,vlds,vlds") + [(set_attr "type" "vimov,vlds,vlds,vimovxv") (set_attr "mode" "")]) -(define_insn "@pred_broadcast" - [(set (match_operand:VF 0 "register_operand" "=vr, vr, vr") +(define_insn "*pred_broadcast" + [(set (match_operand:VF 0 "register_operand" "=vr, vr, vr, vr") (if_then_else:VF (unspec: - [(match_operand: 1 "vector_mask_operand" " Wc1, vm, Wc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_broadcast_mask_operand" " Wc1, vm, Wc1, Wb1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (vec_duplicate:VF - (match_operand: 3 "direct_broadcast_operand" " f, Wdm, Wdm")) - (match_operand:VF 2 "vector_merge_operand" "0vu, 0vu, 0vu")))] + (match_operand: 3 "direct_broadcast_operand" " f, Wdm, Wdm, f")) + (match_operand:VF 2 "vector_merge_operand" "0vu, 0vu, 0vu, 0vu")))] "TARGET_VECTOR" "@ vfmv.v.f\t%0,%3 vlse.v\t%0,%3,zero,%1.t - vlse.v\t%0,%3,zero" - 
[(set_attr "type" "vfmov") + vlse.v\t%0,%3,zero + vfmv.s.f\t%0,%3" + [(set_attr "type" "vfmov,vlds,vlds,vfmovfv") + (set_attr "mode" "")]) + +(define_insn "*pred_broadcast_extended_scalar" + [(set (match_operand:VI_D 0 "register_operand" "=vr, vr") + (if_then_else:VI_D + (unspec: + [(match_operand: 1 "vector_broadcast_mask_operand" " Wc1, Wb1") + (match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (vec_duplicate:VI_D + (sign_extend: + (match_operand: 3 "register_operand" " r, r"))) + (match_operand:VI_D 2 "vector_merge_operand" "0vu, 0vu")))] + "TARGET_VECTOR" + "@ + vmv.v.x\t%0,%3 + vmv.s.x\t%0,%3" + [(set_attr "type" "vimov,vimovxv") (set_attr "mode" "")]) ;; ------------------------------------------------------------------------------- @@ -6390,3 +6474,106 @@ "vfwredsum.vs\t%0,%3,%4%p1" [(set_attr "type" "vfwred") (set_attr "mode" "")]) + +;; ------------------------------------------------------------------------------- +;; ---- Predicated permutation operations +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 16.1 Integer Scalar Move Instructions +;; - 16.2 Floating-Point Scalar Move Instructions +;; - 16.3 Vector Slide Instructions +;; - 16.4 Vector Register Gather Instructions +;; - 16.5 Vector Compress Instruction +;; ------------------------------------------------------------------------------- + +(define_expand "@pred_extract_first" + [(set (match_operand: 0 "register_operand") + (unspec: + [(vec_select: + (match_operand:VI 1 "reg_or_mem_operand") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE))] + "TARGET_VECTOR" +{ + if (MEM_P (operands[1])) + { + /* Combine vle.v + vmv.x.s ==> lw. 
*/ + emit_move_insn (operands[0], gen_rtx_MEM (mode, XEXP (operands[1], 0))); + DONE; + } +}) + +(define_insn_and_split "*pred_extract_first" + [(set (match_operand: 0 "register_operand" "=r") + (unspec: + [(vec_select: + (match_operand:VI 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE))] + "TARGET_VECTOR" + "vmv.x.s\t%0,%1" + "known_gt (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (Pmode))" + [(const_int 0)] +{ + /* In rv32 system, we can't use vmv.x.s directly. + Instead, we should generate this following code sequence: + vsrl.vx v16,v8,a0 + vmv.x.s a1,v16 + vmv.x.s a0,v8 */ + rtx nbits = force_reg (Pmode, gen_int_mode (GET_MODE_BITSIZE (Pmode), Pmode)); + rtx high_bits = gen_reg_rtx (mode); + emit_insn (gen_pred_scalar (LSHIFTRT, mode, high_bits, CONSTM1_RTX (mode), + RVV_VUNDEF (mode), operands[1], nbits, /* vl */ const1_rtx, + gen_int_mode (riscv_vector::TAIL_ANY, Pmode), + gen_int_mode (riscv_vector::MASK_ANY, Pmode), + gen_int_mode (riscv_vector::NONVLMAX, Pmode))); + emit_insn (gen_pred_extract_first_trunc (mode, + gen_highpart (SImode, operands[0]), high_bits)); + emit_insn (gen_pred_extract_first_trunc (mode, + gen_lowpart (SImode, operands[0]), operands[1])); + DONE; +} + [(set_attr "type" "vimovvx") + (set_attr "mode" "")]) + +(define_insn "@pred_extract_first_trunc" + [(set (match_operand:SI 0 "register_operand" "=r") + (truncate:SI + (unspec: + [(vec_select: + (match_operand:VI_D 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)))] + "TARGET_VECTOR" + "vmv.x.s\t%0,%1" + [(set_attr "type" "vimovvx") + (set_attr "mode" "")]) + +(define_expand "@pred_extract_first" + [(set (match_operand: 0 "register_operand") + (unspec: + [(vec_select: + (match_operand:VF 1 "reg_or_mem_operand") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE))] + "TARGET_VECTOR" +{ + if (MEM_P (operands[1])) + { + /* Combine vle.v + vmv.f.s ==> flw. 
*/ + emit_move_insn (operands[0], gen_rtx_MEM (mode, XEXP (operands[1], 0))); + DONE; + } +}) + +(define_insn "*pred_extract_first" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: + [(vec_select: + (match_operand:VF 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE))] + "TARGET_VECTOR" + "vfmv.f.s\t%0,%1" + [(set_attr "type" "vfmovvf") + (set_attr "mode" "")]) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c index 5b5a31b0eb6..1204f83e6bd 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c @@ -19,5 +19,5 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c } /* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ -/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ /* { dg-final { scan-assembler-times {vsetvli} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c index e880c2ccdc2..637b0a52295 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c @@ -19,5 +19,5 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c } /* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts 
"-funroll-loops" } } } } */ -/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ /* { dg-final { scan-assembler-times {vsetvli} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c index c141f8bede2..e32994e5212 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c @@ -22,5 +22,5 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c } /* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ -/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ /* { dg-final { scan-assembler-times {vsetvli} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c index 12151bc519b..79a6f271997 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c @@ -18,6 +18,6 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c } } -/* { dg-final { scan-assembler-times 
{vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ /* { dg-final { scan-assembler-times {vsetvli} 3 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ -/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*5} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ \ No newline at end of file +/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*5} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c index e34e2c99e88..8baedb0fef7 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c @@ -16,4 +16,4 @@ void f(int8_t *base, int8_t *out, size_t vl, size_t m, size_t n) { } } -/* { dg-final { scan-assembler-times {vsetvli} 4 { target { no-opts "-O0" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli} 3 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c index 5b5a31b0eb6..1204f83e6bd 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c @@ -19,5 +19,5 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c } /* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts 
"-funroll-loops" } } } } */ -/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ /* { dg-final { scan-assembler-times {vsetvli} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */