From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 7928) id 7EF92385841C; Fri, 20 Oct 2023 03:56:56 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7EF92385841C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1697774216; bh=oshDyv4PJUqTfdn7oSFhm1E42y6nxsIyCtrL95thKlM=; h=From:To:Subject:Date:From; b=UoKOV6Cdq2jGxWyqzTBaIyXh/iy3Zh9tj6MVCL1hjMbKWJCKtmQ2bY5M4eH5DbLNs OMn/Xb+AzoQ9caryl9qd8sKUzxfBvwXYhsihDDYZYlc7FZ7n5TtCl9MuNtIOzOg2vq E3QsXRRDX9NJy+gXB+aFMxEHWcErqaYboLbcaDns=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Lehua Ding
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r14-4773] RISC-V: Refactor and cleanup vsetvl pass
X-Act-Checkin: gcc
X-Git-Author: Lehua Ding
X-Git-Refname: refs/heads/trunk
X-Git-Oldrev: df252e0f254638c44034f1879db34286c88a92c2
X-Git-Newrev: 29331e72d0ce9fe8aabdeb8c320b99943b9e067a
Message-Id: <20231020035656.7EF92385841C@sourceware.org>
Date: Fri, 20 Oct 2023 03:56:56 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:29331e72d0ce9fe8aabdeb8c320b99943b9e067a

commit r14-4773-g29331e72d0ce9fe8aabdeb8c320b99943b9e067a
Author: Lehua Ding
Date:   Fri Oct 20 10:22:43 2023 +0800

    RISC-V: Refactor and cleanup vsetvl pass

    This patch refactors and cleans up the vsetvl pass in order to make the
    code easier to modify and understand.  It does several things:

    1. Introduce a virtual CFG for vsetvl infos; Phases 1, 2 and 3 only
       maintain and modify this virtual CFG.  Phase 4 performs insertion,
       modification and deletion of vsetvl insns based on the virtual CFG.
       A basic block in the virtual CFG is called vsetvl_block_info and the
       vsetvl information inside it is called vsetvl_info.
    2. Combine Phases 1 and 2 into a single Phase 1 and unify the demand
       system; this phase only fuses local vsetvl info in the forward
       direction.
    3. Refactor Phase 3: whether a vsetvl info is uplifted to a predecessor
       basic block is now decided by a more unified criterion, namely that a
       compatible vsetvl info exists among the vsetvl definitions reaching
       that block.
    4. Place all modifications of the RTL in Phase 4 and Phase 5.  Phase 4 is
       responsible for inserting, modifying and deleting vsetvl instructions
       based on the fully optimized vsetvl infos.  Phase 5 removes the avl
       operand from the RVV instructions and removes the unused dest operand
       register from the vsetvl insns.

    These modifications required some testcases to be updated.  The reasons
    for the updates are summarized below:

    1. more optimized
       vlmax_back_prop-{25,26}.c
       vlmax_conflict-{3,12}.c/vsetvl-{13,23}.c/vsetvl-23.c/
       avl_single-{23,84,95}.c/pr109773-1.c
    2. less unnecessary fusion
       avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
    3. local fuse direction (backward -> forward)
       scalar_move-1.c
    4. add some bugfix testcases.
       pr111037-{3,4}.c/pr111037-4.c
       avl_single-{89,104,105,106,107,108,109}.c

    PR target/111037
    PR target/111234
    PR target/111725

gcc/ChangeLog:

    * config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry): New. (debug): Removed. (compute_reaching_defintion): New. (enum vsetvl_type): Moved. (vlmax_avl_p): Moved. (enum emit_type): Moved. (vlmul_to_str): Moved. (vlmax_avl_insn_p): Removed. (policy_to_str): Moved. (loop_basic_block_p): Removed. (valid_sew_p): Removed. (vsetvl_insn_p): Moved. (vsetvl_vtype_change_only_p): Removed. (after_or_same_p): Removed. (before_p): Removed. (anticipatable_occurrence_p): Removed. (available_occurrence_p): Removed. (insn_should_be_added_p): Removed.
(get_all_sets): Moved. (get_same_bb_set): Moved. (gen_vsetvl_pat): Removed. (calculate_vlmul): Moved. (get_max_int_sew): New. (emit_vsetvl_insn): Removed. (get_max_float_sew): New. (eliminate_insn): Removed. (insert_vsetvl): Removed. (count_regno_occurrences): Moved. (get_vl_vtype_info): Removed. (enum def_type): Moved. (validate_change_or_fail): Moved. (change_insn): Removed. (get_all_real_uses): Moved. (get_forward_read_vl_insn): Removed. (get_backward_fault_first_load_insn): Removed. (change_vsetvl_insn): Removed. (avl_source_has_vsetvl_p): Removed. (source_equal_p): Moved. (calculate_sew): Removed. (same_equiv_note_p): Moved. (get_expr_id): New. (incompatible_avl_p): Removed. (get_regno): New. (different_sew_p): Removed. (get_bb_index): New. (different_lmul_p): Removed. (has_no_uses): Moved. (different_ratio_p): Removed. (different_tail_policy_p): Removed. (different_mask_policy_p): Removed. (possible_zero_avl_p): Removed. (enum demand_flags): New. (second_ratio_invalid_for_first_sew_p): Removed. (second_ratio_invalid_for_first_lmul_p): Removed. (enum class): New. (float_insn_valid_sew_p): Removed. (second_sew_less_than_first_sew_p): Removed. (first_sew_less_than_second_sew_p): Removed. (class vsetvl_info): New. (compare_lmul): Removed. (second_lmul_less_than_first_lmul_p): Removed. (second_ratio_less_than_first_ratio_p): Removed. (DEF_INCOMPATIBLE_COND): Removed. (greatest_sew): Removed. (first_sew): Removed. (second_sew): Removed. (first_vlmul): Removed. (second_vlmul): Removed. (first_ratio): Removed. (second_ratio): Removed. (vlmul_for_first_sew_second_ratio): Removed. (vlmul_for_greatest_sew_second_ratio): Removed. (ratio_for_second_sew_first_vlmul): Removed. (class vsetvl_block_info): New. (DEF_SEW_LMUL_FUSE_RULE): New. (always_unavailable): Removed. (avl_unavailable_p): Removed. (class demand_system): New. (sew_unavailable_p): Removed. (lmul_unavailable_p): Removed. (ge_sew_unavailable_p): Removed. (ge_sew_lmul_unavailable_p): Removed. (ge_sew_ratio_unavailable_p): Removed. (DEF_UNAVAILABLE_COND): Removed. (same_sew_lmul_demand_p): Removed. (propagate_avl_across_demands_p): Removed. (reg_available_p): Removed. (support_relaxed_compatible_p): Removed. (demands_can_be_fused_p): Removed. (earliest_pred_can_be_fused_p): Removed. (vsetvl_dominated_by_p): Removed. (avl_info::avl_info): Removed. (avl_info::single_source_equal_p): Removed. (avl_info::multiple_source_equal_p): Removed. (DEF_SEW_LMUL_RULE): New. (avl_info::operator=): Removed. (avl_info::operator==): Removed. (DEF_POLICY_RULE): New. (avl_info::operator!=): Removed. (avl_info::has_non_zero_avl): Removed. (vl_vtype_info::vl_vtype_info): Removed. (vl_vtype_info::operator==): Removed. (DEF_AVL_RULE): New. (vl_vtype_info::operator!=): Removed. (vl_vtype_info::same_avl_p): Removed. (vl_vtype_info::same_vtype_p): Removed. (vl_vtype_info::same_vlmax_p): Removed. (vector_insn_info::operator>=): Removed. (vector_insn_info::operator==): Removed. (class pre_vsetvl): New. (vector_insn_info::parse_insn): Removed. (vector_insn_info::compatible_p): Removed. (vector_insn_info::skip_avl_compatible_p): Removed. (vector_insn_info::compatible_avl_p): Removed. (vector_insn_info::compatible_vtype_p): Removed. (vector_insn_info::available_p): Removed. (vector_insn_info::fuse_avl): Removed. (vector_insn_info::fuse_sew_lmul): Removed. (vector_insn_info::fuse_tail_policy): Removed. (vector_insn_info::fuse_mask_policy): Removed. (vector_insn_info::local_merge): Removed. (vector_insn_info::global_merge): Removed. 
(vector_insn_info::get_avl_or_vl_reg): Removed. (vector_insn_info::update_fault_first_load_avl): Removed. (vector_insn_info::dump): Removed. (vector_infos_manager::vector_infos_manager): Removed. (vector_infos_manager::create_expr): Removed. (vector_infos_manager::get_expr_id): Removed. (vector_infos_manager::all_same_ratio_p): Removed. (vector_infos_manager::all_avail_in_compatible_p): Removed. (vector_infos_manager::all_same_avl_p): Removed. (vector_infos_manager::expr_set_num): Removed. (vector_infos_manager::release): Removed. (vector_infos_manager::create_bitmap_vectors): Removed. (vector_infos_manager::free_bitmap_vectors): Removed. (vector_infos_manager::dump): Removed. (class pass_vsetvl): Adjust. (pass_vsetvl::get_vector_info): Removed. (pass_vsetvl::get_block_info): Removed. (pass_vsetvl::update_vector_info): Removed. (pass_vsetvl::update_block_info): Removed. (pre_vsetvl::compute_avl_def_data): New. (pass_vsetvl::simple_vsetvl): Removed. (pass_vsetvl::compute_local_backward_infos): Removed. (pass_vsetvl::need_vsetvl): Removed. (pass_vsetvl::transfer_before): Removed. (pass_vsetvl::transfer_after): Removed. (pre_vsetvl::compute_vsetvl_def_data): New. (pass_vsetvl::emit_local_forward_vsetvls): Removed. (pass_vsetvl::prune_expressions): Removed. (pass_vsetvl::compute_local_properties): Removed. (pre_vsetvl::compute_lcm_local_properties): New. (pass_vsetvl::earliest_fusion): Removed. (pre_vsetvl::fuse_local_vsetvl_info): New. (pass_vsetvl::vsetvl_fusion): Removed. (pass_vsetvl::can_refine_vsetvl_p): Removed. (pre_vsetvl::earliest_fuse_vsetvl_info): New. (pass_vsetvl::refine_vsetvls): Removed. (pass_vsetvl::cleanup_vsetvls): Removed. (pass_vsetvl::commit_vsetvls): Removed. (pass_vsetvl::pre_vsetvl): Removed. (pass_vsetvl::get_vsetvl_at_end): Removed. (local_avl_compatible_p): Removed. (pass_vsetvl::local_eliminate_vsetvl_insn): Removed. (pre_vsetvl::pre_global_vsetvl_info): New. (get_first_vsetvl_before_rvv_insns): Removed. (pass_vsetvl::global_eliminate_vsetvl_insn): Removed. (pre_vsetvl::emit_vsetvl): New. (pass_vsetvl::ssa_post_optimization): Removed. (pre_vsetvl::cleaup): New. (pre_vsetvl::remove_avl_operand): New. (pass_vsetvl::df_post_optimization): Removed. (pre_vsetvl::remove_unused_dest_operand): New. (pass_vsetvl::init): Removed. (pass_vsetvl::done): Removed. (pass_vsetvl::compute_probabilities): Removed. (pass_vsetvl::lazy_vsetvl): Adjust. (pass_vsetvl::execute): Adjust. * config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed. (DEF_SEW_LMUL_RULE): New. (DEF_SEW_LMUL_FUSE_RULE): Removed. (DEF_POLICY_RULE): New. (DEF_UNAVAILABLE_COND): Removed (DEF_AVL_RULE): New demand type. (sew_lmul): New demand type. (ratio_only): New demand type. (sew_only): New demand type. (ge_sew): New demand type. (ratio_and_ge_sew): New demand type. (tail_mask_policy): New demand type. (tail_policy_only): New demand type. (mask_policy_only): New demand type. (ignore_policy): New demand type. (avl): New demand type. (non_zero_avl): New demand type. (ignore_avl): New demand type. * config/riscv/t-riscv: Removed riscv-vsetvl.h * config/riscv/riscv-vsetvl.h: Removed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust. * gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust. 
* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust. * gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust. * gcc.target/riscv/rvv/base/pr111037-1.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here. * gcc.target/riscv/rvv/base/pr111037-2.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-106.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-107.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-108.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-109.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test. Diff: --- gcc/config/riscv/riscv-vsetvl.cc | 6502 +++++++++----------- gcc/config/riscv/riscv-vsetvl.def | 641 +- gcc/config/riscv/riscv-vsetvl.h | 488 -- gcc/config/riscv/t-riscv | 2 +- .../gcc.target/riscv/rvv/base/scalar_move-1.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/avl_single-104.c | 35 + .../gcc.target/riscv/rvv/vsetvl/avl_single-105.c | 23 + .../gcc.target/riscv/rvv/vsetvl/avl_single-106.c | 34 + .../gcc.target/riscv/rvv/vsetvl/avl_single-107.c | 41 + .../gcc.target/riscv/rvv/vsetvl/avl_single-108.c | 41 + .../gcc.target/riscv/rvv/vsetvl/avl_single-109.c | 45 + .../gcc.target/riscv/rvv/vsetvl/avl_single-23.c | 7 +- .../gcc.target/riscv/rvv/vsetvl/avl_single-46.c | 3 +- .../gcc.target/riscv/rvv/vsetvl/avl_single-84.c | 5 +- .../gcc.target/riscv/rvv/vsetvl/avl_single-89.c | 8 +- .../gcc.target/riscv/rvv/vsetvl/avl_single-95.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c | 7 +- .../gcc.target/riscv/rvv/vsetvl/pr109743-2.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/pr109773-1.c | 2 +- .../riscv/rvv/{base => vsetvl}/pr111037-1.c | 0 .../riscv/rvv/{base => vsetvl}/pr111037-2.c | 0 .../gcc.target/riscv/rvv/vsetvl/pr111037-3.c | 16 + .../gcc.target/riscv/rvv/vsetvl/pr111037-4.c | 16 + .../riscv/rvv/vsetvl/vlmax_back_prop-25.c | 10 +- .../riscv/rvv/vsetvl/vlmax_back_prop-26.c | 10 +- .../riscv/rvv/vsetvl/vlmax_conflict-12.c | 1 - .../gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/vsetvl-13.c | 4 +- .../gcc.target/riscv/rvv/vsetvl/vsetvl-18.c | 4 +- .../gcc.target/riscv/rvv/vsetvl/vsetvl-23.c | 2 +- 30 files changed, 3263 insertions(+), 4692 deletions(-) diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc index 4b06d93e7f90..e136351aee5f 100644 --- a/gcc/config/riscv/riscv-vsetvl.cc +++ b/gcc/config/riscv/riscv-vsetvl.cc @@ -18,60 +18,47 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see . */ -/* This pass is to Set VL/VTYPE global status for RVV instructions - that depend on VL and VTYPE registers by Lazy code motion (LCM). - - Strategy: - - - Backward demanded info fusion within block. - - - Lazy code motion (LCM) based demanded info backward propagation. - - - RTL_SSA framework for def-use, PHI analysis. - - - Lazy code motion (LCM) for global VL/VTYPE optimization. 
- - Assumption: - - - Each avl operand is either an immediate (must be in range 0 ~ 31) or reg. - - This pass consists of 5 phases: - - - Phase 1 - compute VL/VTYPE demanded information within each block - by backward data-flow analysis. - - - Phase 2 - Emit vsetvl instructions within each basic block according to - demand, compute and save ANTLOC && AVLOC of each block. - - - Phase 3 - LCM Earliest-edge baseed VSETVL demand fusion. - - - Phase 4 - Lazy code motion including: compute local properties, - pre_edge_lcm and vsetvl insertion && delete edges for LCM results. - - - Phase 5 - Cleanup AVL operand of RVV instruction since it will not be - used any more and VL operand of VSETVL instruction if it is not used by - any non-debug instructions. - - - Phase 6 - DF based post VSETVL optimizations. - - Implementation: - - - The subroutine of optimize == 0 is simple_vsetvl. - This function simplily vsetvl insertion for each RVV - instruction. No optimization. - - - The subroutine of optimize > 0 is lazy_vsetvl. - This function optimize vsetvl insertion process by - lazy code motion (LCM) layering on RTL_SSA. - - - get_avl (), get_insn (), get_avl_source (): - - 1. get_insn () is the current instruction, find_access (get_insn - ())->def is the same as get_avl_source () if get_insn () demand VL. - 2. If get_avl () is non-VLMAX REG, get_avl () == get_avl_source - ()->regno (). - 3. get_avl_source ()->regno () is the REGNO that we backward propagate. - */ +/* The values of the vl and vtype registers will affect the behavior of RVV + insns. That is, when we need to execute an RVV instruction, we need to set + the correct vl and vtype values by executing the vsetvl instruction before. + Executing the fewest number of vsetvl instructions while keeping the behavior + the same is the problem this pass is trying to solve. This vsetvl pass is + divided into 5 phases: + + - Phase 1 (fuse local vsetvl infos): traverses each Basic Block, parses + each instruction in it that affects vl and vtype state and generates an + array of vsetvl_info objects. Then traverse the vsetvl_info array from + front to back and perform fusion according to the fusion rules. The fused + vsetvl infos are stored in the vsetvl_block_info object's `infos` field. + + - Phase 2 (earliest fuse global vsetvl infos): The header_info and + footer_info of vsetvl_block_info are used as expressions, and the + earliest of each expression is computed. Based on the earliest + information, try to lift up the corresponding vsetvl info to the src + basic block of the edge (mainly to reduce the total number of vsetvl + instructions, this uplift will cause some execution paths to execute + vsetvl instructions that shouldn't be there). + + - Phase 3 (pre global vsetvl info): The header_info and footer_info of + vsetvl_block_info are used as expressions, and the LCM algorithm is used + to compute the header_info that needs to be deleted and the one that + needs to be inserted in some edges. + + - Phase 4 (emit vsetvl insns) : Based on the fusion result of Phase 1 and + the deletion and insertion information of Phase 3, the mandatory vsetvl + instruction insertion, modification and deletion are performed. + + - Phase 5 (cleanup): Clean up the avl operand in the RVV operator + instruction and cleanup the unused dest operand of the vsetvl insn. + + After the Phase 1 a virtual CFG of vsetvl_info is generated. The virtual + basic block is represented by vsetvl_block_info, and the virtual vsetvl + statements inside are represented by vsetvl_info. 
The later phases 2 and 3 + are constantly modifying and adjusting this virtual CFG. Phase 4 performs + insertion, modification and deletion of vsetvl instructions based on the + optimized virtual CFG. The Phase 1, 2 and 3 do not involve modifications to + the RTL. +*/ #define IN_TARGET_CODE 1 #define INCLUDE_ALGORITHM @@ -98,61 +85,180 @@ along with GCC; see the file COPYING3. If not see #include "predict.h" #include "profile-count.h" #include "gcse.h" -#include "riscv-vsetvl.h" using namespace rtl_ssa; using namespace riscv_vector; -static CONSTEXPR const unsigned ALL_SEW[] = {8, 16, 32, 64}; -static CONSTEXPR const vlmul_type ALL_LMUL[] - = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2}; +/* Set the bitmap DST to the union of SRC of predecessors of + basic block B. + It's a bit different from bitmap_union_of_preds in cfganal.cc. This function + takes into account the case where pred is ENTRY basic block. The main reason + for this difference is to make it easier to insert some special value into + the ENTRY base block. For example, vsetvl_info with a status of UNKNOW. */ +static void +bitmap_union_of_preds_with_entry (sbitmap dst, sbitmap *src, basic_block b) +{ + unsigned int set_size = dst->size; + edge e; + unsigned ix; + + for (ix = 0; ix < EDGE_COUNT (b->preds); ix++) + { + e = EDGE_PRED (b, ix); + bitmap_copy (dst, src[e->src->index]); + break; + } -DEBUG_FUNCTION void -debug (const vector_insn_info *info) + if (ix == EDGE_COUNT (b->preds)) + bitmap_clear (dst); + else + for (ix++; ix < EDGE_COUNT (b->preds); ix++) + { + unsigned int i; + SBITMAP_ELT_TYPE *p, *r; + + e = EDGE_PRED (b, ix); + p = src[e->src->index]->elms; + r = dst->elms; + for (i = 0; i < set_size; i++) + *r++ |= *p++; + } +} + +/* Compute the reaching defintion in and out based on the gen and KILL + informations in each Base Blocks. + This function references the compute_avaiable implementation in lcm.cc */ +static void +compute_reaching_defintion (sbitmap *gen, sbitmap *kill, sbitmap *in, + sbitmap *out) { - info->dump (stderr); + edge e; + basic_block *worklist, *qin, *qout, *qend, bb; + unsigned int qlen; + edge_iterator ei; + + /* Allocate a worklist array/queue. Entries are only added to the + list if they were not already on the list. So the size is + bounded by the number of basic blocks. */ + qin = qout = worklist + = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS); + + /* Put every block on the worklist; this is necessary because of the + optimistic initialization of AVOUT above. Use reverse postorder + to make the forward dataflow problem require less iterations. */ + int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS); + int n = pre_and_rev_post_order_compute_fn (cfun, NULL, rpo, false); + for (int i = 0; i < n; ++i) + { + bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]); + *qin++ = bb; + bb->aux = bb; + } + free (rpo); + + qin = worklist; + qend = &worklist[n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS]; + qlen = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS; + + /* Mark blocks which are successors of the entry block so that we + can easily identify them below. */ + FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs) + e->dest->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun); + + /* Iterate until the worklist is empty. */ + while (qlen) + { + /* Take the first entry off the worklist. */ + bb = *qout++; + qlen--; + + if (qout >= qend) + qout = worklist; + + /* Do not clear the aux field for blocks which are successors of the + ENTRY block. 
That way we never add then to the worklist again. */ + if (bb->aux != ENTRY_BLOCK_PTR_FOR_FN (cfun)) + bb->aux = NULL; + + bitmap_union_of_preds_with_entry (in[bb->index], out, bb); + + if (bitmap_ior_and_compl (out[bb->index], gen[bb->index], in[bb->index], + kill[bb->index])) + /* If the out state of this block changed, then we need + to add the successors of this block to the worklist + if they are not already on the worklist. */ + FOR_EACH_EDGE (e, ei, bb->succs) + if (!e->dest->aux && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)) + { + *qin++ = e->dest; + e->dest->aux = e; + qlen++; + + if (qin >= qend) + qin = worklist; + } + } + + clear_aux_for_edges (); + clear_aux_for_blocks (); + free (worklist); } -DEBUG_FUNCTION void -debug (const vector_infos_manager *info) +/* Classification of vsetvl instruction. */ +enum vsetvl_type { - info->dump (stderr); -} + VSETVL_NORMAL, + VSETVL_VTYPE_CHANGE_ONLY, + VSETVL_DISCARD_RESULT, + NUM_VSETVL_TYPE +}; -static bool -vlmax_avl_p (rtx x) +enum emit_type { - return x && rtx_equal_p (x, RVV_VLMAX); + /* emit_insn directly. */ + EMIT_DIRECT, + EMIT_BEFORE, + EMIT_AFTER, +}; + +/* dump helper functions */ +static const char * +vlmul_to_str (vlmul_type vlmul) +{ + switch (vlmul) + { + case LMUL_1: + return "m1"; + case LMUL_2: + return "m2"; + case LMUL_4: + return "m4"; + case LMUL_8: + return "m8"; + case LMUL_RESERVED: + return "INVALID LMUL"; + case LMUL_F8: + return "mf8"; + case LMUL_F4: + return "mf4"; + case LMUL_F2: + return "mf2"; + + default: + gcc_unreachable (); + } } -static bool -vlmax_avl_insn_p (rtx_insn *rinsn) +static const char * +policy_to_str (bool agnostic_p) { - return (INSN_CODE (rinsn) == CODE_FOR_vlmax_avlsi - || INSN_CODE (rinsn) == CODE_FOR_vlmax_avldi); + return agnostic_p ? "agnostic" : "undisturbed"; } -/* Return true if the block is a loop itself: - local_dem - __________ - ____|____ | - | | | - |________| | - |_________| - reaching_out -*/ static bool -loop_basic_block_p (const basic_block cfg_bb) +vlmax_avl_p (rtx x) { - if (JUMP_P (BB_END (cfg_bb)) && any_condjump_p (BB_END (cfg_bb))) - { - edge e; - edge_iterator ei; - FOR_EACH_EDGE (e, ei, cfg_bb->succs) - if (e->dest->index == cfg_bb->index) - return true; - } - return false; + return x && rtx_equal_p (x, RVV_VLMAX); } /* Return true if it is an RVV instruction depends on VTYPE global @@ -171,13 +277,6 @@ has_vl_op (rtx_insn *rinsn) return recog_memoized (rinsn) >= 0 && get_attr_has_vl_op (rinsn); } -/* Is this a SEW value that can be encoded into the VTYPE format. */ -static bool -valid_sew_p (size_t sew) -{ - return exact_log2 (sew) && sew >= 8 && sew <= 64; -} - /* Return true if the instruction ignores VLMUL field of VTYPE. */ static bool ignore_vlmul_insn_p (rtx_insn *rinsn) @@ -223,7 +322,7 @@ vector_config_insn_p (rtx_insn *rinsn) static bool vsetvl_insn_p (rtx_insn *rinsn) { - if (!vector_config_insn_p (rinsn)) + if (!rinsn || !vector_config_insn_p (rinsn)) return false; return (INSN_CODE (rinsn) == CODE_FOR_vsetvldi || INSN_CODE (rinsn) == CODE_FOR_vsetvlsi); @@ -239,34 +338,13 @@ vsetvl_discard_result_insn_p (rtx_insn *rinsn) || INSN_CODE (rinsn) == CODE_FOR_vsetvl_discard_resultsi); } -/* Return true if it is vsetvl zero, zero. 
*/ -static bool -vsetvl_vtype_change_only_p (rtx_insn *rinsn) -{ - if (!vector_config_insn_p (rinsn)) - return false; - return (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only); -} - -static bool -after_or_same_p (const insn_info *insn1, const insn_info *insn2) -{ - return insn1->compare_with (insn2) >= 0; -} - static bool real_insn_and_same_bb_p (const insn_info *insn, const bb_info *bb) { return insn != nullptr && insn->is_real () && insn->bb () == bb; } -static bool -before_p (const insn_info *insn1, const insn_info *insn2) -{ - return insn1->compare_with (insn2) < 0; -} - -/* Helper function to get VL operand. */ +/* Helper function to get VL operand for VLMAX insn. */ static rtx get_vl (rtx_insn *rinsn) { @@ -278,224 +356,6 @@ get_vl (rtx_insn *rinsn) return SET_DEST (XVECEXP (PATTERN (rinsn), 0, 0)); } -/* An "anticipatable occurrence" is one that is the first occurrence in the - basic block, the operands are not modified in the basic block prior - to the occurrence and the output is not used between the start of - the block and the occurrence. - - For VSETVL instruction, we have these following formats: - 1. vsetvl zero, rs1. - 2. vsetvl zero, imm. - 3. vsetvl rd, rs1. - - So base on these circumstances, a DEM is considered as a local anticipatable - occurrence should satisfy these following conditions: - - 1). rs1 (avl) are not modified in the basic block prior to the VSETVL. - 2). rd (vl) are not modified in the basic block prior to the VSETVL. - 3). rd (vl) is not used between the start of the block and the occurrence. - - Note: We don't need to check VL/VTYPE here since DEM is UNKNOWN if VL/VTYPE - is modified prior to the occurrence. This case is already considered as - a non-local anticipatable occurrence. -*/ -static bool -anticipatable_occurrence_p (const bb_info *bb, const vector_insn_info dem) -{ - insn_info *insn = dem.get_insn (); - /* The only possible operand we care of VSETVL is AVL. */ - if (dem.has_avl_reg ()) - { - /* rs1 (avl) are not modified in the basic block prior to the VSETVL. */ - rtx avl = dem.get_avl_or_vl_reg (); - if (dem.dirty_p ()) - { - gcc_assert (!vsetvl_insn_p (insn->rtl ())); - - /* Earliest VSETVL will be inserted at the end of the block. */ - for (const insn_info *i : bb->real_nondebug_insns ()) - { - /* rs1 (avl) are not modified in the basic block prior to the - VSETVL. */ - if (find_access (i->defs (), REGNO (avl))) - return false; - if (vlmax_avl_p (dem.get_avl ())) - { - /* rd (avl) is not used between the start of the block and - the occurrence. Note: Only for Dirty and VLMAX-avl. */ - if (find_access (i->uses (), REGNO (avl))) - return false; - } - } - - return true; - } - else if (!vlmax_avl_p (avl)) - { - set_info *set = dem.get_avl_source (); - /* If it's undefined, it's not anticipatable conservatively. */ - if (!set) - return false; - if (real_insn_and_same_bb_p (set->insn (), bb) - && before_p (set->insn (), insn)) - return false; - for (insn_info *i = insn->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ()) - { - /* rs1 (avl) are not modified in the basic block prior to the - VSETVL. */ - if (find_access (i->defs (), REGNO (avl))) - return false; - } - } - } - - /* rd (vl) is not used between the start of the block and the occurrence. 
*/ - if (vsetvl_insn_p (insn->rtl ())) - { - rtx dest = get_vl (insn->rtl ()); - for (insn_info *i = insn->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ()) - { - /* rd (vl) is not used between the start of the block and the - * occurrence. */ - if (find_access (i->uses (), REGNO (dest))) - return false; - /* rd (vl) are not modified in the basic block prior to the VSETVL. */ - if (find_access (i->defs (), REGNO (dest))) - return false; - } - } - - return true; -} - -/* An "available occurrence" is one that is the last occurrence in the - basic block and the operands are not modified by following statements in - the basic block [including this insn]. - - For VSETVL instruction, we have these following formats: - 1. vsetvl zero, rs1. - 2. vsetvl zero, imm. - 3. vsetvl rd, rs1. - - So base on these circumstances, a DEM is considered as a local available - occurrence should satisfy these following conditions: - - 1). rs1 (avl) are not modified by following statements in - the basic block. - 2). rd (vl) are not modified by following statements in - the basic block. - - Note: We don't need to check VL/VTYPE here since DEM is UNKNOWN if VL/VTYPE - is modified prior to the occurrence. This case is already considered as - a non-local available occurrence. -*/ -static bool -available_occurrence_p (const bb_info *bb, const vector_insn_info dem) -{ - insn_info *insn = dem.get_insn (); - /* The only possible operand we care of VSETVL is AVL. */ - if (dem.has_avl_reg ()) - { - if (!vlmax_avl_p (dem.get_avl ())) - { - rtx dest = NULL_RTX; - insn_info *i = insn; - if (vsetvl_insn_p (insn->rtl ())) - { - dest = get_vl (insn->rtl ()); - /* For user vsetvl a2, a2 instruction, we consider it as - available even though it modifies "a2". */ - i = i->next_nondebug_insn (); - } - for (; real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) - { - if (read_vl_insn_p (i->rtl ())) - continue; - /* rs1 (avl) are not modified by following statements in - the basic block. */ - if (find_access (i->defs (), REGNO (dem.get_avl ()))) - return false; - /* rd (vl) are not modified by following statements in - the basic block. */ - if (dest && find_access (i->defs (), REGNO (dest))) - return false; - } - } - } - return true; -} - -static bool -insn_should_be_added_p (const insn_info *insn, unsigned int types) -{ - if (insn->is_real () && (types & REAL_SET)) - return true; - if (insn->is_phi () && (types & PHI_SET)) - return true; - if (insn->is_bb_head () && (types & BB_HEAD_SET)) - return true; - if (insn->is_bb_end () && (types & BB_END_SET)) - return true; - return false; -} - -/* Recursively find all define instructions. The kind of instruction is - specified by the DEF_TYPE. 
*/ -static hash_set -get_all_sets (phi_info *phi, unsigned int types) -{ - hash_set insns; - auto_vec work_list; - hash_set visited_list; - if (!phi) - return hash_set (); - work_list.safe_push (phi); - - while (!work_list.is_empty ()) - { - phi_info *phi = work_list.pop (); - visited_list.add (phi); - for (use_info *use : phi->inputs ()) - { - def_info *def = use->def (); - set_info *set = safe_dyn_cast (def); - if (!set) - return hash_set (); - - gcc_assert (!set->insn ()->is_debug_insn ()); - - if (insn_should_be_added_p (set->insn (), types)) - insns.add (set); - if (set->insn ()->is_phi ()) - { - phi_info *new_phi = as_a (set); - if (!visited_list.contains (new_phi)) - work_list.safe_push (new_phi); - } - } - } - return insns; -} - -static hash_set -get_all_sets (set_info *set, bool /* get_real_inst */ real_p, - bool /*get_phi*/ phi_p, bool /* get_function_parameter*/ param_p) -{ - if (real_p && phi_p && param_p) - return get_all_sets (safe_dyn_cast (set), - REAL_SET | PHI_SET | BB_HEAD_SET | BB_END_SET); - - else if (real_p && param_p) - return get_all_sets (safe_dyn_cast (set), - REAL_SET | BB_HEAD_SET | BB_END_SET); - - else if (real_p) - return get_all_sets (safe_dyn_cast (set), REAL_SET); - return hash_set (); -} - /* Helper function to get AVL operand. */ static rtx get_avl (rtx_insn *rinsn) @@ -511,15 +371,6 @@ get_avl (rtx_insn *rinsn) return recog_data.operand[get_attr_vl_op_idx (rinsn)]; } -static set_info * -get_same_bb_set (hash_set &sets, const basic_block cfg_bb) -{ - for (set_info *set : sets) - if (set->bb ()->cfg_bb () == cfg_bb) - return set; - return nullptr; -} - /* Helper function to get SEW operand. We always have SEW value for all RVV instructions that have VTYPE OP. */ static uint8_t @@ -589,365 +440,174 @@ has_vector_insn (function *fn) return false; } -/* Emit vsetvl instruction. */ -static rtx -gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl) +static vlmul_type +calculate_vlmul (unsigned int sew, unsigned int ratio) { - rtx avl = info.get_avl (); - /* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s, - set the value of avl to (const_int 0) so that VSETVL PASS will - insert vsetvl correctly.*/ - if (info.has_avl_no_reg ()) - avl = GEN_INT (0); - rtx sew = gen_int_mode (info.get_sew (), Pmode); - rtx vlmul = gen_int_mode (info.get_vlmul (), Pmode); - rtx ta = gen_int_mode (info.get_ta (), Pmode); - rtx ma = gen_int_mode (info.get_ma (), Pmode); - - if (insn_type == VSETVL_NORMAL) - { - gcc_assert (vl != NULL_RTX); - return gen_vsetvl (Pmode, vl, avl, sew, vlmul, ta, ma); - } - else if (insn_type == VSETVL_VTYPE_CHANGE_ONLY) - return gen_vsetvl_vtype_change_only (sew, vlmul, ta, ma); - else - return gen_vsetvl_discard_result (Pmode, avl, sew, vlmul, ta, ma); + const vlmul_type ALL_LMUL[] + = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2}; + for (const vlmul_type vlmul : ALL_LMUL) + if (calculate_ratio (sew, vlmul) == ratio) + return vlmul; + return LMUL_RESERVED; } -static rtx -gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info, - rtx vl = NULL_RTX) +/* Get the currently supported maximum sew used in the int rvv instructions. 
*/ +static uint8_t +get_max_int_sew () { - rtx new_pat; - vl_vtype_info new_info = info; - if (info.get_insn () && info.get_insn ()->rtl () - && fault_first_load_p (info.get_insn ()->rtl ())) - new_info.set_avl_info ( - avl_info (get_avl (info.get_insn ()->rtl ()), nullptr)); - if (vl) - new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, vl); - else - { - if (vsetvl_insn_p (rinsn)) - new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, get_vl (rinsn)); - else if (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only) - new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, new_info, NULL_RTX); - else - new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, new_info, NULL_RTX); - } - return new_pat; + if (TARGET_VECTOR_ELEN_64) + return 64; + else if (TARGET_VECTOR_ELEN_32) + return 32; + gcc_unreachable (); } -static void -emit_vsetvl_insn (enum vsetvl_type insn_type, enum emit_type emit_type, - const vl_vtype_info &info, rtx vl, rtx_insn *rinsn) -{ - rtx pat = gen_vsetvl_pat (insn_type, info, vl); - if (dump_file) - { - fprintf (dump_file, "\nInsert vsetvl insn PATTERN:\n"); - print_rtl_single (dump_file, pat); - fprintf (dump_file, "\nfor insn:\n"); - print_rtl_single (dump_file, rinsn); - } - - if (emit_type == EMIT_DIRECT) - emit_insn (pat); - else if (emit_type == EMIT_BEFORE) - emit_insn_before (pat, rinsn); - else - emit_insn_after (pat, rinsn); +/* Get the currently supported maximum sew used in the float rvv instructions. + */ +static uint8_t +get_max_float_sew () +{ + if (TARGET_VECTOR_ELEN_FP_64) + return 64; + else if (TARGET_VECTOR_ELEN_FP_32) + return 32; + else if (TARGET_VECTOR_ELEN_FP_16) + return 16; + gcc_unreachable (); } -static void -eliminate_insn (rtx_insn *rinsn) -{ - if (dump_file) - { - fprintf (dump_file, "\nEliminate insn %d:\n", INSN_UID (rinsn)); - print_rtl_single (dump_file, rinsn); - } - if (in_sequence_p ()) - remove_insn (rinsn); - else - delete_insn (rinsn); -} - -static vsetvl_type -insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn, - const vector_insn_info &info, const vector_insn_info &prev_info) +/* Count the number of REGNO in RINSN. */ +static int +count_regno_occurrences (rtx_insn *rinsn, unsigned int regno) { - /* Use X0, X0 form if the AVL is the same and the SEW+LMUL gives the same - VLMAX. */ - if (prev_info.valid_or_dirty_p () && !prev_info.unknown_p () - && info.compatible_avl_p (prev_info) && info.same_vlmax_p (prev_info)) - { - emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX, - rinsn); - return VSETVL_VTYPE_CHANGE_ONLY; - } - - if (info.has_avl_imm ()) - { - emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX, - rinsn); - return VSETVL_DISCARD_RESULT; - } - - if (info.has_avl_no_reg ()) - { - /* We can only use x0, x0 if there's no chance of the vtype change causing - the previous vl to become invalid. */ - if (prev_info.valid_or_dirty_p () && !prev_info.unknown_p () - && info.same_vlmax_p (prev_info)) - { - emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX, - rinsn); - return VSETVL_VTYPE_CHANGE_ONLY; - } - /* Otherwise use an AVL of 0 to avoid depending on previous vl. */ - vl_vtype_info new_info = info; - new_info.set_avl_info (avl_info (const0_rtx, nullptr)); - emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, new_info, NULL_RTX, - rinsn); - return VSETVL_DISCARD_RESULT; - } - - /* Use X0 as the DestReg unless AVLReg is X0. We also need to change the - opcode if the AVLReg is X0 as they have different register classes for - the AVL operand. 
*/ - if (vlmax_avl_p (info.get_avl ())) - { - gcc_assert (has_vtype_op (rinsn) || vsetvl_insn_p (rinsn)); - /* For user vsetvli a5, zero, we should use get_vl to get the VL - operand "a5". */ - rtx vl_op = info.get_avl_or_vl_reg (); - gcc_assert (!vlmax_avl_p (vl_op)); - emit_vsetvl_insn (VSETVL_NORMAL, emit_type, info, vl_op, rinsn); - return VSETVL_NORMAL; - } - - emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX, rinsn); - - if (dump_file) - { - fprintf (dump_file, "Update VL/VTYPE info, previous info="); - prev_info.dump (dump_file); - } - return VSETVL_DISCARD_RESULT; + int count = 0; + extract_insn (rinsn); + for (int i = 0; i < recog_data.n_operands; i++) + if (refers_to_regno_p (regno, recog_data.operand[i])) + count++; + return count; } -/* Get VL/VTYPE information for INSN. */ -static vl_vtype_info -get_vl_vtype_info (const insn_info *insn) +enum def_type { - set_info *set = nullptr; - rtx avl = ::get_avl (insn->rtl ()); - if (avl && REG_P (avl)) - { - if (vlmax_avl_p (avl) && has_vl_op (insn->rtl ())) - set - = find_access (insn->uses (), REGNO (get_vl (insn->rtl ())))->def (); - else if (!vlmax_avl_p (avl)) - set = find_access (insn->uses (), REGNO (avl))->def (); - else - set = nullptr; - } - - uint8_t sew = get_sew (insn->rtl ()); - enum vlmul_type vlmul = get_vlmul (insn->rtl ()); - uint8_t ratio = get_attr_ratio (insn->rtl ()); - /* when get_attr_ratio is invalid, this kind of instructions - doesn't care about ratio. However, we still need this value - in demand info backward analysis. */ - if (ratio == INVALID_ATTRIBUTE) - ratio = calculate_ratio (sew, vlmul); - bool ta = tail_agnostic_p (insn->rtl ()); - bool ma = mask_agnostic_p (insn->rtl ()); - - /* If merge operand is undef value, we prefer agnostic. */ - int merge_op_idx = get_attr_merge_op_idx (insn->rtl ()); - if (merge_op_idx != INVALID_ATTRIBUTE - && satisfies_constraint_vu (recog_data.operand[merge_op_idx])) - { - ta = true; - ma = true; - } - - vl_vtype_info info (avl_info (avl, set), sew, vlmul, ratio, ta, ma); - return info; -} + REAL_SET = 1 << 0, + PHI_SET = 1 << 1, + BB_HEAD_SET = 1 << 2, + BB_END_SET = 1 << 3, + /* ??? TODO: In RTL_SSA framework, we have REAL_SET, + PHI_SET, BB_HEAD_SET, BB_END_SET and + CLOBBER_DEF def_info types. Currently, + we conservatively do not optimize clobber + def since we don't see the case that we + need to optimize it. */ + CLOBBER_DEF = 1 << 4 +}; -/* Change insn and Assert the change always happens. 
*/ -static void -validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group) +static bool +insn_should_be_added_p (const insn_info *insn, unsigned int types) { - bool change_p = validate_change (object, loc, new_rtx, in_group); - gcc_assert (change_p); + if (insn->is_real () && (types & REAL_SET)) + return true; + if (insn->is_phi () && (types & PHI_SET)) + return true; + if (insn->is_bb_head () && (types & BB_HEAD_SET)) + return true; + if (insn->is_bb_end () && (types & BB_END_SET)) + return true; + return false; } -static void -change_insn (rtx_insn *rinsn, rtx new_pat) +static const hash_set +get_all_real_uses (insn_info *insn, unsigned regno) { - /* We don't apply change on RTL_SSA here since it's possible a - new INSN we add in the PASS before which doesn't have RTL_SSA - info yet.*/ - if (dump_file) - { - fprintf (dump_file, "\nChange PATTERN of insn %d from:\n", - INSN_UID (rinsn)); - print_rtl_single (dump_file, PATTERN (rinsn)); - } + gcc_assert (insn->is_real ()); - validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false); + hash_set uses; + auto_vec work_list; + hash_set visited_list; - if (dump_file) + for (def_info *def : insn->defs ()) { - fprintf (dump_file, "\nto:\n"); - print_rtl_single (dump_file, PATTERN (rinsn)); + if (!def->is_reg () || def->regno () != regno) + continue; + set_info *set = safe_dyn_cast (def); + if (!set) + continue; + for (use_info *use : set->nondebug_insn_uses ()) + if (use->insn ()->is_real ()) + uses.add (use); + for (use_info *use : set->phi_uses ()) + work_list.safe_push (use->phi ()); } -} -static const insn_info * -get_forward_read_vl_insn (const insn_info *insn) -{ - const bb_info *bb = insn->bb (); - for (const insn_info *i = insn->next_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) + while (!work_list.is_empty ()) { - if (find_access (i->defs (), VL_REGNUM)) - return nullptr; - if (read_vl_insn_p (i->rtl ())) - return i; - } - return nullptr; -} + phi_info *phi = work_list.pop (); + visited_list.add (phi); -static const insn_info * -get_backward_fault_first_load_insn (const insn_info *insn) -{ - const bb_info *bb = insn->bb (); - for (const insn_info *i = insn->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ()) - { - if (fault_first_load_p (i->rtl ())) - return i; - if (find_access (i->defs (), VL_REGNUM)) - return nullptr; + for (use_info *use : phi->nondebug_insn_uses ()) + if (use->insn ()->is_real ()) + uses.add (use); + for (use_info *use : phi->phi_uses ()) + if (!visited_list.contains (use->phi ())) + work_list.safe_push (use->phi ()); } - return nullptr; + return uses; } -static bool -change_insn (function_info *ssa, insn_change change, insn_info *insn, - rtx new_pat) +/* Recursively find all define instructions. The kind of instruction is + specified by the DEF_TYPE. */ +static hash_set +get_all_sets (phi_info *phi, unsigned int types) { - rtx_insn *rinsn = insn->rtl (); - auto attempt = ssa->new_change_attempt (); - if (!restrict_movement (change)) - return false; + hash_set insns; + auto_vec work_list; + hash_set visited_list; + if (!phi) + return hash_set (); + work_list.safe_push (phi); - if (dump_file) + while (!work_list.is_empty ()) { - fprintf (dump_file, "\nChange PATTERN of insn %d from:\n", - INSN_UID (rinsn)); - print_rtl_single (dump_file, PATTERN (rinsn)); - } - - insn_change_watermark watermark; - validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, true); - - /* These routines report failures themselves. 
*/ - if (!recog (attempt, change) || !change_is_worthwhile (change, false)) - return false; + phi_info *phi = work_list.pop (); + visited_list.add (phi); + for (use_info *use : phi->inputs ()) + { + def_info *def = use->def (); + set_info *set = safe_dyn_cast (def); + if (!set) + return hash_set (); - /* Fix bug: - (insn 12 34 13 2 (set (reg:RVVM4DI 120 v24 [orig:134 _1 ] [134]) - (if_then_else:RVVM4DI (unspec:RVVMF8BI [ - (const_vector:RVVMF8BI repeat [ - (const_int 1 [0x1]) - ]) - (const_int 0 [0]) - (const_int 2 [0x2]) repeated x2 - (const_int 0 [0]) - (reg:SI 66 vl) - (reg:SI 67 vtype) - ] UNSPEC_VPREDICATE) - (plus:RVVM4DI (reg/v:RVVM4DI 104 v8 [orig:137 op1 ] [137]) - (sign_extend:RVVM4DI (vec_duplicate:RVVM4SI (reg:SI 15 a5 - [140])))) (unspec:RVVM4DI [ (const_int 0 [0]) ] UNSPEC_VUNDEF))) - "rvv.c":8:12 2784 {pred_single_widen_addsvnx8di_scalar} (expr_list:REG_EQUIV - (mem/c:RVVM4DI (reg:DI 10 a0 [142]) [1 +0 S[64, 64] A128]) - (expr_list:REG_EQUAL (if_then_else:RVVM4DI (unspec:RVVMF8BI [ - (const_vector:RVVMF8BI repeat [ - (const_int 1 [0x1]) - ]) - (reg/v:DI 13 a3 [orig:139 vl ] [139]) - (const_int 2 [0x2]) repeated x2 - (const_int 0 [0]) - (reg:SI 66 vl) - (reg:SI 67 vtype) - ] UNSPEC_VPREDICATE) - (plus:RVVM4DI (reg/v:RVVM4DI 104 v8 [orig:137 op1 ] [137]) - (const_vector:RVVM4DI repeat [ - (const_int 2730 [0xaaa]) - ])) - (unspec:RVVM4DI [ - (const_int 0 [0]) - ] UNSPEC_VUNDEF)) - (nil)))) - Here we want to remove use "a3". However, the REG_EQUAL/REG_EQUIV note use - "a3" which made us fail in change_insn. We reference to the - 'aarch64-cc-fusion.cc' and add this method. */ - remove_reg_equal_equiv_notes (rinsn); - confirm_change_group (); - ssa->change_insn (change); + gcc_assert (!set->insn ()->is_debug_insn ()); - if (dump_file) - { - fprintf (dump_file, "\nto:\n"); - print_rtl_single (dump_file, PATTERN (rinsn)); + if (insn_should_be_added_p (set->insn (), types)) + insns.add (set); + if (set->insn ()->is_phi ()) + { + phi_info *new_phi = as_a (set); + if (!visited_list.contains (new_phi)) + work_list.safe_push (new_phi); + } + } } - return true; + return insns; } -static void -change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info, - rtx vl = NULL_RTX) +static hash_set +get_all_sets (set_info *set, bool /* get_real_inst */ real_p, + bool /*get_phi*/ phi_p, bool /* get_function_parameter*/ param_p) { - rtx_insn *rinsn; - if (vector_config_insn_p (insn->rtl ())) - { - rinsn = insn->rtl (); - gcc_assert (vsetvl_insn_p (rinsn) && "Can't handle X0, rs1 vsetvli yet"); - } - else - { - gcc_assert (has_vtype_op (insn->rtl ())); - rinsn = PREV_INSN (insn->rtl ()); - gcc_assert (vector_config_insn_p (rinsn)); - } - rtx new_pat = gen_vsetvl_pat (rinsn, info, vl); - change_insn (rinsn, new_pat); -} + if (real_p && phi_p && param_p) + return get_all_sets (safe_dyn_cast (set), + REAL_SET | PHI_SET | BB_HEAD_SET | BB_END_SET); -static bool -avl_source_has_vsetvl_p (set_info *avl_source) -{ - if (!avl_source) - return false; - if (!avl_source->insn ()) - return false; - if (avl_source->insn ()->is_real ()) - return vsetvl_insn_p (avl_source->insn ()->rtl ()); - hash_set sets = get_all_sets (avl_source, true, false, true); - for (const auto set : sets) - { - if (set->insn ()->is_real () && vsetvl_insn_p (set->insn ()->rtl ())) - return true; - } - return false; + else if (real_p && param_p) + return get_all_sets (safe_dyn_cast (set), + REAL_SET | BB_HEAD_SET | BB_END_SET); + + else if (real_p) + return get_all_sets (safe_dyn_cast (set), REAL_SET); + return hash_set (); } 
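[The get_all_real_uses and get_all_sets helpers above both rely on the same
worklist-plus-visited-set walk to chase uses and definitions through RTL-SSA
phi nodes, including phis whose inputs eventually lead back to themselves
around loops.  The standalone C++ sketch below shows that traversal pattern in
isolation on a toy graph; the types and names in it are invented for
illustration only and are not part of this pass or of the RTL-SSA API.]

#include <cstdio>
#include <set>
#include <vector>

/* Toy stand-in for the RTL-SSA walk: node I is a "real" definition when
   REAL[I] is true, otherwise it acts like a phi whose inputs are INPUTS[I].
   Collect every real definition reachable through phi nodes, mirroring the
   work_list/visited_list structure of get_all_sets above.  */
static std::set<int>
collect_real_defs (int start, const std::vector<bool> &real,
		   const std::vector<std::vector<int>> &inputs)
{
  std::set<int> result;
  std::set<int> visited;
  std::vector<int> work_list = {start};
  while (!work_list.empty ())
    {
      int node = work_list.back ();
      work_list.pop_back ();
      if (!visited.insert (node).second)
	continue;		/* Already seen: loop phis terminate here.  */
      if (real[node])
	result.insert (node);	/* A real definition: record it.  */
      else
	for (int input : inputs[node])
	  work_list.push_back (input);	/* A phi: queue its inputs.  */
    }
  return result;
}

int
main ()
{
  /* Node 0 is a phi of {1, 2}; node 2 is a phi of {0, 3}, i.e. a
     loop-carried cycle back to node 0; nodes 1 and 3 are real defs.  */
  std::vector<bool> real = {false, true, false, true};
  std::vector<std::vector<int>> inputs = {{1, 2}, {}, {0, 3}, {}};
  for (int def : collect_real_defs (0, real, inputs))
    std::printf ("real def: %d\n", def);
  return 0;
}

[The visited set is what guarantees termination once a loop phi's inputs lead
back to an already-processed phi, which is exactly why both helpers above
carry a visited_list alongside their work_list.]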
static bool @@ -959,93 +619,14 @@ source_equal_p (insn_info *insn1, insn_info *insn2) rtx_insn *rinsn2 = insn2->rtl (); if (!rinsn1 || !rinsn2) return false; + rtx note1 = find_reg_equal_equiv_note (rinsn1); rtx note2 = find_reg_equal_equiv_note (rinsn2); - rtx single_set1 = single_set (rinsn1); - rtx single_set2 = single_set (rinsn2); - if (read_vl_insn_p (rinsn1) && read_vl_insn_p (rinsn2)) - { - const insn_info *load1 = get_backward_fault_first_load_insn (insn1); - const insn_info *load2 = get_backward_fault_first_load_insn (insn2); - return load1 && load2 && load1 == load2; - } - if (note1 && note2 && rtx_equal_p (note1, note2)) return true; - - /* Since vsetvl instruction is not single SET. - We handle this case specially here. */ - if (vsetvl_insn_p (insn1->rtl ()) && vsetvl_insn_p (insn2->rtl ())) - { - /* For example: - vsetvl1 a6,a5,e32m1 - RVV 1 (use a6 as AVL) - vsetvl2 a5,a5,e8mf4 - RVV 2 (use a5 as AVL) - We consider AVL of RVV 1 and RVV 2 are same so that we can - gain more optimization opportunities. - - Note: insn1_info.compatible_avl_p (insn2_info) - will make sure there is no instruction between vsetvl1 and vsetvl2 - modify a5 since their def will be different if there is instruction - modify a5 and compatible_avl_p will return false. */ - vector_insn_info insn1_info, insn2_info; - insn1_info.parse_insn (insn1); - insn2_info.parse_insn (insn2); - - /* To avoid dead loop, we don't optimize a vsetvli def has vsetvli - instructions which will complicate the situation. */ - if (avl_source_has_vsetvl_p (insn1_info.get_avl_source ()) - || avl_source_has_vsetvl_p (insn2_info.get_avl_source ())) - return false; - - if (insn1_info.same_vlmax_p (insn2_info) - && insn1_info.compatible_avl_p (insn2_info)) - return true; - } - - /* We only handle AVL is set by instructions with no side effects. */ - if (!single_set1 || !single_set2) - return false; - if (!rtx_equal_p (SET_SRC (single_set1), SET_SRC (single_set2))) - return false; - /* RTL_SSA uses include REG_NOTE. Consider this following case: - - insn1 RTL: - (insn 41 39 42 4 (set (reg:DI 26 s10 [orig:159 loop_len_46 ] [159]) - (umin:DI (reg:DI 15 a5 [orig:201 _149 ] [201]) - (reg:DI 14 a4 [276]))) 408 {*umindi3} - (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:201 _149 ] [201]) - (const_int 2 [0x2])) - (nil))) - The RTL_SSA uses of this instruction has 2 uses: - 1. (reg:DI 15 a5 [orig:201 _149 ] [201]) - twice. - 2. (reg:DI 14 a4 [276]) - once. - - insn2 RTL: - (insn 38 353 351 4 (set (reg:DI 27 s11 [orig:160 loop_len_47 ] [160]) - (umin:DI (reg:DI 15 a5 [orig:199 _146 ] [199]) - (reg:DI 14 a4 [276]))) 408 {*umindi3} - (expr_list:REG_EQUAL (umin:DI (reg:DI 28 t3 [orig:200 ivtmp_147 ] [200]) - (const_int 2 [0x2])) - (nil))) - The RTL_SSA uses of this instruction has 3 uses: - 1. (reg:DI 15 a5 [orig:199 _146 ] [199]) - once - 2. (reg:DI 14 a4 [276]) - once - 3. (reg:DI 28 t3 [orig:200 ivtmp_147 ] [200]) - once - - Return false when insn1->uses ().size () != insn2->uses ().size () - */ - if (insn1->uses ().size () != insn2->uses ().size ()) - return false; - for (size_t i = 0; i < insn1->uses ().size (); i++) - if (insn1->uses ()[i] != insn2->uses ()[i]) - return false; - return true; + return false; } -/* Helper function to get single same real RTL source. - return NULL if it is not a single real RTL source. */ static insn_info * extract_single_source (set_info *set) { @@ -1066,2068 +647,1932 @@ extract_single_source (set_info *set) NULL so that VSETVL PASS will insert vsetvl directly. 
*/ if (set->insn ()->is_artificial ()) return nullptr; - if (!source_equal_p (set->insn (), first_insn)) + if (set != *sets.begin () && !source_equal_p (set->insn (), first_insn)) return nullptr; } return first_insn; } -static unsigned -calculate_sew (vlmul_type vlmul, unsigned int ratio) +static bool +same_equiv_note_p (set_info *set1, set_info *set2) { - for (const unsigned sew : ALL_SEW) - if (calculate_ratio (sew, vlmul) == ratio) - return sew; - return 0; + insn_info *insn1 = extract_single_source (set1); + insn_info *insn2 = extract_single_source (set2); + if (!insn1 || !insn2) + return false; + return source_equal_p (insn1, insn2); } -static vlmul_type -calculate_vlmul (unsigned int sew, unsigned int ratio) +static unsigned +get_expr_id (unsigned bb_index, unsigned regno, unsigned num_bbs) { - for (const vlmul_type vlmul : ALL_LMUL) - if (calculate_ratio (sew, vlmul) == ratio) - return vlmul; - return LMUL_RESERVED; + return regno * num_bbs + bb_index; } - -static bool -incompatible_avl_p (const vector_insn_info &info1, - const vector_insn_info &info2) +static unsigned +get_regno (unsigned expr_id, unsigned num_bb) { - return !info1.compatible_avl_p (info2) && !info2.compatible_avl_p (info1); + return expr_id / num_bb; } - -static bool -different_sew_p (const vector_insn_info &info1, const vector_insn_info &info2) +static unsigned +get_bb_index (unsigned expr_id, unsigned num_bb) { - return info1.get_sew () != info2.get_sew (); + return expr_id % num_bb; } +/* Return true if the SET result is not used by any instructions. */ static bool -different_lmul_p (const vector_insn_info &info1, const vector_insn_info &info2) +has_no_uses (basic_block cfg_bb, rtx_insn *rinsn, int regno) { - return info1.get_vlmul () != info2.get_vlmul (); -} + if (bitmap_bit_p (df_get_live_out (cfg_bb), regno)) + return false; -static bool -different_ratio_p (const vector_insn_info &info1, const vector_insn_info &info2) -{ - return info1.get_ratio () != info2.get_ratio (); -} + rtx_insn *iter; + for (iter = NEXT_INSN (rinsn); iter && iter != NEXT_INSN (BB_END (cfg_bb)); + iter = NEXT_INSN (iter)) + if (df_find_use (iter, regno_reg_rtx[regno])) + return false; -static bool -different_tail_policy_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return info1.get_ta () != info2.get_ta (); + return true; } -static bool -different_mask_policy_p (const vector_insn_info &info1, - const vector_insn_info &info2) +/* Change insn and Assert the change always happens. */ +static void +validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group) { - return info1.get_ma () != info2.get_ma (); + bool change_p = validate_change (object, loc, new_rtx, in_group); + gcc_assert (change_p); } -static bool -possible_zero_avl_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return !info1.has_non_zero_avl () || !info2.has_non_zero_avl (); -} +/* This flags indicates the minimum demand of the vl and vtype values by the + RVV instruction. For example, DEMAND_RATIO_P indicates that this RVV + instruction only needs the SEW/LMUL ratio to remain the same, and does not + require SEW and LMUL to be fixed. 
+ Therefore, if the former RVV instruction needs DEMAND_RATIO_P and the latter + instruction needs DEMAND_SEW_LMUL_P and its SEW/LMUL is the same as that of + the former instruction, then we can make the minimu demand of the former + instruction strict to DEMAND_SEW_LMUL_P, and its required SEW and LMUL are + the SEW and LMUL of the latter instruction, and the vsetvl instruction + generated according to the new demand can also be used for the latter + instruction, so there is no need to insert a separate vsetvl instruction for + the latter instruction. */ +enum demand_flags : unsigned +{ + DEMAND_EMPTY_P = 0, + DEMAND_SEW_P = 1 << 0, + DEMAND_LMUL_P = 1 << 1, + DEMAND_RATIO_P = 1 << 2, + DEMAND_GE_SEW_P = 1 << 3, + DEMAND_TAIL_POLICY_P = 1 << 4, + DEMAND_MASK_POLICY_P = 1 << 5, + DEMAND_AVL_P = 1 << 6, + DEMAND_NON_ZERO_AVL_P = 1 << 7, +}; -static bool -second_ratio_invalid_for_first_sew_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_vlmul (info1.get_sew (), info2.get_ratio ()) - == LMUL_RESERVED; -} +/* We split the demand information into three parts. They are sew and lmul + related (sew_lmul_demand_type), tail and mask policy related + (policy_demand_type) and avl related (avl_demand_type). Then we define three + interfaces avaiable_with, compatible_p and merge. avaiable_with is + used to determine whether the two vsetvl infos prev_info and next_info are + available or not. If prev_info is available for next_info, it means that the + RVV insn corresponding to next_info on the path from prev_info to next_info + can be used without inserting a separate vsetvl instruction. compatible_p + is used to determine whether prev_info is compatible with next_info, and if + so, merge can be used to merge the stricter demand information from + next_info into prev_info so that prev_info becomes available to next_info. 
+ */ -static bool -second_ratio_invalid_for_first_lmul_p (const vector_insn_info &info1, - const vector_insn_info &info2) +enum class sew_lmul_demand_type : unsigned { - return calculate_sew (info1.get_vlmul (), info2.get_ratio ()) == 0; -} + sew_lmul = demand_flags::DEMAND_SEW_P | demand_flags::DEMAND_LMUL_P, + ratio_only = demand_flags::DEMAND_RATIO_P, + sew_only = demand_flags::DEMAND_SEW_P, + ge_sew = demand_flags::DEMAND_GE_SEW_P, + ratio_and_ge_sew + = demand_flags::DEMAND_RATIO_P | demand_flags::DEMAND_GE_SEW_P, +}; -static bool -float_insn_valid_sew_p (const vector_insn_info &info, unsigned int sew) +enum class policy_demand_type : unsigned { - if (info.get_insn () && info.get_insn ()->is_real () - && get_attr_type (info.get_insn ()->rtl ()) == TYPE_VFMOVFV) - { - if (sew == 16) - return TARGET_VECTOR_ELEN_FP_16; - else if (sew == 32) - return TARGET_VECTOR_ELEN_FP_32; - else if (sew == 64) - return TARGET_VECTOR_ELEN_FP_64; - } - return true; -} + tail_mask_policy + = demand_flags::DEMAND_TAIL_POLICY_P | demand_flags::DEMAND_MASK_POLICY_P, + tail_policy_only = demand_flags::DEMAND_TAIL_POLICY_P, + mask_policy_only = demand_flags::DEMAND_MASK_POLICY_P, + ignore_policy = demand_flags::DEMAND_EMPTY_P, +}; -static bool -second_sew_less_than_first_sew_p (const vector_insn_info &info1, - const vector_insn_info &info2) +enum class avl_demand_type : unsigned { - return info2.get_sew () < info1.get_sew () - || !float_insn_valid_sew_p (info1, info2.get_sew ()); -} + avl = demand_flags::DEMAND_AVL_P, + non_zero_avl = demand_flags::DEMAND_NON_ZERO_AVL_P, + ignore_avl = demand_flags::DEMAND_EMPTY_P, +}; -static bool -first_sew_less_than_second_sew_p (const vector_insn_info &info1, - const vector_insn_info &info2) +class vsetvl_info { - return info1.get_sew () < info2.get_sew () - || !float_insn_valid_sew_p (info2, info1.get_sew ()); -} +private: + insn_info *m_insn; + bb_info *m_bb; + rtx m_avl; + rtx m_vl; + set_info *m_avl_def; + uint8_t m_sew; + uint8_t m_max_sew; + vlmul_type m_vlmul; + uint8_t m_ratio; + bool m_ta; + bool m_ma; + + sew_lmul_demand_type m_sew_lmul_demand; + policy_demand_type m_policy_demand; + avl_demand_type m_avl_demand; + + enum class state_type + { + UNINITIALIZED, + VALID, + UNKNOWN, + EMPTY, + }; + state_type m_state; + + bool m_delete; + bool m_change_vtype_only; + insn_info *m_read_vl_insn; + bool m_vl_used_by_non_rvv_insn; -/* return 0 if LMUL1 == LMUL2. - return -1 if LMUL1 < LMUL2. - return 1 if LMUL1 > LMUL2. 
*/ -static int -compare_lmul (vlmul_type vlmul1, vlmul_type vlmul2) -{ - if (vlmul1 == vlmul2) - return 0; +public: + vsetvl_info () + : m_insn (nullptr), m_bb (nullptr), m_avl (NULL_RTX), m_vl (NULL_RTX), + m_avl_def (nullptr), m_sew (0), m_max_sew (0), m_vlmul (LMUL_RESERVED), + m_ratio (0), m_ta (false), m_ma (false), + m_sew_lmul_demand (sew_lmul_demand_type::sew_lmul), + m_policy_demand (policy_demand_type::tail_mask_policy), + m_avl_demand (avl_demand_type::avl), m_state (state_type::UNINITIALIZED), + m_delete (false), m_change_vtype_only (false), m_read_vl_insn (nullptr), + m_vl_used_by_non_rvv_insn (false) + {} + + vsetvl_info (insn_info *insn) : vsetvl_info () { parse_insn (insn); } + + vsetvl_info (rtx_insn *insn) : vsetvl_info () { parse_insn (insn); } + + void set_avl (rtx avl) { m_avl = avl; } + void set_vl (rtx vl) { m_vl = vl; } + void set_avl_def (set_info *avl_def) { m_avl_def = avl_def; } + void set_sew (uint8_t sew) { m_sew = sew; } + void set_vlmul (vlmul_type vlmul) { m_vlmul = vlmul; } + void set_ratio (uint8_t ratio) { m_ratio = ratio; } + void set_ta (bool ta) { m_ta = ta; } + void set_ma (bool ma) { m_ma = ma; } + void set_delete () { m_delete = true; } + void set_bb (bb_info *bb) { m_bb = bb; } + void set_max_sew (uint8_t max_sew) { m_max_sew = max_sew; } + void set_change_vtype_only () { m_change_vtype_only = true; } + void set_read_vl_insn (insn_info *insn) { m_read_vl_insn = insn; } + + rtx get_avl () const { return m_avl; } + rtx get_vl () const { return m_vl; } + set_info *get_avl_def () const { return m_avl_def; } + uint8_t get_sew () const { return m_sew; } + vlmul_type get_vlmul () const { return m_vlmul; } + uint8_t get_ratio () const { return m_ratio; } + bool get_ta () const { return m_ta; } + bool get_ma () const { return m_ma; } + insn_info *get_insn () const { return m_insn; } + bool delete_p () const { return m_delete; } + bb_info *get_bb () const { return m_bb; } + uint8_t get_max_sew () const { return m_max_sew; } + insn_info *get_read_vl_insn () const { return m_read_vl_insn; } + bool vl_use_by_non_rvv_insn_p () const { return m_vl_used_by_non_rvv_insn; } + + bool has_imm_avl () const { return m_avl && CONST_INT_P (m_avl); } + bool has_vlmax_avl () const { return vlmax_avl_p (m_avl); } + bool has_nonvlmax_reg_avl () const + { + return m_avl && REG_P (m_avl) && !has_vlmax_avl (); + } + bool has_non_zero_avl () const + { + if (has_imm_avl ()) + return INTVAL (m_avl) > 0; + return has_vlmax_avl (); + } + bool has_vl () const + { + /* The VL operand can only be either a NULL_RTX or a register. */ + gcc_assert (!m_vl || REG_P (m_vl)); + return m_vl != NULL_RTX; + } + bool has_same_ratio (const vsetvl_info &other) const + { + return get_ratio () == other.get_ratio (); + } + + /* The block of INSN isn't always same as the block of the VSETVL_INFO, + meaning we may have 'get_insn ()->bb () != get_bb ()'. + + E.g. BB 2 (Empty) ---> BB 3 (VALID, has rvv insn 1) + + BB 2 has empty VSETVL_INFO, wheras BB 3 has VSETVL_INFO that satisfies + get_insn ()->bb () == get_bb (). In earliest fusion, we may fuse bb 3 and + bb 2 so that the 'get_bb ()' of BB2 VSETVL_INFO will be BB2 wheras the + 'get_insn ()' of BB2 VSETVL INFO will be the rvv insn 1 (which is located + at BB3). 
*/ + bool insn_inside_bb_p () const { return get_insn ()->bb () == get_bb (); } + void update_avl (const vsetvl_info &other) + { + m_avl = other.get_avl (); + m_vl = other.get_vl (); + m_avl_def = other.get_avl_def (); + } + + bool uninit_p () const { return m_state == state_type::UNINITIALIZED; } + bool valid_p () const { return m_state == state_type::VALID; } + bool unknown_p () const { return m_state == state_type::UNKNOWN; } + bool empty_p () const { return m_state == state_type::EMPTY; } + bool change_vtype_only_p () const { return m_change_vtype_only; } + + void set_valid () { m_state = state_type::VALID; } + void set_unknown () { m_state = state_type::UNKNOWN; } + void set_empty () { m_state = state_type::EMPTY; } + + void set_sew_lmul_demand (sew_lmul_demand_type demand) + { + m_sew_lmul_demand = demand; + } + void set_policy_demand (policy_demand_type demand) + { + m_policy_demand = demand; + } + void set_avl_demand (avl_demand_type demand) { m_avl_demand = demand; } + + sew_lmul_demand_type get_sew_lmul_demand () const + { + return m_sew_lmul_demand; + } + policy_demand_type get_policy_demand () const { return m_policy_demand; } + avl_demand_type get_avl_demand () const { return m_avl_demand; } + + void normalize_demand (unsigned demand_flags) + { + switch (demand_flags + & (DEMAND_SEW_P | DEMAND_LMUL_P | DEMAND_RATIO_P | DEMAND_GE_SEW_P)) + { + case (unsigned) sew_lmul_demand_type::sew_lmul: + m_sew_lmul_demand = sew_lmul_demand_type::sew_lmul; + break; + case (unsigned) sew_lmul_demand_type::ratio_only: + m_sew_lmul_demand = sew_lmul_demand_type::ratio_only; + break; + case (unsigned) sew_lmul_demand_type::sew_only: + m_sew_lmul_demand = sew_lmul_demand_type::sew_only; + break; + case (unsigned) sew_lmul_demand_type::ge_sew: + m_sew_lmul_demand = sew_lmul_demand_type::ge_sew; + break; + case (unsigned) sew_lmul_demand_type::ratio_and_ge_sew: + m_sew_lmul_demand = sew_lmul_demand_type::ratio_and_ge_sew; + break; + default: + gcc_unreachable (); + } + + switch (demand_flags & (DEMAND_TAIL_POLICY_P | DEMAND_MASK_POLICY_P)) + { + case (unsigned) policy_demand_type::tail_mask_policy: + m_policy_demand = policy_demand_type::tail_mask_policy; + break; + case (unsigned) policy_demand_type::tail_policy_only: + m_policy_demand = policy_demand_type::tail_policy_only; + break; + case (unsigned) policy_demand_type::mask_policy_only: + m_policy_demand = policy_demand_type::mask_policy_only; + break; + case (unsigned) policy_demand_type::ignore_policy: + m_policy_demand = policy_demand_type::ignore_policy; + break; + default: + gcc_unreachable (); + } + + switch (demand_flags & (DEMAND_AVL_P | DEMAND_NON_ZERO_AVL_P)) + { + case (unsigned) avl_demand_type::avl: + m_avl_demand = avl_demand_type::avl; + break; + case (unsigned) avl_demand_type::non_zero_avl: + m_avl_demand = avl_demand_type::non_zero_avl; + break; + case (unsigned) avl_demand_type::ignore_avl: + m_avl_demand = avl_demand_type::ignore_avl; + break; + default: + gcc_unreachable (); + } + } + + void parse_insn (rtx_insn *rinsn) + { + if (!NONDEBUG_INSN_P (rinsn)) + return; + if (optimize == 0 && !has_vtype_op (rinsn)) + return; + gcc_assert (!vsetvl_discard_result_insn_p (rinsn)); + set_valid (); + extract_insn_cached (rinsn); + m_avl = ::get_avl (rinsn); + if (has_vlmax_avl () || vsetvl_insn_p (rinsn)) + m_vl = ::get_vl (rinsn); + m_sew = ::get_sew (rinsn); + m_vlmul = ::get_vlmul (rinsn); + m_ta = tail_agnostic_p (rinsn); + m_ma = mask_agnostic_p (rinsn); + } + + void parse_insn (insn_info *insn) + { + m_insn = insn; + m_bb = 
insn->bb (); + /* Return if it is debug insn for the consistency with optimize == 0. */ + if (insn->is_debug_insn ()) + return; - switch (vlmul1) - { - case LMUL_1: - if (vlmul2 == LMUL_2 || vlmul2 == LMUL_4 || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_2: - if (vlmul2 == LMUL_4 || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_4: - if (vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_8: - return -1; - case LMUL_F2: - if (vlmul2 == LMUL_1 || vlmul2 == LMUL_2 || vlmul2 == LMUL_4 - || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_F4: - if (vlmul2 == LMUL_F2 || vlmul2 == LMUL_1 || vlmul2 == LMUL_2 - || vlmul2 == LMUL_4 || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_F8: - return 0; - default: - gcc_unreachable (); - } -} + /* We set it as unknown since we don't what will happen in CALL or ASM. */ + if (insn->is_call () || insn->is_asm ()) + { + set_unknown (); + return; + } + + /* If this is something that updates VL/VTYPE that we don't know about, set + the state to unknown. */ + if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ()) + && (find_access (insn->defs (), VL_REGNUM) + || find_access (insn->defs (), VTYPE_REGNUM))) + { + set_unknown (); + return; + } + + if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ())) + /* uninitialized */ + return; -static bool -second_lmul_less_than_first_lmul_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return compare_lmul (info2.get_vlmul (), info1.get_vlmul ()) == -1; -} + set_valid (); + + m_avl = ::get_avl (insn->rtl ()); + if (m_avl) + { + if (vsetvl_insn_p (insn->rtl ()) || has_vlmax_avl ()) + m_vl = ::get_vl (insn->rtl ()); + + if (has_nonvlmax_reg_avl ()) + m_avl_def = find_access (insn->uses (), REGNO (m_avl))->def (); + } + + m_sew = ::get_sew (insn->rtl ()); + m_vlmul = ::get_vlmul (insn->rtl ()); + m_ratio = get_attr_ratio (insn->rtl ()); + /* when get_attr_ratio is invalid, this kind of instructions + doesn't care about ratio. However, we still need this value + in demand info backward analysis. */ + if (m_ratio == INVALID_ATTRIBUTE) + m_ratio = calculate_ratio (m_sew, m_vlmul); + m_ta = tail_agnostic_p (insn->rtl ()); + m_ma = mask_agnostic_p (insn->rtl ()); + + /* If merge operand is undef value, we prefer agnostic. */ + int merge_op_idx = get_attr_merge_op_idx (insn->rtl ()); + if (merge_op_idx != INVALID_ATTRIBUTE + && satisfies_constraint_vu (recog_data.operand[merge_op_idx])) + { + m_ta = true; + m_ma = true; + } + + /* Determine the demand info of the RVV insn. */ + m_max_sew = get_max_int_sew (); + unsigned demand_flags = 0; + if (vector_config_insn_p (insn->rtl ())) + { + demand_flags |= demand_flags::DEMAND_AVL_P; + demand_flags |= demand_flags::DEMAND_RATIO_P; + } + else + { + if (has_vl_op (insn->rtl ())) + { + if (scalar_move_insn_p (insn->rtl ())) + { + /* If the avl for vmv.s.x comes from the vsetvl instruction, we + don't know if the avl is non-zero, so it is set to + DEMAND_AVL_P for now. it may be corrected to + DEMAND_NON_ZERO_AVL_P later when more information is + available. 
+ */ + if (has_non_zero_avl ()) + demand_flags |= demand_flags::DEMAND_NON_ZERO_AVL_P; + else + demand_flags |= demand_flags::DEMAND_AVL_P; + } + else + demand_flags |= demand_flags::DEMAND_AVL_P; + } -static bool -second_ratio_less_than_first_ratio_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return info2.get_ratio () < info1.get_ratio (); -} - -static CONSTEXPR const demands_cond incompatible_conds[] = { -#define DEF_INCOMPATIBLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, \ - GE_SEW1, TAIL_POLICTY1, MASK_POLICY1, AVL2, \ - SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, \ - TAIL_POLICTY2, MASK_POLICY2, COND) \ - {{{AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, TAIL_POLICTY1, \ - MASK_POLICY1}, \ - {AVL2, SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ - MASK_POLICY2}}, \ - COND}, -#include "riscv-vsetvl.def" -}; - -static unsigned -greatest_sew (const vector_insn_info &info1, const vector_insn_info &info2) -{ - return std::max (info1.get_sew (), info2.get_sew ()); -} - -static unsigned -first_sew (const vector_insn_info &info1, const vector_insn_info &) -{ - return info1.get_sew (); -} - -static unsigned -second_sew (const vector_insn_info &, const vector_insn_info &info2) -{ - return info2.get_sew (); -} - -static vlmul_type -first_vlmul (const vector_insn_info &info1, const vector_insn_info &) -{ - return info1.get_vlmul (); -} - -static vlmul_type -second_vlmul (const vector_insn_info &, const vector_insn_info &info2) -{ - return info2.get_vlmul (); -} - -static unsigned -first_ratio (const vector_insn_info &info1, const vector_insn_info &) -{ - return info1.get_ratio (); -} + if (get_attr_ratio (insn->rtl ()) != INVALID_ATTRIBUTE) + demand_flags |= demand_flags::DEMAND_RATIO_P; + else + { + if (scalar_move_insn_p (insn->rtl ()) && m_ta) + { + demand_flags |= demand_flags::DEMAND_GE_SEW_P; + m_max_sew = get_attr_type (insn->rtl ()) == TYPE_VFMOVFV + ? get_max_float_sew () + : get_max_int_sew (); + } + else + demand_flags |= demand_flags::DEMAND_SEW_P; + + if (!ignore_vlmul_insn_p (insn->rtl ())) + demand_flags |= demand_flags::DEMAND_LMUL_P; + } -static unsigned -second_ratio (const vector_insn_info &, const vector_insn_info &info2) -{ - return info2.get_ratio (); -} + if (!m_ta) + demand_flags |= demand_flags::DEMAND_TAIL_POLICY_P; + if (!m_ma) + demand_flags |= demand_flags::DEMAND_MASK_POLICY_P; + } + + normalize_demand (demand_flags); + + /* Optimize AVL from the vsetvl instruction. */ + insn_info *def_insn = extract_single_source (get_avl_def ()); + if (def_insn && vsetvl_insn_p (def_insn->rtl ())) + { + vsetvl_info def_info = vsetvl_info (def_insn); + if ((scalar_move_insn_p (insn->rtl ()) + || def_info.get_ratio () == get_ratio ()) + && (def_info.has_vlmax_avl () || def_info.has_imm_avl ())) + { + update_avl (def_info); + if (scalar_move_insn_p (insn->rtl ()) && has_non_zero_avl ()) + m_avl_demand = avl_demand_type::non_zero_avl; + } + } + + /* Determine if dest operand(vl) has been used by non-RVV instructions. 
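The scalar-move handling above is what the GE_SEW and NON_ZERO_AVL flags exist for: vmv.s.x/vfmv.s.f only write element 0, so under a tail-agnostic policy any configuration with a large enough SEW (bounded by m_max_sew) and a provably non-zero AVL will do. An illustrative sequence, invented for this note rather than taken from a testcase:

    /* Illustration only:

         vsetivli zero,4,e64,m1,ta,ma     earlier config: SEW 64, AVL 4 (non-zero)
         vle64.v  v8,(a0)
         vmv.s.x  v16,a1                  semantically an e32 scalar move

       Because the move is tail agnostic and demands only SEW >= 32 plus a
       non-zero AVL, the e64,m1 setting already covers it, so no extra vsetvl
       has to be inserted in front of it.  */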
*/ + if (has_vl ()) + { + const hash_set vl_uses + = get_all_real_uses (get_insn (), REGNO (get_vl ())); + for (use_info *use : vl_uses) + { + gcc_assert (use->insn ()->is_real ()); + rtx_insn *rinsn = use->insn ()->rtl (); + if (!has_vl_op (rinsn) + || count_regno_occurrences (rinsn, REGNO (get_vl ())) != 1) + { + m_vl_used_by_non_rvv_insn = true; + break; + } + rtx avl = ::get_avl (rinsn); + if (!avl || REGNO (get_vl ()) != REGNO (avl)) + { + m_vl_used_by_non_rvv_insn = true; + break; + } + } + } -static vlmul_type -vlmul_for_first_sew_second_ratio (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_vlmul (info1.get_sew (), info2.get_ratio ()); -} + /* Collect the read vl insn for the fault-only-first rvv loads. */ + if (fault_first_load_p (insn->rtl ())) + { + for (insn_info *i = insn->next_nondebug_insn (); + i->bb () == insn->bb (); i = i->next_nondebug_insn ()) + { + if (find_access (i->defs (), VL_REGNUM)) + break; + if (i->rtl () && read_vl_insn_p (i->rtl ())) + { + m_read_vl_insn = i; + break; + } + } + } + } + + /* Returns the corresponding vsetvl rtx pat. */ + rtx get_vsetvl_pat (bool ignore_vl = false) const + { + rtx avl = get_avl (); + /* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s, + set the value of avl to (const_int 0) so that VSETVL PASS will + insert vsetvl correctly.*/ + if (!get_avl ()) + avl = GEN_INT (0); + rtx sew = gen_int_mode (get_sew (), Pmode); + rtx vlmul = gen_int_mode (get_vlmul (), Pmode); + rtx ta = gen_int_mode (get_ta (), Pmode); + rtx ma = gen_int_mode (get_ma (), Pmode); + + if (change_vtype_only_p ()) + return gen_vsetvl_vtype_change_only (sew, vlmul, ta, ma); + else if (has_vl () && !ignore_vl) + return gen_vsetvl (Pmode, get_vl (), avl, sew, vlmul, ta, ma); + else + return gen_vsetvl_discard_result (Pmode, avl, sew, vlmul, ta, ma); + } + + bool operator== (const vsetvl_info &other) const + { + gcc_assert (!uninit_p () && !other.uninit_p () + && "Uninitialization should not happen"); + + if (empty_p ()) + return other.empty_p (); + if (unknown_p ()) + return other.unknown_p (); + + return get_insn () == other.get_insn () && get_bb () == other.get_bb () + && get_avl () == other.get_avl () && get_vl () == other.get_vl () + && get_avl_def () == other.get_avl_def () + && get_sew () == other.get_sew () + && get_vlmul () == other.get_vlmul () && get_ta () == other.get_ta () + && get_ma () == other.get_ma () + && get_avl_demand () == other.get_avl_demand () + && get_sew_lmul_demand () == other.get_sew_lmul_demand () + && get_policy_demand () == other.get_policy_demand (); + } + + void dump (FILE *file, const char *indent = "") const + { + if (uninit_p ()) + { + fprintf (file, "UNINITIALIZED.\n"); + return; + } + else if (unknown_p ()) + { + fprintf (file, "UNKNOWN.\n"); + return; + } + else if (empty_p ()) + { + fprintf (file, "EMPTY.\n"); + return; + } + else if (valid_p ()) + fprintf (file, "VALID (insn %u, bb %u)%s\n", get_insn ()->uid (), + get_bb ()->index (), delete_p () ? 
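The three patterns returned by get_vsetvl_pat map onto the three assembler shapes of the instruction; roughly (standard RVV assembly, shown only to make the mapping concrete):

    /* gen_vsetvl                   -> vsetvli t0, a0, e32,m1,ta,ma
                                       (also writes the resulting vl into t0)
       gen_vsetvl_discard_result    -> vsetvli zero, a0, e32,m1,ta,ma
                                       (sets vl/vtype, result register is x0)
       gen_vsetvl_vtype_change_only -> vsetvli zero, zero, e32,m1,ta,ma
                                       (keeps the current vl, only rewrites vtype)  */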
" (deleted)" : ""); + else + gcc_unreachable (); -static vlmul_type -vlmul_for_greatest_sew_second_ratio (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_vlmul (MAX (info1.get_sew (), info2.get_sew ()), - info2.get_ratio ()); -} + fprintf (file, "%sDemand fields:", indent); + if (m_sew_lmul_demand == sew_lmul_demand_type::sew_lmul) + fprintf (file, " demand_sew_lmul"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::ratio_only) + fprintf (file, " demand_ratio_only"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::sew_only) + fprintf (file, " demand_sew_only"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::ge_sew) + fprintf (file, " demand_ge_sew"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::ratio_and_ge_sew) + fprintf (file, " demand_ratio_and_ge_sew"); + + if (m_policy_demand == policy_demand_type::tail_mask_policy) + fprintf (file, " demand_tail_mask_policy"); + else if (m_policy_demand == policy_demand_type::tail_policy_only) + fprintf (file, " demand_tail_policy_only"); + else if (m_policy_demand == policy_demand_type::mask_policy_only) + fprintf (file, " demand_mask_policy_only"); + + if (m_avl_demand == avl_demand_type::avl) + fprintf (file, " demand_avl"); + else if (m_avl_demand == avl_demand_type::non_zero_avl) + fprintf (file, " demand_non_zero_avl"); + fprintf (file, "\n"); + + fprintf (file, "%sSEW=%d, ", indent, get_sew ()); + fprintf (file, "VLMUL=%s, ", vlmul_to_str (get_vlmul ())); + fprintf (file, "RATIO=%d, ", get_ratio ()); + fprintf (file, "MAX_SEW=%d\n", get_max_sew ()); + + fprintf (file, "%sTAIL_POLICY=%s, ", indent, policy_to_str (get_ta ())); + fprintf (file, "MASK_POLICY=%s\n", policy_to_str (get_ma ())); + + fprintf (file, "%sAVL=", indent); + print_rtl_single (file, get_avl ()); + fprintf (file, "%sVL=", indent); + print_rtl_single (file, get_vl ()); + if (change_vtype_only_p ()) + fprintf (file, "%schange vtype only\n", indent); + if (get_read_vl_insn ()) + fprintf (file, "%sread_vl_insn: insn %u\n", indent, + get_read_vl_insn ()->uid ()); + if (vl_use_by_non_rvv_insn_p ()) + fprintf (file, "%suse_by_non_rvv_insn=true\n", indent); + } +}; -static unsigned -ratio_for_second_sew_first_vlmul (const vector_insn_info &info1, - const vector_insn_info &info2) +class vsetvl_block_info { - return calculate_ratio (info2.get_sew (), info1.get_vlmul ()); -} - -static CONSTEXPR const demands_fuse_rule fuse_rules[] = { -#define DEF_SEW_LMUL_FUSE_RULE(DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1, \ - DEMAND_GE_SEW1, DEMAND_SEW2, DEMAND_LMUL2, \ - DEMAND_RATIO2, DEMAND_GE_SEW2, NEW_DEMAND_SEW, \ - NEW_DEMAND_LMUL, NEW_DEMAND_RATIO, \ - NEW_DEMAND_GE_SEW, NEW_SEW, NEW_VLMUL, \ - NEW_RATIO) \ - {{{DEMAND_ANY, DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1, DEMAND_ANY, \ - DEMAND_GE_SEW1, DEMAND_ANY, DEMAND_ANY}, \ - {DEMAND_ANY, DEMAND_SEW2, DEMAND_LMUL2, DEMAND_RATIO2, DEMAND_ANY, \ - DEMAND_GE_SEW2, DEMAND_ANY, DEMAND_ANY}}, \ - NEW_DEMAND_SEW, \ - NEW_DEMAND_LMUL, \ - NEW_DEMAND_RATIO, \ - NEW_DEMAND_GE_SEW, \ - NEW_SEW, \ - NEW_VLMUL, \ - NEW_RATIO}, -#include "riscv-vsetvl.def" +public: + /* The static execute probability of the demand info. 
*/ + profile_probability probability; + + auto_vec infos; + vsetvl_info m_info; + bb_info *m_bb; + + bool full_available; + + vsetvl_block_info () : m_bb (nullptr), full_available (false) + { + infos.safe_grow_cleared (0); + m_info.set_empty (); + } + vsetvl_block_info (const vsetvl_block_info &other) + : probability (other.probability), infos (other.infos.copy ()), + m_info (other.m_info), m_bb (other.m_bb) + {} + + vsetvl_info &get_entry_info () + { + gcc_assert (!empty_p ()); + return infos.is_empty () ? m_info : infos[0]; + } + vsetvl_info &get_exit_info () + { + gcc_assert (!empty_p ()); + return infos.is_empty () ? m_info : infos[infos.length () - 1]; + } + const vsetvl_info &get_entry_info () const + { + gcc_assert (!empty_p ()); + return infos.is_empty () ? m_info : infos[0]; + } + const vsetvl_info &get_exit_info () const + { + gcc_assert (!empty_p ()); + return infos.is_empty () ? m_info : infos[infos.length () - 1]; + } + + bool empty_p () const { return infos.is_empty () && !has_info (); } + bool has_info () const { return !m_info.empty_p (); } + void set_info (const vsetvl_info &info) + { + gcc_assert (infos.is_empty ()); + m_info = info; + m_info.set_bb (m_bb); + } + void set_empty_info () { m_info.set_empty (); } }; -static bool -always_unavailable (const vector_insn_info &, const vector_insn_info &) -{ - return true; -} -static bool -avl_unavailable_p (const vector_insn_info &info1, const vector_insn_info &info2) +/* Demand system is the RVV-based VSETVL info analysis tools wrapper. + It defines compatible rules for SEW/LMUL, POLICY and AVL. + Also, it provides 3 iterfaces avaiable_p, compatible_p and + merge for the VSETVL PASS analysis and optimization. + + - avaiable_p: Determine whether the next info can get the + avaiable VSETVL status from previous info. + e.g. bb 2 (demand SEW = 32, LMUL = M2) -> bb 3 (demand RATIO = 16). + Since bb 2 demand info (SEW/LMUL = 32/2 = 16) satisfies the bb 3 + demand, the VSETVL instruction in bb 3 can be elided. + avaiable_p (previous, next) is true in such situation. + - compatible_p: Determine whether prev_info is compatible with next_info + so that we can have a new merged info that is avaiable to both of them. + - merge: Merge the stricter demand information from + next_info into prev_info so that prev_info becomes available to + next_info. */ +class demand_system { - return !info2.compatible_avl_p (info1.get_avl_info ()); -} +private: + sbitmap *m_avl_def_in; + sbitmap *m_avl_def_out; -static bool -sew_unavailable_p (const vector_insn_info &info1, const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_LMUL) && !info2.demand_p (DEMAND_RATIO)) - { - if (info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - return info1.get_sew () != info2.get_sew (); - } - return true; -} + /* predictors. 
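Making the bb 2 -> bb 3 example from the comment concrete (instruction sequence invented for illustration, and assuming the AVL demand is also satisfied):

    /* bb 2:
         vsetvli zero,a0,e32,m2,ta,ma    demands SEW=32, LMUL=M2 (ratio 32/2 = 16)
         vadd.vv v8,v8,v16
       bb 3:
         vle8.v  v24,(a1)                a unit-stride e8 load only demands
                                         RATIO=16 (e.g. e8,mf2), which the
                                         e32,m2 setting already provides, so
                                         the vsetvl in bb 3 can be elided.  */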
*/ -static bool -lmul_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info1.get_vlmul () == info2.get_vlmul () && !info2.demand_p (DEMAND_SEW) - && !info2.demand_p (DEMAND_RATIO)) + inline bool always_true (const vsetvl_info &prev ATTRIBUTE_UNUSED, + const vsetvl_info &next ATTRIBUTE_UNUSED) + { + return true; + } + inline bool always_false (const vsetvl_info &prev ATTRIBUTE_UNUSED, + const vsetvl_info &next ATTRIBUTE_UNUSED) + { return false; - return true; -} + } + + /* predictors for sew and lmul */ + + inline bool lmul_eq_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_vlmul () == next.get_vlmul (); + } + inline bool sew_eq_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_sew () == next.get_sew (); + } + inline bool sew_lmul_eq_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return lmul_eq_p (prev, next) && sew_eq_p (prev, next); + } + inline bool sew_ge_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_sew () == next.get_sew () + || (next.get_ta () && prev.get_sew () > next.get_sew ()); + } + inline bool sew_le_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_sew () == next.get_sew () + || (prev.get_ta () && prev.get_sew () < next.get_sew ()); + } + inline bool prev_sew_le_next_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_sew () <= next.get_max_sew (); + } + inline bool next_sew_le_prev_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return next.get_sew () <= prev.get_max_sew (); + } + inline bool max_sew_overlap_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return !(prev.get_sew () > next.get_max_sew () + || next.get_sew () > prev.get_max_sew ()); + } + inline bool ratio_eq_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.has_same_ratio (next); + } + inline bool prev_ratio_valid_for_next_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_ratio () >= (next.get_sew () / 8); + } + inline bool next_ratio_valid_for_prev_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return next.get_ratio () >= (prev.get_sew () / 8); + } + + inline bool sew_ge_and_ratio_eq_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return sew_ge_p (prev, next) && ratio_eq_p (prev, next); + } + inline bool sew_ge_and_prev_sew_le_next_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return sew_ge_p (prev, next) && prev_sew_le_next_max_sew_p (prev, next); + } + inline bool + sew_ge_and_prev_sew_le_next_max_sew_and_next_ratio_valid_for_prev_sew_p ( + const vsetvl_info &prev, const vsetvl_info &next) + { + return sew_ge_p (prev, next) && prev_sew_le_next_max_sew_p (prev, next) + && next_ratio_valid_for_prev_sew_p (prev, next); + } + inline bool sew_le_and_next_sew_le_prev_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return sew_le_p (prev, next) && next_sew_le_prev_max_sew_p (prev, next); + } + inline bool + max_sew_overlap_and_next_ratio_valid_for_prev_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return next_ratio_valid_for_prev_sew_p (prev, next) + && max_sew_overlap_p (prev, next); + } + inline bool + sew_le_and_next_sew_le_prev_max_sew_and_ratio_eq_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return sew_le_p (prev, next) && ratio_eq_p (prev, next) + && next_sew_le_prev_max_sew_p (prev, next); + } + inline bool + 
max_sew_overlap_and_prev_ratio_valid_for_next_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev_ratio_valid_for_next_sew_p (prev, next) + && max_sew_overlap_p (prev, next); + } + inline bool + sew_le_and_next_sew_le_prev_max_sew_and_prev_ratio_valid_for_next_sew_p ( + const vsetvl_info &prev, const vsetvl_info &next) + { + return sew_le_p (prev, next) && prev_ratio_valid_for_next_sew_p (prev, next) + && next_sew_le_prev_max_sew_p (prev, next); + } + inline bool max_sew_overlap_and_ratio_eq_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return ratio_eq_p (prev, next) && max_sew_overlap_p (prev, next); + } + + /* predictors for tail and mask policy */ + + inline bool tail_policy_eq_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_ta () == next.get_ta (); + } + inline bool mask_policy_eq_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_ma () == next.get_ma (); + } + inline bool tail_mask_policy_eq_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return tail_policy_eq_p (prev, next) && mask_policy_eq_p (prev, next); + } + + /* predictors for avl */ + + inline bool modify_or_use_vl_p (insn_info *i, const vsetvl_info &info) + { + return info.has_vl () + && (find_access (i->uses (), REGNO (info.get_vl ())) + || find_access (i->defs (), REGNO (info.get_vl ()))); + } + inline bool modify_avl_p (insn_info *i, const vsetvl_info &info) + { + return info.has_nonvlmax_reg_avl () + && find_access (i->defs (), REGNO (info.get_avl ())); + } + + inline bool modify_reg_between_p (insn_info *prev_insn, insn_info *curr_insn, + unsigned regno) + { + gcc_assert (prev_insn->compare_with (curr_insn) < 0); + for (insn_info *i = curr_insn->prev_nondebug_insn (); i != prev_insn; + i = i->prev_nondebug_insn ()) + { + // no def of regno + if (find_access (i->defs (), regno)) + return true; + } + return false; + } -static bool -ge_sew_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_LMUL) && !info2.demand_p (DEMAND_RATIO) - && info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - return true; -} + inline bool reg_avl_equal_p (const vsetvl_info &prev, const vsetvl_info &next) + { + if (!prev.has_nonvlmax_reg_avl () || !next.has_nonvlmax_reg_avl ()) + return false; -static bool -ge_sew_lmul_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_RATIO) && info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - return true; -} + if (same_equiv_note_p (prev.get_avl_def (), next.get_avl_def ())) + return true; -static bool -ge_sew_ratio_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_LMUL)) - { - if (info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - /* Demand GE_SEW should be available for non-demand SEW. 
*/ - else if (!info2.demand_p (DEMAND_SEW)) - return false; - } - return true; -} + if (REGNO (prev.get_avl ()) != REGNO (next.get_avl ())) + return false; -static CONSTEXPR const demands_cond unavailable_conds[] = { -#define DEF_UNAVAILABLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, \ - TAIL_POLICTY1, MASK_POLICY1, AVL2, SEW2, LMUL2, \ - RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ - MASK_POLICY2, COND) \ - {{{AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, TAIL_POLICTY1, \ - MASK_POLICY1}, \ - {AVL2, SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ - MASK_POLICY2}}, \ - COND}, -#include "riscv-vsetvl.def" -}; + insn_info *prev_insn = prev.get_insn (); + if (prev.get_bb () != prev_insn->bb ()) + prev_insn = prev.get_bb ()->end_insn (); -static bool -same_sew_lmul_demand_p (const bool *dems1, const bool *dems2) -{ - return dems1[DEMAND_SEW] == dems2[DEMAND_SEW] - && dems1[DEMAND_LMUL] == dems2[DEMAND_LMUL] - && dems1[DEMAND_RATIO] == dems2[DEMAND_RATIO] && !dems1[DEMAND_GE_SEW] - && !dems2[DEMAND_GE_SEW]; -} + insn_info *next_insn = next.get_insn (); + if (next.get_bb () != next_insn->bb ()) + next_insn = next.get_bb ()->end_insn (); -static bool -propagate_avl_across_demands_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info2.demand_p (DEMAND_AVL)) - { - if (info2.demand_p (DEMAND_NONZERO_AVL)) - return info1.demand_p (DEMAND_AVL) - && !info1.demand_p (DEMAND_NONZERO_AVL) && info1.has_avl_reg (); - } - else - return info1.demand_p (DEMAND_AVL) && info1.has_avl_reg (); - return false; -} + return avl_vl_unmodified_between_p (prev_insn, next_insn, next, false); + } -static bool -reg_available_p (const insn_info *insn, const vector_insn_info &info) -{ - if (info.has_avl_reg () && !info.get_avl_source ()) - return false; - insn_info *def_insn = info.get_avl_source ()->insn (); - if (def_insn->bb () == insn->bb ()) - return before_p (def_insn, insn); - else - return dominated_by_p (CDI_DOMINATORS, insn->bb ()->cfg_bb (), - def_insn->bb ()->cfg_bb ()); -} + inline bool avl_equal_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); -/* Return true if the instruction support relaxed compatible check. */ -static bool -support_relaxed_compatible_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (fault_first_load_p (info1.get_insn ()->rtl ()) - && info2.demand_p (DEMAND_AVL) && info2.has_avl_reg () - && info2.get_avl_source () && info2.get_avl_source ()->insn ()->is_phi ()) - { - hash_set sets - = get_all_sets (info2.get_avl_source (), true, false, false); - for (set_info *set : sets) - { - if (read_vl_insn_p (set->insn ()->rtl ())) - { - const insn_info *insn - = get_backward_fault_first_load_insn (set->insn ()); - if (insn == info1.get_insn ()) - return info2.compatible_vtype_p (info1); - } - } - } - return false; -} + if (prev.get_ratio () != next.get_ratio ()) + return false; -/* Count the number of REGNO in RINSN. */ -static int -count_regno_occurrences (rtx_insn *rinsn, unsigned int regno) -{ - int count = 0; - extract_insn (rinsn); - for (int i = 0; i < recog_data.n_operands; i++) - if (refers_to_regno_p (regno, recog_data.operand[i])) - count++; - return count; -} + if (next.has_vl () && next.vl_use_by_non_rvv_insn_p ()) + return false; -/* Return TRUE if the demands can be fused. 
*/ -static bool -demands_can_be_fused_p (const vector_insn_info &be_fused, - const vector_insn_info &to_fuse) -{ - return be_fused.compatible_p (to_fuse) && !be_fused.available_p (to_fuse); -} + if (vector_config_insn_p (prev.get_insn ()->rtl ()) && next.get_avl_def () + && next.get_avl_def ()->insn () == prev.get_insn ()) + return true; -/* Return true if we can fuse VSETVL demand info into predecessor of earliest - * edge. */ -static bool -earliest_pred_can_be_fused_p (const bb_info *earliest_pred, - const vector_insn_info &earliest_info, - const vector_insn_info &expr, rtx *vlmax_vl) -{ - /* Backward VLMAX VL: - bb 3: - vsetivli zero, 1 ... -> vsetvli t1, zero - vmv.s.x - bb 5: - vsetvli t1, zero ... -> to be elided. - vlse16.v - - We should forward "t1". */ - if (!earliest_info.has_avl_reg () && expr.has_avl_reg ()) - { - rtx avl_or_vl_reg = expr.get_avl_or_vl_reg (); - gcc_assert (avl_or_vl_reg); - const insn_info *last_insn = earliest_info.get_insn (); - /* To fuse demand on earlest edge, we make sure AVL/VL - didn't change from the consume insn to the predecessor - of the edge. */ - for (insn_info *i = earliest_pred->end_insn ()->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, earliest_pred) - && after_or_same_p (i, last_insn); - i = i->prev_nondebug_insn ()) - { - if (find_access (i->defs (), REGNO (avl_or_vl_reg))) - return false; - if (find_access (i->uses (), REGNO (avl_or_vl_reg))) + if (prev.get_read_vl_insn ()) + { + if (!next.has_nonvlmax_reg_avl () || !next.get_avl_def ()) + return false; + insn_info *avl_def_insn = extract_single_source (next.get_avl_def ()); + return avl_def_insn == prev.get_read_vl_insn (); + } + + if (prev == next && prev.has_nonvlmax_reg_avl ()) + { + insn_info *insn = prev.get_insn (); + bb_info *bb = insn->bb (); + for (insn_info *i = insn; real_insn_and_same_bb_p (i, bb); + i = i->next_nondebug_insn ()) + if (find_access (i->defs (), REGNO (prev.get_avl ()))) return false; - } - if (vlmax_vl && vlmax_avl_p (expr.get_avl ())) - *vlmax_vl = avl_or_vl_reg; - } + } - return true; -} - -/* Return true if the current VSETVL 1 is dominated by preceding VSETVL 2. - - VSETVL 2 dominates VSETVL 1 should satisfy this following check: + if (prev.has_vlmax_avl () && next.has_vlmax_avl ()) + return true; + else if (prev.has_imm_avl () && next.has_imm_avl ()) + return INTVAL (prev.get_avl ()) == INTVAL (next.get_avl ()); + else if (prev.has_vl () && next.has_nonvlmax_reg_avl () + && REGNO (prev.get_vl ()) == REGNO (next.get_avl ())) + { + insn_info *prev_insn = prev.insn_inside_bb_p () + ? prev.get_insn () + : prev.get_bb ()->end_insn (); + + insn_info *next_insn = next.insn_inside_bb_p () + ? next.get_insn () + : next.get_bb ()->end_insn (); + return avl_vl_unmodified_between_p (prev_insn, next_insn, next, false); + } + else if (prev.has_nonvlmax_reg_avl () && next.has_nonvlmax_reg_avl ()) + return reg_avl_equal_p (prev, next); - - VSETVL 2 should have the RATIO (SEW/LMUL) with VSETVL 1. - - VSETVL 2 is user vsetvl (vsetvl VL, AVL) - - VSETVL 2 "VL" result is the "AVL" of VSETL1. 
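One of the equality cases above is the previous vsetvl's VL result being exactly the next info's AVL, with neither register disturbed in between; avl_vl_unmodified_between_p is what checks the "in between" part. Illustrative sequence (not from a testcase):

    /* prev:   vsetvli t1, a0, e32,m1,ta,ma     produces vl in t1
               vle32.v v8,(a2)
               ...                              nothing defines t1 or uses it as vl
       next:   vsetvli zero, t1, e32,m1,ta,ma   AVL is t1, same ratio
               vse32.v v8,(a3)

       Here avl_equal_p holds and the second vsetvli is removable.  If an
       intervening instruction redefined t1 (say "mv t1,a4"), then
       avl_vl_unmodified_between_p would fail and it would have to stay.  */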
*/ -static bool -vsetvl_dominated_by_p (const basic_block cfg_bb, - const vector_insn_info &vsetvl1, - const vector_insn_info &vsetvl2, bool fuse_p) -{ - if (!vsetvl1.valid_or_dirty_p () || !vsetvl2.valid_or_dirty_p ()) - return false; - if (!has_vl_op (vsetvl1.get_insn ()->rtl ()) - || !vsetvl_insn_p (vsetvl2.get_insn ()->rtl ())) return false; + } + inline bool avl_equal_or_prev_avl_non_zero_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return avl_equal_p (prev, next) || prev.has_non_zero_avl (); + } + + inline bool can_use_next_avl_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + if (!next.has_nonvlmax_reg_avl () && !next.has_vl ()) + return true; - hash_set sets - = get_all_sets (vsetvl1.get_avl_source (), true, false, false); - set_info *set = get_same_bb_set (sets, cfg_bb); + insn_info *prev_insn = prev.get_insn (); + if (prev.get_bb () != prev_insn->bb ()) + prev_insn = prev.get_bb ()->end_insn (); + + insn_info *next_insn = next.get_insn (); + if (next.get_bb () != next_insn->bb ()) + next_insn = next.get_bb ()->end_insn (); + + return avl_vl_unmodified_between_p (prev_insn, next_insn, next); + } + + inline bool avl_equal_or_next_avl_non_zero_and_can_use_next_avl_p ( + const vsetvl_info &prev, const vsetvl_info &next) + { + return avl_equal_p (prev, next) + || (next.has_non_zero_avl () && can_use_next_avl_p (prev, next)); + } + + /* modifiers */ + + inline void nop (const vsetvl_info &prev ATTRIBUTE_UNUSED, + const vsetvl_info &next ATTRIBUTE_UNUSED) + {} + + /* modifiers for sew and lmul */ + + inline void use_min_of_max_sew (vsetvl_info &prev, const vsetvl_info &next) + { + prev.set_max_sew (MIN (prev.get_max_sew (), next.get_max_sew ())); + } + inline void use_next_sew (vsetvl_info &prev, const vsetvl_info &next) + { + prev.set_sew (next.get_sew ()); + use_min_of_max_sew (prev, next); + } + inline void use_max_sew (vsetvl_info &prev, const vsetvl_info &next) + { + auto max_sew = std::max (prev.get_sew (), next.get_sew ()); + prev.set_sew (max_sew); + use_min_of_max_sew (prev, next); + } + inline void use_next_sew_lmul (vsetvl_info &prev, const vsetvl_info &next) + { + use_next_sew (prev, next); + prev.set_vlmul (next.get_vlmul ()); + prev.set_ratio (next.get_ratio ()); + } + inline void use_next_sew_with_prev_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + use_next_sew (prev, next); + prev.set_vlmul (calculate_vlmul (next.get_sew (), prev.get_ratio ())); + } + inline void modify_lmul_with_next_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + prev.set_vlmul (calculate_vlmul (prev.get_sew (), next.get_ratio ())); + prev.set_ratio (next.get_ratio ()); + } + + inline void use_max_sew_and_lmul_with_next_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + prev.set_vlmul (calculate_vlmul (prev.get_sew (), next.get_ratio ())); + use_max_sew (prev, next); + prev.set_ratio (next.get_ratio ()); + } + + inline void use_max_sew_and_lmul_with_prev_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + auto max_sew = std::max (prev.get_sew (), next.get_sew ()); + prev.set_vlmul (calculate_vlmul (max_sew, prev.get_ratio ())); + prev.set_sew (max_sew); + } + + /* modifiers for tail and mask policy */ + + inline void use_tail_policy (vsetvl_info &prev, const vsetvl_info &next) + { + if (!next.get_ta ()) + prev.set_ta (next.get_ta ()); + } + inline void use_mask_policy (vsetvl_info &prev, const vsetvl_info &next) + { + if (!next.get_ma ()) + prev.set_ma (next.get_ma ()); + } + inline void use_tail_mask_policy (vsetvl_info &prev, const 
vsetvl_info &next) + { + use_tail_policy (prev, next); + use_mask_policy (prev, next); + } + + /* modifiers for avl */ + + inline void use_next_avl (vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (can_use_next_avl_p (prev, next)); + prev.update_avl (next); + } + + inline void use_next_avl_when_not_equal (vsetvl_info &prev, + const vsetvl_info &next) + { + if (avl_equal_p (prev, next)) + return; + gcc_assert (next.has_non_zero_avl ()); + use_next_avl (prev, next); + } - if (!vsetvl1.has_avl_reg () || vlmax_avl_p (vsetvl1.get_avl ()) - || !vsetvl2.same_vlmax_p (vsetvl1) || !set - || set->insn () != vsetvl2.get_insn ()) - return false; +public: + demand_system () : m_avl_def_in (nullptr), m_avl_def_out (nullptr) {} + + void set_avl_in_out_data (sbitmap *m_avl_def_in, sbitmap *m_avl_def_out) + { + m_avl_def_in = m_avl_def_in; + m_avl_def_out = m_avl_def_out; + } + + /* Can we move vsetvl info between prev_insn and next_insn safe? */ + bool avl_vl_unmodified_between_p (insn_info *prev_insn, insn_info *next_insn, + const vsetvl_info &info, + bool ignore_vl = false) + { + gcc_assert ((ignore_vl && info.has_nonvlmax_reg_avl ()) + || (info.has_nonvlmax_reg_avl () || info.has_vl ())); + + gcc_assert (!prev_insn->is_debug_insn () && !next_insn->is_debug_insn ()); + if (prev_insn->bb () == next_insn->bb () + && prev_insn->compare_with (next_insn) < 0) + { + for (insn_info *i = next_insn->prev_nondebug_insn (); i != prev_insn; + i = i->prev_nondebug_insn ()) + { + // no def amd use of vl + if (!ignore_vl && modify_or_use_vl_p (i, info)) + return false; - if (fuse_p && vsetvl2.same_vtype_p (vsetvl1)) - return false; - else if (!fuse_p && !vsetvl2.same_vtype_p (vsetvl1)) - return false; - return true; -} + // no def of avl + if (modify_avl_p (i, info)) + return false; + } + return true; + } + else + { + if (!ignore_vl && info.has_vl ()) + { + bitmap live_out = df_get_live_out (prev_insn->bb ()->cfg_bb ()); + if (bitmap_bit_p (live_out, REGNO (info.get_vl ()))) + return false; + } -avl_info::avl_info (const avl_info &other) -{ - m_value = other.get_value (); - m_source = other.get_source (); -} + if (info.has_nonvlmax_reg_avl () && m_avl_def_in && m_avl_def_out) + { + bool has_avl_out = false; + unsigned regno = REGNO (info.get_avl ()); + unsigned expr_id; + sbitmap_iterator sbi; + EXECUTE_IF_SET_IN_BITMAP (m_avl_def_out[prev_insn->bb ()->index ()], + 0, expr_id, sbi) + { + if (get_regno (expr_id, last_basic_block_for_fn (cfun)) + != regno) + continue; + has_avl_out = true; + if (!bitmap_bit_p (m_avl_def_in[next_insn->bb ()->index ()], + expr_id)) + return false; + } + if (!has_avl_out) + return false; + } -avl_info::avl_info (rtx value_in, set_info *source_in) - : m_value (value_in), m_source (source_in) -{} + for (insn_info *i = next_insn; i != next_insn->bb ()->head_insn (); + i = i->prev_nondebug_insn ()) + { + // no def amd use of vl + if (!ignore_vl && modify_or_use_vl_p (i, info)) + return false; -bool -avl_info::single_source_equal_p (const avl_info &other) const -{ - set_info *set1 = m_source; - set_info *set2 = other.get_source (); - insn_info *insn1 = extract_single_source (set1); - insn_info *insn2 = extract_single_source (set2); - if (!insn1 || !insn2) - return false; - return source_equal_p (insn1, insn2); -} + // no def of avl + if (modify_avl_p (i, info)) + return false; + } -bool -avl_info::multiple_source_equal_p (const avl_info &other) const -{ - /* When the def info is same in RTL_SSA namespace, it's safe - to consider they are avl compatible. 
*/ - if (m_source == other.get_source ()) + for (insn_info *i = prev_insn->bb ()->end_insn (); i != prev_insn; + i = i->prev_nondebug_insn ()) + { + // no def amd use of vl + if (!ignore_vl && modify_or_use_vl_p (i, info)) + return false; + + // no def of avl + if (modify_avl_p (i, info)) + return false; + } + } return true; + } + + bool sew_lmul_compatible_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + sew_lmul_demand_type prev_flags = prev.get_sew_lmul_demand (); + sew_lmul_demand_type next_flags = next.get_sew_lmul_demand (); +#define DEF_SEW_LMUL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == sew_lmul_demand_type::PREV_FLAGS \ + && next_flags == sew_lmul_demand_type::NEXT_FLAGS) \ + return COMPATIBLE_P (prev, next); - /* We only consider handle PHI node. */ - if (!m_source->insn ()->is_phi () || !other.get_source ()->insn ()->is_phi ()) - return false; +#include "riscv-vsetvl.def" - phi_info *phi1 = as_a (m_source); - phi_info *phi2 = as_a (other.get_source ()); + gcc_unreachable (); + } - if (phi1->is_degenerate () && phi2->is_degenerate ()) - { - /* Degenerate PHI means the PHI node only have one input. */ + bool sew_lmul_available_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + sew_lmul_demand_type prev_flags = prev.get_sew_lmul_demand (); + sew_lmul_demand_type next_flags = next.get_sew_lmul_demand (); +#define DEF_SEW_LMUL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == sew_lmul_demand_type::PREV_FLAGS \ + && next_flags == sew_lmul_demand_type::NEXT_FLAGS) \ + return AVAILABLE_P (prev, next); - /* If both PHI nodes have the same single input in use list. - We consider they are AVL compatible. */ - if (phi1->input_value (0) == phi2->input_value (0)) - return true; - } - /* TODO: We can support more optimization cases in the future. */ - return false; -} +#include "riscv-vsetvl.def" -avl_info & -avl_info::operator= (const avl_info &other) -{ - m_value = other.get_value (); - m_source = other.get_source (); - return *this; -} + gcc_unreachable (); + } + + void merge_sew_lmul (vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + sew_lmul_demand_type prev_flags = prev.get_sew_lmul_demand (); + sew_lmul_demand_type next_flags = next.get_sew_lmul_demand (); +#define DEF_SEW_LMUL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == sew_lmul_demand_type::PREV_FLAGS \ + && next_flags == sew_lmul_demand_type::NEXT_FLAGS) \ + { \ + gcc_assert (COMPATIBLE_P (prev, next)); \ + FUSE (prev, next); \ + prev.set_sew_lmul_demand (sew_lmul_demand_type::NEW_FLAGS); \ + return; \ + } -bool -avl_info::operator== (const avl_info &other) const -{ - if (!m_value) - return !other.get_value (); - if (!other.get_value ()) - return false; +#include "riscv-vsetvl.def" - if (GET_CODE (m_value) != GET_CODE (other.get_value ())) - return false; + gcc_unreachable (); + } - /* Handle CONST_INT AVL. 
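The dispatch above is table-driven: every pair of sew/lmul demand kinds is covered by one DEF_SEW_LMUL_RULE row in riscv-vsetvl.def naming the fused demand kind plus the compatible, available and fuse callbacks from demand_system. A hypothetical row, written here only to show the shape and not quoted from riscv-vsetvl.def, could read:

    /* prev demands an exact SEW+LMUL, next only demands the ratio: the two are
       compatible (and prev is available to next) when the ratios match, and
       fusing leaves prev's demand unchanged.  */
    DEF_SEW_LMUL_RULE (sew_lmul, ratio_only, sew_lmul,
                       ratio_eq_p, ratio_eq_p, nop)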
*/ - if (CONST_INT_P (m_value)) - return INTVAL (m_value) == INTVAL (other.get_value ()); + bool policy_compatible_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + policy_demand_type prev_flags = prev.get_policy_demand (); + policy_demand_type next_flags = next.get_policy_demand (); +#define DEF_POLICY_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == policy_demand_type::PREV_FLAGS \ + && next_flags == policy_demand_type::NEXT_FLAGS) \ + return COMPATIBLE_P (prev, next); - /* Handle VLMAX AVL. */ - if (vlmax_avl_p (m_value)) - return vlmax_avl_p (other.get_value ()); - if (vlmax_avl_p (other.get_value ())) - return false; +#include "riscv-vsetvl.def" - /* If any source is undef value, we think they are not equal. */ - if (!m_source || !other.get_source ()) - return false; + gcc_unreachable (); + } - /* If both sources are single source (defined by a single real RTL) - and their definitions are same. */ - if (single_source_equal_p (other)) - return true; + bool policy_available_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + policy_demand_type prev_flags = prev.get_policy_demand (); + policy_demand_type next_flags = next.get_policy_demand (); +#define DEF_POLICY_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == policy_demand_type::PREV_FLAGS \ + && next_flags == policy_demand_type::NEXT_FLAGS) \ + return AVAILABLE_P (prev, next); - return multiple_source_equal_p (other); -} +#include "riscv-vsetvl.def" -bool -avl_info::operator!= (const avl_info &other) const -{ - return !(*this == other); -} + gcc_unreachable (); + } + + void merge_policy (vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + policy_demand_type prev_flags = prev.get_policy_demand (); + policy_demand_type next_flags = next.get_policy_demand (); +#define DEF_POLICY_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == policy_demand_type::PREV_FLAGS \ + && next_flags == policy_demand_type::NEXT_FLAGS) \ + { \ + gcc_assert (COMPATIBLE_P (prev, next)); \ + FUSE (prev, next); \ + prev.set_policy_demand (policy_demand_type::NEW_FLAGS); \ + return; \ + } -bool -avl_info::has_non_zero_avl () const -{ - if (has_avl_imm ()) - return INTVAL (get_value ()) > 0; - if (has_avl_reg ()) - return vlmax_avl_p (get_value ()); - return false; -} +#include "riscv-vsetvl.def" -/* Initialize VL/VTYPE information. 
*/ -vl_vtype_info::vl_vtype_info (avl_info avl_in, uint8_t sew_in, - enum vlmul_type vlmul_in, uint8_t ratio_in, - bool ta_in, bool ma_in) - : m_avl (avl_in), m_sew (sew_in), m_vlmul (vlmul_in), m_ratio (ratio_in), - m_ta (ta_in), m_ma (ma_in) -{ - gcc_assert (valid_sew_p (m_sew) && "Unexpected SEW"); -} + gcc_unreachable (); + } -bool -vl_vtype_info::operator== (const vl_vtype_info &other) const -{ - return same_avl_p (other) && m_sew == other.get_sew () - && m_vlmul == other.get_vlmul () && m_ta == other.get_ta () - && m_ma == other.get_ma () && m_ratio == other.get_ratio (); -} + bool avl_compatible_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + avl_demand_type prev_flags = prev.get_avl_demand (); + avl_demand_type next_flags = next.get_avl_demand (); +#define DEF_AVL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == avl_demand_type::PREV_FLAGS \ + && next_flags == avl_demand_type::NEXT_FLAGS) \ + return COMPATIBLE_P (prev, next); -bool -vl_vtype_info::operator!= (const vl_vtype_info &other) const -{ - return !(*this == other); -} +#include "riscv-vsetvl.def" -bool -vl_vtype_info::same_avl_p (const vl_vtype_info &other) const -{ - /* We need to compare both RTL and SET. If both AVL are CONST_INT. - For example, const_int 3 and const_int 4, we need to compare - RTL. If both AVL are REG and their REGNO are same, we need to - compare SET. */ - return get_avl () == other.get_avl () - && get_avl_source () == other.get_avl_source (); -} + gcc_unreachable (); + } -bool -vl_vtype_info::same_vtype_p (const vl_vtype_info &other) const -{ - return get_sew () == other.get_sew () && get_vlmul () == other.get_vlmul () - && get_ta () == other.get_ta () && get_ma () == other.get_ma (); -} + bool avl_available_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + avl_demand_type prev_flags = prev.get_avl_demand (); + avl_demand_type next_flags = next.get_avl_demand (); +#define DEF_AVL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == avl_demand_type::PREV_FLAGS \ + && next_flags == avl_demand_type::NEXT_FLAGS) \ + return AVAILABLE_P (prev, next); -bool -vl_vtype_info::same_vlmax_p (const vl_vtype_info &other) const -{ - return get_ratio () == other.get_ratio (); -} +#include "riscv-vsetvl.def" -/* Compare the compatibility between Dem1 and Dem2. - If Dem1 > Dem2, Dem1 has bigger compatibility then Dem2 - meaning Dem1 is easier be compatible with others than Dem2 - or Dem2 is stricter than Dem1. - For example, Dem1 (demand SEW + LMUL) > Dem2 (demand RATIO). */ -bool -vector_insn_info::operator>= (const vector_insn_info &other) const -{ - if (support_relaxed_compatible_p (*this, other)) - { - unsigned array_size = sizeof (unavailable_conds) / sizeof (demands_cond); - /* Bypass AVL unavailable cases. 
*/ - for (unsigned i = 2; i < array_size; i++) - if (unavailable_conds[i].pair.match_cond_p (this->get_demands (), - other.get_demands ()) - && unavailable_conds[i].incompatible_p (*this, other)) - return false; - return true; + gcc_unreachable (); + } + + void merge_avl (vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + avl_demand_type prev_flags = prev.get_avl_demand (); + avl_demand_type next_flags = next.get_avl_demand (); +#define DEF_AVL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == avl_demand_type::PREV_FLAGS \ + && next_flags == avl_demand_type::NEXT_FLAGS) \ + { \ + gcc_assert (COMPATIBLE_P (prev, next)); \ + FUSE (prev, next); \ + prev.set_avl_demand (avl_demand_type::NEW_FLAGS); \ + return; \ } - if (!other.compatible_p (static_cast (*this))) - return false; - if (!this->compatible_p (static_cast (other))) - return true; - - if (*this == other) - return true; +#include "riscv-vsetvl.def" - for (const auto &cond : unavailable_conds) - if (cond.pair.match_cond_p (this->get_demands (), other.get_demands ()) - && cond.incompatible_p (*this, other)) - return false; + gcc_unreachable (); + } + + bool compatible_p (const vsetvl_info &prev, const vsetvl_info &next) + { + bool compatible_p = sew_lmul_compatible_p (prev, next) + && policy_compatible_p (prev, next) + && avl_compatible_p (prev, next); + return compatible_p; + } + + bool available_p (const vsetvl_info &prev, const vsetvl_info &next) + { + bool available_p = sew_lmul_available_p (prev, next) + && policy_available_p (prev, next) + && avl_available_p (prev, next); + gcc_assert (!available_p || compatible_p (prev, next)); + return available_p; + } + + void merge (vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (compatible_p (prev, next)); + merge_sew_lmul (prev, next); + merge_policy (prev, next); + merge_avl (prev, next); + gcc_assert (available_p (prev, next)); + } +}; - return true; -} -bool -vector_insn_info::operator== (const vector_insn_info &other) const +class pre_vsetvl { - gcc_assert (!uninit_p () && !other.uninit_p () - && "Uninitialization should not happen"); - - /* Empty is only equal to another Empty. */ - if (empty_p ()) - return other.empty_p (); - if (other.empty_p ()) - return empty_p (); - - /* Unknown is only equal to another Unknown. */ - if (unknown_p ()) - return other.unknown_p (); - if (other.unknown_p ()) - return unknown_p (); - - for (size_t i = 0; i < NUM_DEMAND; i++) - if (m_demands[i] != other.demand_p ((enum demand_type) i)) - return false; +private: + demand_system m_dem; + auto_vec m_vector_block_infos; + + /* data for avl reaching defintion. */ + sbitmap m_avl_regs; + sbitmap *m_avl_def_in; + sbitmap *m_avl_def_out; + sbitmap *m_reg_def_loc; + + /* data for vsetvl info reaching defintion. 
*/ + vsetvl_info m_unknow_info; + auto_vec m_vsetvl_def_exprs; + sbitmap *m_vsetvl_def_in; + sbitmap *m_vsetvl_def_out; + + /* data for lcm */ + auto_vec m_exprs; + sbitmap *m_avloc; + sbitmap *m_avin; + sbitmap *m_avout; + sbitmap *m_kill; + sbitmap *m_antloc; + sbitmap *m_transp; + sbitmap *m_insert; + sbitmap *m_del; + struct edge_list *m_edges; + + auto_vec m_delete_list; + + vsetvl_block_info &get_block_info (const bb_info *bb) + { + return m_vector_block_infos[bb->index ()]; + } + const vsetvl_block_info &get_block_info (const basic_block bb) const + { + return m_vector_block_infos[bb->index]; + } + + vsetvl_block_info &get_block_info (const basic_block bb) + { + return m_vector_block_infos[bb->index]; + } + + void add_expr (auto_vec &m_exprs, vsetvl_info &info) + { + for (vsetvl_info *item : m_exprs) + { + if (*item == info) + return; + } + m_exprs.safe_push (&info); + } + + unsigned get_expr_index (auto_vec &m_exprs, + const vsetvl_info &info) + { + for (size_t i = 0; i < m_exprs.length (); i += 1) + { + if (*m_exprs[i] == info) + return i; + } + gcc_unreachable (); + } + + bool anticpatable_exp_p (const vsetvl_info &header_info) + { + if (!header_info.has_nonvlmax_reg_avl () && !header_info.has_vl ()) + return true; - /* We should consider different INSN demands as different - expression. Otherwise, we will be doing incorrect vsetvl - elimination. */ - if (m_insn != other.get_insn ()) - return false; + bb_info *bb = header_info.get_bb (); + insn_info *prev_insn = bb->head_insn (); + insn_info *next_insn = header_info.insn_inside_bb_p () + ? header_info.get_insn () + : header_info.get_bb ()->end_insn (); + + return m_dem.avl_vl_unmodified_between_p (prev_insn, next_insn, + header_info); + } + + bool available_exp_p (const vsetvl_info &prev_info, + const vsetvl_info &next_info) + { + return m_dem.available_p (prev_info, next_info); + } + + void compute_probabilities () + { + edge e; + edge_iterator ei; + + for (const bb_info *bb : crtl->ssa->bbs ()) + { + basic_block cfg_bb = bb->cfg_bb (); + auto &curr_prob = get_block_info (cfg_bb).probability; + + /* GCC assume entry block (bb 0) are always so + executed so set its probability as "always". */ + if (ENTRY_BLOCK_PTR_FOR_FN (cfun) == cfg_bb) + curr_prob = profile_probability::always (); + /* Exit block (bb 1) is the block we don't need to process. */ + if (EXIT_BLOCK_PTR_FOR_FN (cfun) == cfg_bb) + continue; - if (!same_avl_p (other)) - return false; - - /* If the full VTYPE is valid, check that it is the same. */ - return same_vtype_p (other); -} - -void -vector_insn_info::parse_insn (rtx_insn *rinsn) -{ - *this = vector_insn_info (); - if (!NONDEBUG_INSN_P (rinsn)) - return; - if (optimize == 0 && !has_vtype_op (rinsn)) - return; - gcc_assert (!vsetvl_discard_result_insn_p (rinsn)); - m_state = VALID; - extract_insn_cached (rinsn); - rtx avl = ::get_avl (rinsn); - m_avl = avl_info (avl, nullptr); - m_sew = ::get_sew (rinsn); - m_vlmul = ::get_vlmul (rinsn); - m_ta = tail_agnostic_p (rinsn); - m_ma = mask_agnostic_p (rinsn); -} - -void -vector_insn_info::parse_insn (insn_info *insn) -{ - *this = vector_insn_info (); - - /* Return if it is debug insn for the consistency with optimize == 0. */ - if (insn->is_debug_insn ()) - return; - - /* We set it as unknown since we don't what will happen in CALL or ASM. */ - if (insn->is_call () || insn->is_asm ()) - { - set_unknown (); - return; - } - - /* If this is something that updates VL/VTYPE that we don't know about, set - the state to unknown. 
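The sbitmap members above are the usual lazy-code-motion inputs and outputs, one bit per candidate vsetvl expression per block: m_antloc (anticipated on entry), m_avloc (computed locally and available on exit), m_kill (clobbered, e.g. its AVL or VL is redefined), m_transp (transparent), and the solver's results m_insert (edges needing a new vsetvl) and m_del (existing vsetvls that become redundant). As a reminder of the underlying dataflow rather than the pass's exact equations:

    /* Forward availability per block b:
         AVIN[b]  = intersection of AVOUT[p] over all predecessors p
         AVOUT[b] = AVLOC[b] | (AVIN[b] & ~KILL[b])
       Backward anticipatability is the mirror image using ANTLOC and TRANSP;
       INSERT/DELETE are then chosen so that every RVV instruction still sees
       the configuration it demands while the number of vsetvls executed on
       any path is minimized.  */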
*/ - if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ()) - && (find_access (insn->defs (), VL_REGNUM) - || find_access (insn->defs (), VTYPE_REGNUM))) - { - set_unknown (); - return; - } - - if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ())) - return; - - /* Warning: This function has to work on both the lowered (i.e. post - emit_local_forward_vsetvls) and pre-lowering forms. The main implication - of this is that it can't use the value of a SEW, VL, or Policy operand as - they might be stale after lowering. */ - vl_vtype_info::operator= (get_vl_vtype_info (insn)); - m_insn = insn; - m_state = VALID; - if (vector_config_insn_p (insn->rtl ())) - { - m_demands[DEMAND_AVL] = true; - m_demands[DEMAND_RATIO] = true; - return; - } - - if (has_vl_op (insn->rtl ())) - m_demands[DEMAND_AVL] = true; - - if (get_attr_ratio (insn->rtl ()) != INVALID_ATTRIBUTE) - m_demands[DEMAND_RATIO] = true; - else - { - /* TODO: By default, if it doesn't demand RATIO, we set it - demand SEW && LMUL both. Some instructions may demand SEW - only and ignore LMUL, will fix it later. */ - m_demands[DEMAND_SEW] = true; - if (!ignore_vlmul_insn_p (insn->rtl ())) - m_demands[DEMAND_LMUL] = true; - } - - if (get_attr_ta (insn->rtl ()) != INVALID_ATTRIBUTE) - m_demands[DEMAND_TAIL_POLICY] = true; - if (get_attr_ma (insn->rtl ()) != INVALID_ATTRIBUTE) - m_demands[DEMAND_MASK_POLICY] = true; - - if (vector_config_insn_p (insn->rtl ())) - return; - - if (scalar_move_insn_p (insn->rtl ())) - { - if (m_avl.has_non_zero_avl ()) - m_demands[DEMAND_NONZERO_AVL] = true; - if (m_ta) - m_demands[DEMAND_GE_SEW] = true; - } - - if (!m_avl.has_avl_reg () || vlmax_avl_p (get_avl ()) || !m_avl.get_source ()) - return; - if (!m_avl.get_source ()->insn ()->is_real () - && !m_avl.get_source ()->insn ()->is_phi ()) - return; - - insn_info *def_insn = extract_single_source (m_avl.get_source ()); - if (!def_insn || !vsetvl_insn_p (def_insn->rtl ())) - return; - - vector_insn_info new_info; - new_info.parse_insn (def_insn); - if (!same_vlmax_p (new_info) && !scalar_move_insn_p (insn->rtl ())) - return; - - if (new_info.has_avl ()) - { - if (new_info.has_avl_imm ()) - set_avl_info (avl_info (new_info.get_avl (), nullptr)); - else - { - if (vlmax_avl_p (new_info.get_avl ())) - set_avl_info (avl_info (new_info.get_avl (), get_avl_source ())); - else - { - /* Conservatively propagate non-VLMAX AVL of user vsetvl: - 1. The user vsetvl should be same block with the rvv insn. - 2. The user vsetvl is the only def insn of rvv insn. - 3. The AVL is not modified between def-use chain. - 4. The VL is only used by insn within EBB. - */ - bool modified_p = false; - for (insn_info *i = def_insn->next_nondebug_insn (); - real_insn_and_same_bb_p (i, get_insn ()->bb ()); - i = i->next_nondebug_insn ()) - { - /* Consider this following sequence: - - insn 1: vsetvli a5,a3,e8,mf4,ta,mu - insn 2: vsetvli zero,a5,e32,m1,ta,ma - ... - vle32.v v1,0(a1) - vsetvli a2,zero,e32,m1,ta,ma - vadd.vv v1,v1,v1 - vsetvli zero,a5,e32,m1,ta,ma - vse32.v v1,0(a0) - ... - insn 3: sub a3,a3,a5 - ... - - We can local AVL propagate "a3" from insn 1 to insn 2 - if no insns between insn 1 and insn 2 modify "a3 even - though insn 3 modifies "a3". - Otherwise, we can't perform local AVL propagation. - - Early break if we reach the insn 2. 
*/ - if (!before_p (i, insn)) - break; - if (find_access (i->defs (), REGNO (new_info.get_avl ()))) - { - modified_p = true; - break; - } - } - - bool has_live_out_use = false; - for (use_info *use : m_avl.get_source ()->all_uses ()) - { - if (use->is_live_out_use ()) - { - has_live_out_use = true; - break; - } - } - if (!modified_p && !has_live_out_use - && def_insn == m_avl.get_source ()->insn () - && m_insn->bb () == def_insn->bb ()) - set_avl_info (new_info.get_avl_info ()); - } - } - } - - if (scalar_move_insn_p (insn->rtl ()) && m_avl.has_non_zero_avl ()) - m_demands[DEMAND_NONZERO_AVL] = true; -} - -bool -vector_insn_info::compatible_p (const vector_insn_info &other) const -{ - gcc_assert (valid_or_dirty_p () && other.valid_or_dirty_p () - && "Can't compare invalid demanded infos"); - - for (const auto &cond : incompatible_conds) - if (cond.dual_incompatible_p (*this, other)) - return false; - return true; -} - -bool -vector_insn_info::skip_avl_compatible_p (const vector_insn_info &other) const -{ - gcc_assert (valid_or_dirty_p () && other.valid_or_dirty_p () - && "Can't compare invalid demanded infos"); - unsigned array_size = sizeof (incompatible_conds) / sizeof (demands_cond); - /* Bypass AVL incompatible cases. */ - for (unsigned i = 1; i < array_size; i++) - if (incompatible_conds[i].dual_incompatible_p (*this, other)) - return false; - return true; -} - -bool -vector_insn_info::compatible_avl_p (const vl_vtype_info &other) const -{ - gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); - gcc_assert (!unknown_p () && "Can't compare AVL in unknown state"); - if (!demand_p (DEMAND_AVL)) - return true; - if (demand_p (DEMAND_NONZERO_AVL) && other.has_non_zero_avl ()) - return true; - return get_avl_info () == other.get_avl_info (); -} - -bool -vector_insn_info::compatible_avl_p (const avl_info &other) const -{ - gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); - gcc_assert (!unknown_p () && "Can't compare AVL in unknown state"); - gcc_assert (demand_p (DEMAND_AVL) && "Can't compare AVL undemand state"); - if (!demand_p (DEMAND_AVL)) - return true; - if (demand_p (DEMAND_NONZERO_AVL) && other.has_non_zero_avl ()) + gcc_assert (curr_prob.initialized_p ()); + FOR_EACH_EDGE (e, ei, cfg_bb->succs) + { + auto &new_prob = get_block_info (e->dest).probability; + /* Normally, the edge probability should be initialized. + However, some special testing code which is written in + GIMPLE IR style force the edge probility uninitialized, + we conservatively set it as never so that it will not + affect PRE (Phase 3 && Phse 4). 
*/ + if (!e->probability.initialized_p ()) + new_prob = profile_probability::never (); + else if (!new_prob.initialized_p ()) + new_prob = curr_prob * e->probability; + else if (new_prob == profile_probability::always ()) + continue; + else + new_prob += curr_prob * e->probability; + } + } + } + + void insert_vsetvl_insn (enum emit_type emit_type, const vsetvl_info &info) + { + rtx pat = info.get_vsetvl_pat (); + rtx_insn *rinsn = info.get_insn ()->rtl (); + + if (emit_type == EMIT_DIRECT) + { + emit_insn (pat); + if (dump_file) + { + fprintf (dump_file, " Insert vsetvl insn %d:\n", + INSN_UID (get_last_insn ())); + print_rtl_single (dump_file, get_last_insn ()); + } + } + else if (emit_type == EMIT_BEFORE) + { + emit_insn_before (pat, rinsn); + if (dump_file) + { + fprintf (dump_file, " Insert vsetvl insn before insn %d:\n", + INSN_UID (rinsn)); + print_rtl_single (dump_file, PREV_INSN (rinsn)); + } + } + else + { + emit_insn_after (pat, rinsn); + if (dump_file) + { + fprintf (dump_file, " Insert vsetvl insn after insn %d:\n", + INSN_UID (rinsn)); + print_rtl_single (dump_file, NEXT_INSN (rinsn)); + } + } + } + + void change_vsetvl_insn (const vsetvl_info &info) + { + rtx_insn *rinsn = info.get_insn ()->rtl (); + rtx new_pat = info.get_vsetvl_pat (); + + if (dump_file) + { + fprintf (dump_file, " Change insn %d from:\n", INSN_UID (rinsn)); + print_rtl_single (dump_file, rinsn); + } + + validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false); + + if (dump_file) + { + fprintf (dump_file, "\n to:\n"); + print_rtl_single (dump_file, rinsn); + } + } + + void remove_vsetvl_insn (const vsetvl_info &info) + { + rtx_insn *rinsn = info.get_insn ()->rtl (); + if (dump_file) + { + fprintf (dump_file, " Eliminate insn %d:\n", INSN_UID (rinsn)); + print_rtl_single (dump_file, rinsn); + } + if (in_sequence_p ()) + remove_insn (rinsn); + else + delete_insn (rinsn); + } + + bool successors_probability_equal_p (const basic_block cfg_bb) const + { + edge e; + edge_iterator ei; + profile_probability prob = profile_probability::uninitialized (); + FOR_EACH_EDGE (e, ei, cfg_bb->succs) + { + if (prob == profile_probability::uninitialized ()) + prob = m_vector_block_infos[e->dest->index].probability; + else if (prob == m_vector_block_infos[e->dest->index].probability) + continue; + else + /* We pick the highest probability among those incompatible VSETVL + infos. When all incompatible VSTEVL infos have same probability, we + don't pick any of them. */ + return false; + } return true; - return get_avl_info () == other; -} - -bool -vector_insn_info::compatible_vtype_p (const vl_vtype_info &other) const -{ - gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); - gcc_assert (!unknown_p () && "Can't compare VTYPE in unknown state"); - if (demand_p (DEMAND_SEW)) - { - if (!demand_p (DEMAND_GE_SEW) && m_sew != other.get_sew ()) - return false; - if (demand_p (DEMAND_GE_SEW) && m_sew > other.get_sew ()) - return false; - } - if (demand_p (DEMAND_LMUL) && m_vlmul != other.get_vlmul ()) - return false; - if (demand_p (DEMAND_RATIO) && m_ratio != other.get_ratio ()) - return false; - if (demand_p (DEMAND_TAIL_POLICY) && m_ta != other.get_ta ()) - return false; - if (demand_p (DEMAND_MASK_POLICY) && m_ma != other.get_ma ()) - return false; - return true; -} - -/* Determine whether the vector instructions requirements represented by - Require are compatible with the previous vsetvli instruction represented - by this. INSN is the instruction whose requirements we're considering. 
*/ -bool -vector_insn_info::compatible_p (const vl_vtype_info &curr_info) const -{ - gcc_assert (!uninit_p () && "Can't handle uninitialized info"); - if (empty_p ()) - return false; - - /* Nothing is compatible with Unknown. */ - if (unknown_p ()) - return false; - - /* If the instruction doesn't need an AVLReg and the SEW matches, consider - it compatible. */ - if (!demand_p (DEMAND_AVL)) - if (m_sew == curr_info.get_sew ()) - return true; - - return compatible_avl_p (curr_info) && compatible_vtype_p (curr_info); -} - -bool -vector_insn_info::available_p (const vector_insn_info &other) const -{ - return *this >= other; -} - -void -vector_insn_info::fuse_avl (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - set_insn (info1.get_insn ()); - if (info1.demand_p (DEMAND_AVL)) - { - if (info1.demand_p (DEMAND_NONZERO_AVL)) - { - if (info2.demand_p (DEMAND_AVL) - && !info2.demand_p (DEMAND_NONZERO_AVL)) - { - set_avl_info (info2.get_avl_info ()); - set_demand (DEMAND_AVL, true); - set_demand (DEMAND_NONZERO_AVL, false); - return; - } - } - set_avl_info (info1.get_avl_info ()); - set_demand (DEMAND_NONZERO_AVL, info1.demand_p (DEMAND_NONZERO_AVL)); - } - else - { - set_avl_info (info2.get_avl_info ()); - set_demand (DEMAND_NONZERO_AVL, info2.demand_p (DEMAND_NONZERO_AVL)); - } - set_demand (DEMAND_AVL, - info1.demand_p (DEMAND_AVL) || info2.demand_p (DEMAND_AVL)); -} - -void -vector_insn_info::fuse_sew_lmul (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - /* We need to fuse sew && lmul according to demand info: - - 1. GE_SEW. - 2. SEW. - 3. LMUL. - 4. RATIO. */ - if (same_sew_lmul_demand_p (info1.get_demands (), info2.get_demands ())) - { - set_demand (DEMAND_SEW, info2.demand_p (DEMAND_SEW)); - set_demand (DEMAND_LMUL, info2.demand_p (DEMAND_LMUL)); - set_demand (DEMAND_RATIO, info2.demand_p (DEMAND_RATIO)); - set_demand (DEMAND_GE_SEW, info2.demand_p (DEMAND_GE_SEW)); - set_sew (info2.get_sew ()); - set_vlmul (info2.get_vlmul ()); - set_ratio (info2.get_ratio ()); - return; - } - for (const auto &rule : fuse_rules) - { - if (rule.pair.match_cond_p (info1.get_demands (), info2.get_demands ())) - { - set_demand (DEMAND_SEW, rule.demand_sew_p); - set_demand (DEMAND_LMUL, rule.demand_lmul_p); - set_demand (DEMAND_RATIO, rule.demand_ratio_p); - set_demand (DEMAND_GE_SEW, rule.demand_ge_sew_p); - set_sew (rule.new_sew (info1, info2)); - set_vlmul (rule.new_vlmul (info1, info2)); - set_ratio (rule.new_ratio (info1, info2)); - return; - } - if (rule.pair.match_cond_p (info2.get_demands (), info1.get_demands ())) - { - set_demand (DEMAND_SEW, rule.demand_sew_p); - set_demand (DEMAND_LMUL, rule.demand_lmul_p); - set_demand (DEMAND_RATIO, rule.demand_ratio_p); - set_demand (DEMAND_GE_SEW, rule.demand_ge_sew_p); - set_sew (rule.new_sew (info2, info1)); - set_vlmul (rule.new_vlmul (info2, info1)); - set_ratio (rule.new_ratio (info2, info1)); - return; - } - } - gcc_unreachable (); -} - -void -vector_insn_info::fuse_tail_policy (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info1.demand_p (DEMAND_TAIL_POLICY)) - { - set_ta (info1.get_ta ()); - demand (DEMAND_TAIL_POLICY); - } - else if (info2.demand_p (DEMAND_TAIL_POLICY)) - { - set_ta (info2.get_ta ()); - demand (DEMAND_TAIL_POLICY); - } - else - set_ta (get_default_ta ()); -} - -void -vector_insn_info::fuse_mask_policy (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info1.demand_p (DEMAND_MASK_POLICY)) - { - set_ma (info1.get_ma ()); - demand (DEMAND_MASK_POLICY); 
- } - else if (info2.demand_p (DEMAND_MASK_POLICY)) - { - set_ma (info2.get_ma ()); - demand (DEMAND_MASK_POLICY); - } - else - set_ma (get_default_ma ()); -} - -vector_insn_info -vector_insn_info::local_merge (const vector_insn_info &merge_info) const -{ - if (!vsetvl_insn_p (get_insn ()->rtl ()) && *this != merge_info) - gcc_assert (this->compatible_p (merge_info) - && "Can't merge incompatible demanded infos"); - - vector_insn_info new_info; - new_info.set_valid (); - /* For local backward data flow, we always update INSN && AVL as the - latest INSN and AVL so that we can keep track status of each INSN. */ - new_info.fuse_avl (merge_info, *this); - new_info.fuse_sew_lmul (*this, merge_info); - new_info.fuse_tail_policy (*this, merge_info); - new_info.fuse_mask_policy (*this, merge_info); - return new_info; -} - -vector_insn_info -vector_insn_info::global_merge (const vector_insn_info &merge_info, - unsigned int bb_index) const -{ - if (!vsetvl_insn_p (get_insn ()->rtl ()) && *this != merge_info) - gcc_assert (this->compatible_p (merge_info) - && "Can't merge incompatible demanded infos"); - - vector_insn_info new_info; - new_info.set_valid (); - - /* For global data flow, we should keep original INSN and AVL if they - valid since we should keep the life information of each block. - - For example: - bb 0 -> bb 1. - We should keep INSN && AVL of bb 1 since we will eventually emit - vsetvl instruction according to INSN and AVL of bb 1. */ - new_info.fuse_avl (*this, merge_info); - /* Recompute the AVL source whose block index is equal to BB_INDEX. */ - if (new_info.get_avl_source () - && new_info.get_avl_source ()->insn ()->is_phi () - && new_info.get_avl_source ()->bb ()->index () != bb_index) - { - hash_set sets - = get_all_sets (new_info.get_avl_source (), true, true, true); - new_info.set_avl_source (nullptr); - bool can_find_set_p = false; - set_info *first_set = nullptr; - for (set_info *set : sets) - { - if (!first_set) - first_set = set; - if (set->bb ()->index () == bb_index) - { - gcc_assert (!can_find_set_p); - new_info.set_avl_source (set); - can_find_set_p = true; - } - } - if (!can_find_set_p && sets.elements () == 1 - && first_set->insn ()->is_real ()) - new_info.set_avl_source (first_set); - } - - /* Make sure VLMAX AVL always has a set_info the get VL. */ - if (vlmax_avl_p (new_info.get_avl ())) - { - if (this->get_avl_source ()) - new_info.set_avl_source (this->get_avl_source ()); - else - { - gcc_assert (merge_info.get_avl_source ()); - new_info.set_avl_source (merge_info.get_avl_source ()); - } - } - - new_info.fuse_sew_lmul (*this, merge_info); - new_info.fuse_tail_policy (*this, merge_info); - new_info.fuse_mask_policy (*this, merge_info); - return new_info; -} - -/* Wrapper helps to return the AVL or VL operand for the - vector_insn_info. Return AVL if the AVL is not VLMAX. - Otherwise, return the VL operand. */ -rtx -vector_insn_info::get_avl_or_vl_reg (void) const -{ - gcc_assert (has_avl_reg ()); - if (!vlmax_avl_p (get_avl ())) - return get_avl (); - - rtx_insn *rinsn = get_insn ()->rtl (); - if (has_vl_op (rinsn) || vsetvl_insn_p (rinsn)) - { - rtx vl = ::get_vl (rinsn); - /* For VLMAX, we should make sure we get the - REG to emit 'vsetvl VL,zero' since the 'VL' - should be the REG according to RVV ISA. */ - if (REG_P (vl)) - return vl; - } - - /* We always has avl_source if it is VLMAX AVL. 
*/ - gcc_assert (get_avl_source ()); - return get_avl_reg_rtx (); -} - -bool -vector_insn_info::update_fault_first_load_avl (insn_info *insn) -{ - // Update AVL to vl-output of the fault first load. - const insn_info *read_vl = get_forward_read_vl_insn (insn); - if (read_vl) - { - rtx vl = SET_DEST (PATTERN (read_vl->rtl ())); - def_info *def = find_access (read_vl->defs (), REGNO (vl)); - set_info *set = safe_dyn_cast (def); - set_avl_info (avl_info (vl, set)); - set_insn (insn); - return true; - } - return false; -} - -static const char * -vlmul_to_str (vlmul_type vlmul) -{ - switch (vlmul) - { - case LMUL_1: - return "m1"; - case LMUL_2: - return "m2"; - case LMUL_4: - return "m4"; - case LMUL_8: - return "m8"; - case LMUL_RESERVED: - return "INVALID LMUL"; - case LMUL_F8: - return "mf8"; - case LMUL_F4: - return "mf4"; - case LMUL_F2: - return "mf2"; - - default: - gcc_unreachable (); - } -} - -static const char * -policy_to_str (bool agnostic_p) -{ - return agnostic_p ? "agnostic" : "undisturbed"; -} - -void -vector_insn_info::dump (FILE *file) const -{ - fprintf (file, "["); - if (uninit_p ()) - fprintf (file, "UNINITIALIZED,"); - else if (valid_p ()) - fprintf (file, "VALID,"); - else if (unknown_p ()) - fprintf (file, "UNKNOWN,"); - else if (empty_p ()) - fprintf (file, "EMPTY,"); - else - fprintf (file, "DIRTY,"); - - fprintf (file, "Demand field={%d(VL),", demand_p (DEMAND_AVL)); - fprintf (file, "%d(DEMAND_NONZERO_AVL),", demand_p (DEMAND_NONZERO_AVL)); - fprintf (file, "%d(SEW),", demand_p (DEMAND_SEW)); - fprintf (file, "%d(DEMAND_GE_SEW),", demand_p (DEMAND_GE_SEW)); - fprintf (file, "%d(LMUL),", demand_p (DEMAND_LMUL)); - fprintf (file, "%d(RATIO),", demand_p (DEMAND_RATIO)); - fprintf (file, "%d(TAIL_POLICY),", demand_p (DEMAND_TAIL_POLICY)); - fprintf (file, "%d(MASK_POLICY)}\n", demand_p (DEMAND_MASK_POLICY)); - - fprintf (file, "AVL="); - print_rtl_single (file, get_avl ()); - fprintf (file, "SEW=%d,", get_sew ()); - fprintf (file, "VLMUL=%s,", vlmul_to_str (get_vlmul ())); - fprintf (file, "RATIO=%d,", get_ratio ()); - fprintf (file, "TAIL_POLICY=%s,", policy_to_str (get_ta ())); - fprintf (file, "MASK_POLICY=%s", policy_to_str (get_ma ())); - fprintf (file, "]\n"); - - if (valid_p ()) - { - if (get_insn ()) - { - fprintf (file, "The real INSN="); - print_rtl_single (file, get_insn ()->rtl ()); - } - } -} - -vector_infos_manager::vector_infos_manager () -{ - vector_edge_list = nullptr; - vector_kill = nullptr; - vector_del = nullptr; - vector_insert = nullptr; - vector_antic = nullptr; - vector_transp = nullptr; - vector_comp = nullptr; - vector_avin = nullptr; - vector_avout = nullptr; - vector_antin = nullptr; - vector_antout = nullptr; - vector_earliest = nullptr; - vector_insn_infos.safe_grow_cleared (get_max_uid ()); - vector_block_infos.safe_grow_cleared (last_basic_block_for_fn (cfun)); - if (!optimize) - { - basic_block cfg_bb; - rtx_insn *rinsn; - FOR_ALL_BB_FN (cfg_bb, cfun) - { - vector_block_infos[cfg_bb->index].local_dem = vector_insn_info (); - vector_block_infos[cfg_bb->index].reaching_out = vector_insn_info (); - FOR_BB_INSNS (cfg_bb, rinsn) - vector_insn_infos[INSN_UID (rinsn)].parse_insn (rinsn); - } - } - else - { - for (const bb_info *bb : crtl->ssa->bbs ()) - { - vector_block_infos[bb->index ()].local_dem = vector_insn_info (); - vector_block_infos[bb->index ()].reaching_out = vector_insn_info (); - for (insn_info *insn : bb->real_insns ()) - vector_insn_infos[insn->uid ()].parse_insn (insn); - vector_block_infos[bb->index ()].probability = 
profile_probability (); - } - } -} - -void -vector_infos_manager::create_expr (vector_insn_info &info) -{ - for (size_t i = 0; i < vector_exprs.length (); i++) - if (*vector_exprs[i] == info) - return; - vector_exprs.safe_push (&info); -} - -size_t -vector_infos_manager::get_expr_id (const vector_insn_info &info) const -{ - for (size_t i = 0; i < vector_exprs.length (); i++) - if (*vector_exprs[i] == info) - return i; - gcc_unreachable (); -} - -auto_vec -vector_infos_manager::get_all_available_exprs ( - const vector_insn_info &info) const -{ - auto_vec available_list; - for (size_t i = 0; i < vector_exprs.length (); i++) - if (info.available_p (*vector_exprs[i])) - available_list.safe_push (i); - return available_list; -} - -bool -vector_infos_manager::all_same_ratio_p (sbitmap bitdata) const -{ - if (bitmap_empty_p (bitdata)) - return false; - - int ratio = -1; - unsigned int bb_index; - sbitmap_iterator sbi; - - EXECUTE_IF_SET_IN_BITMAP (bitdata, 0, bb_index, sbi) - { - if (ratio == -1) - ratio = vector_exprs[bb_index]->get_ratio (); - else if (vector_exprs[bb_index]->get_ratio () != ratio) - return false; - } - return true; -} - -/* Return TRUE if the incoming vector configuration state - to CFG_BB is compatible with the vector configuration - state in CFG_BB, FALSE otherwise. */ -bool -vector_infos_manager::all_avail_in_compatible_p (const basic_block cfg_bb) const -{ - const auto &info = vector_block_infos[cfg_bb->index].local_dem; - sbitmap avin = vector_avin[cfg_bb->index]; - unsigned int bb_index; - sbitmap_iterator sbi; - EXECUTE_IF_SET_IN_BITMAP (avin, 0, bb_index, sbi) - { - const auto &avin_info - = static_cast (*vector_exprs[bb_index]); - if (!info.compatible_p (avin_info)) - return false; - } - return true; -} - -bool -vector_infos_manager::all_same_avl_p (const basic_block cfg_bb, - sbitmap bitdata) const -{ - if (bitmap_empty_p (bitdata)) - return false; + } + + bool preds_has_same_avl_p (const vsetvl_info &curr_info) + { + gcc_assert ( + !bitmap_empty_p (m_vsetvl_def_in[curr_info.get_bb ()->index ()])); + + unsigned expr_index; + sbitmap_iterator sbi; + EXECUTE_IF_SET_IN_BITMAP (m_vsetvl_def_in[curr_info.get_bb ()->index ()], 0, + expr_index, sbi) + { + const vsetvl_info &prev_info = *m_vsetvl_def_exprs[expr_index]; + if (!prev_info.valid_p () + || !m_dem.avl_available_p (prev_info, curr_info)) + return false; + } - const auto &block_info = vector_block_infos[cfg_bb->index]; - if (!block_info.local_dem.demand_p (DEMAND_AVL)) return true; + } - avl_info avl = block_info.local_dem.get_avl_info (); - unsigned int bb_index; - sbitmap_iterator sbi; - - EXECUTE_IF_SET_IN_BITMAP (bitdata, 0, bb_index, sbi) - { - if (vector_exprs[bb_index]->get_avl_info () != avl) - return false; - } - return true; -} - -bool -vector_infos_manager::earliest_fusion_worthwhile_p ( - const basic_block cfg_bb) const -{ - edge e; - edge_iterator ei; - profile_probability prob = profile_probability::uninitialized (); - FOR_EACH_EDGE (e, ei, cfg_bb->succs) - { - if (prob == profile_probability::uninitialized ()) - prob = vector_block_infos[e->dest->index].probability; - else if (prob == vector_block_infos[e->dest->index].probability) - continue; - else - /* We pick the highest probability among those incompatible VSETVL - infos. When all incompatible VSTEVL infos have same probability, we - don't pick any of them. 
*/ - return true; - } - return false; -} - -bool -vector_infos_manager::vsetvl_dominated_by_all_preds_p ( - const basic_block cfg_bb, const vector_insn_info &info) const -{ - edge e; - edge_iterator ei; - FOR_EACH_EDGE (e, ei, cfg_bb->preds) - { - const auto &reaching_out = vector_block_infos[e->src->index].reaching_out; - if (e->src->index == cfg_bb->index && reaching_out.compatible_p (info)) - continue; - if (!vsetvl_dominated_by_p (e->src, info, reaching_out, false)) - return false; - } - return true; -} - -size_t -vector_infos_manager::expr_set_num (sbitmap bitdata) const -{ - size_t count = 0; - for (size_t i = 0; i < vector_exprs.length (); i++) - if (bitmap_bit_p (bitdata, i)) - count++; - return count; -} - -void -vector_infos_manager::release (void) -{ - if (!vector_insn_infos.is_empty ()) - vector_insn_infos.release (); - if (!vector_block_infos.is_empty ()) - vector_block_infos.release (); - if (!vector_exprs.is_empty ()) - vector_exprs.release (); - - gcc_assert (to_refine_vsetvls.is_empty ()); - gcc_assert (to_delete_vsetvls.is_empty ()); - if (optimize > 0) - free_bitmap_vectors (); -} - -void -vector_infos_manager::create_bitmap_vectors (void) -{ - /* Create the bitmap vectors. */ - vector_antic = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_comp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_avin = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_avout = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_kill = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_antin = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_antout = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - - bitmap_vector_ones (vector_transp, last_basic_block_for_fn (cfun)); - bitmap_vector_clear (vector_antic, last_basic_block_for_fn (cfun)); - bitmap_vector_clear (vector_comp, last_basic_block_for_fn (cfun)); - vector_edge_list = create_edge_list (); - vector_earliest = sbitmap_vector_alloc (NUM_EDGES (vector_edge_list), - vector_exprs.length ()); -} - -void -vector_infos_manager::free_bitmap_vectors (void) -{ - /* Finished. Free up all the things we've allocated. 
     */
-  free_edge_list (vector_edge_list);
-  if (vector_del)
-    sbitmap_vector_free (vector_del);
-  if (vector_insert)
-    sbitmap_vector_free (vector_insert);
-  if (vector_kill)
-    sbitmap_vector_free (vector_kill);
-  if (vector_antic)
-    sbitmap_vector_free (vector_antic);
-  if (vector_transp)
-    sbitmap_vector_free (vector_transp);
-  if (vector_comp)
-    sbitmap_vector_free (vector_comp);
-  if (vector_avin)
-    sbitmap_vector_free (vector_avin);
-  if (vector_avout)
-    sbitmap_vector_free (vector_avout);
-  if (vector_antin)
-    sbitmap_vector_free (vector_antin);
-  if (vector_antout)
-    sbitmap_vector_free (vector_antout);
-  if (vector_earliest)
-    sbitmap_vector_free (vector_earliest);
-
-  vector_edge_list = nullptr;
-  vector_kill = nullptr;
-  vector_del = nullptr;
-  vector_insert = nullptr;
-  vector_antic = nullptr;
-  vector_transp = nullptr;
-  vector_comp = nullptr;
-  vector_avin = nullptr;
-  vector_avout = nullptr;
-  vector_antin = nullptr;
-  vector_antout = nullptr;
-  vector_earliest = nullptr;
-}
-
-void
-vector_infos_manager::dump (FILE *file) const
-{
-  basic_block cfg_bb;
-  rtx_insn *rinsn;
-
-  fprintf (file, "\n");
-  FOR_ALL_BB_FN (cfg_bb, cfun)
-    {
-      fprintf (file, "Local vector info of <bb %d>:\n", cfg_bb->index);
-      fprintf (file, "<HEADER>=");
-      vector_block_infos[cfg_bb->index].local_dem.dump (file);
-      FOR_BB_INSNS (cfg_bb, rinsn)
-        {
-          if (!NONDEBUG_INSN_P (rinsn) || !has_vtype_op (rinsn))
-            continue;
-          fprintf (file, "<insn %d>=", INSN_UID (rinsn));
-          const auto &info = vector_insn_infos[INSN_UID (rinsn)];
-          info.dump (file);
-        }
-      fprintf (file, "