From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH V4] RISC-V: Support Dynamic LMUL Cost model
Date: Tue, 12 Sep 2023 14:49:32 +0800
Message-Id: <20230912064932.647337-1-juzhe.zhong@rivai.ai>

This patch supports dynamic LMUL cost modeling with
--param=riscv-autovec-lmul=dynamic.

Consider the following case:

#include <stdint.h>

void
foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c,
     int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
     int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
     int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
     int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
     int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3,
     int32_t *__restrict d4, int32_t *__restrict d5, int n)
{
  for (int i = 0; i < n; i++)
    {
      a[i] = b[i] + c[i];
      b5[i] = b[i] + c[i];
      a2[i] = b2[i] + c2[i];
      a3[i] = b3[i] + c3[i];
      a4[i] = b4[i] + c4[i];
      a5[i] = a[i] + a4[i];
      d2[i] = a2[i] + c2[i];
      d3[i] = a3[i] + c3[i];
      d4[i] = a4[i] + c4[i];
      d5[i] = a[i] + a4[i];
      a[i] = a5[i] + b5[i] + a[i];
      c2[i] = a[i] + c[i];
      c3[i] = b5[i] * a5[i];
      c4[i] = a2[i] * a3[i];
      c5[i] = b5[i] * a2[i];
      c[i] = a[i] + c3[i];
      c2[i] = a[i] + c4[i];
      a5[i] = a[i] + a4[i];
      a[i] = a[i] + b5[i]
             + a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i]
                 * c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i];
    }
}

Demo: https://godbolt.org/z/x1acoMxGT

You can see that this produces register spilling if you specify LMUL >= 4.
Now, with --param=riscv-autovec-lmul=dynamic, GCC is able to pick LMUL = 2
to optimize this case.

This feature is implemented with a linear-scan-based local live range
analysis that computes the maximum number of live V_REGs at each program
point of the function in order to determine the VF/LMUL.

Note that this patch handles both SLP and non-SLP loops.

The current approach does not consider the later instruction scheduler,
which may reduce the register pressure.  In that case we conservatively
apply a smaller VF/LMUL.  (I am not sure whether we should support live
range shrinking for such corner cases, since we don't know whether it would
improve performance much.)

gcc/ChangeLog:

	* config/riscv/riscv-vector-costs.cc (get_last_live_range): New function.
	(compute_nregs_for_mode): Ditto.
	(live_range_conflict_p): Ditto.
	(max_number_of_live_regs): Ditto.
	(compute_lmul): Ditto.
	(costs::preferred_new_lmul_p): Ditto.
	(costs::better_main_loop_than_p): Ditto.
	* config/riscv/riscv-vector-costs.h (struct stmt_point): New struct.
	(struct var_live_range): Ditto.
	(struct autovec_info): Ditto.
	* config/riscv/t-riscv: Update makefile for COST model.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-mixed-1.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-1.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-2.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-3.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-4.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-5.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-6.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-1.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-2.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-3.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-5.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-6.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-1.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-2.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-3.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-6.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-7.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-8.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-1.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-10.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-3.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-4.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-5.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-6.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-7.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-8.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/rvv-costmodel-vect.exp: New test.

---
 gcc/config/riscv/riscv-vector-costs.cc        | 504 ++++++++++++++++++
 gcc/config/riscv/riscv-vector-costs.h         |  21 +
 gcc/config/riscv/t-riscv                      |   3 +-
 .../riscv/rvv/dynamic-lmul-mixed-1.c          |  50 ++
 .../costmodel/riscv/rvv/dynamic-lmul1-1.c     |  91 ++++
 .../costmodel/riscv/rvv/dynamic-lmul1-2.c     |  63 +++
 .../costmodel/riscv/rvv/dynamic-lmul1-3.c     |  91 ++++
 .../costmodel/riscv/rvv/dynamic-lmul1-4.c     | 121 +++++
 .../costmodel/riscv/rvv/dynamic-lmul1-5.c     | 149 ++++++
 .../costmodel/riscv/rvv/dynamic-lmul1-6.c     | 150 ++++++
 .../costmodel/riscv/rvv/dynamic-lmul1-7.c     |  48 ++
 .../costmodel/riscv/rvv/dynamic-lmul2-1.c     |  51 ++
 .../costmodel/riscv/rvv/dynamic-lmul2-2.c     |  51 ++
 .../costmodel/riscv/rvv/dynamic-lmul2-3.c     |  51 ++
 .../costmodel/riscv/rvv/dynamic-lmul2-4.c     |  49 ++
 .../costmodel/riscv/rvv/dynamic-lmul2-5.c     |  52 ++
 .../costmodel/riscv/rvv/dynamic-lmul2-6.c     |  54 ++
 .../costmodel/riscv/rvv/dynamic-lmul4-1.c     |  35 ++
 .../costmodel/riscv/rvv/dynamic-lmul4-2.c     |  35 ++
 .../costmodel/riscv/rvv/dynamic-lmul4-3.c     |  47 ++
 .../costmodel/riscv/rvv/dynamic-lmul4-4.c     |  47 ++
 .../costmodel/riscv/rvv/dynamic-lmul4-5.c     |  47 ++
 .../costmodel/riscv/rvv/dynamic-lmul4-6.c     |  27 +
 .../costmodel/riscv/rvv/dynamic-lmul4-7.c     |  47 ++
 .../costmodel/riscv/rvv/dynamic-lmul4-8.c     |  36 ++
 .../costmodel/riscv/rvv/dynamic-lmul8-1.c     |  18 +
 .../costmodel/riscv/rvv/dynamic-lmul8-10.c    |  22 +
 .../costmodel/riscv/rvv/dynamic-lmul8-2.c     |  18 +
 .../costmodel/riscv/rvv/dynamic-lmul8-3.c     |  18 +
 .../costmodel/riscv/rvv/dynamic-lmul8-4.c     |  19 +
 .../costmodel/riscv/rvv/dynamic-lmul8-5.c     |  25 +
 .../costmodel/riscv/rvv/dynamic-lmul8-6.c     |  23 +
 .../costmodel/riscv/rvv/dynamic-lmul8-7.c     |  23 +
 .../costmodel/riscv/rvv/dynamic-lmul8-8.c     |  19 +
 .../costmodel/riscv/rvv/dynamic-lmul8-9.c     |  19 +
 .../riscv/rvv/rvv-costmodel-vect.exp          |  52 ++
 36 files changed, 2175 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-mixed-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-2.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-3.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-4.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-5.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-6.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-2.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-3.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-5.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-6.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-2.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-3.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-6.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-7.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-8.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-10.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-3.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-4.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-5.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-6.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-7.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-8.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/rvv-costmodel-vect.exp

diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index 1a5e13d5eb3..1e82dab1bc1 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -36,16 +36,520 @@ along with GCC; see the file COPYING3. If not see
 #include "fold-const.h"
 #include "tm_p.h"
 #include "tree-vectorizer.h"
+#include "gimple-iterator.h"
+#include "bitmap.h"
+#include "ssa.h"
+#include "backend.h"
 
 /* This file should be included last.  */
 #include "riscv-vector-costs.h"
 
 namespace riscv_vector {
 
+/* Dynamic LMUL philosophy - Local linear-scan SSA live range based analysis
+   determine LMUL
+
+   - Collect all vectorize STMTs locally for each loop block.
+   - Build program point based graph, ignore non-vectorize STMTs:
+
+       vectorize STMT 0 - point 0
+       scalar STMT 0 - ignore.
+       vectorize STMT 1 - point 1
+       ...
+   - Compute the number of V_REGs live at each program point.
+   - Determine LMUL in VECTOR COST model according to the program point
+     which has the maximum number of live V_REGs.
+
+   Note:
+
+   - BIGGEST_MODE is the biggest LMUL auto-vectorization element mode.
+     It's important for mixed size auto-vectorization (Conversions, ... etc).
+     E.g. For a loop that is vectorizing conversion of INT32 -> INT64.
+     The biggest mode is DImode and LMUL = 8, LMUL = 4 for SImode.
+     We compute the number of live V_REGs at each program point according
+     to this information.
+   - We only compute program points and live ranges locally (within a block)
+     since we just need to compute the number of live V_REGs at each program
+     point and we are not really allocating the registers for each SSA.
+     We can make the variable have another local live range in another block
+     if it is live out of/live into another block.  This approach doesn't
+     affect the accuracy of our live range analysis.
+   - The current analysis doesn't consider any instruction scheduling, which
+     may improve the register pressure.  So we are conservatively doing the
+     analysis, which may end up with a smaller LMUL.
+     TODO: Maybe we could support a reasonable live range shrink algorithm
+     which takes advantage of instruction scheduling.
+   - We may have the following possible autovec mode analyses:
+
+       1. M8 -> M4 -> M2 -> M1 (stop analysis here) -> MF2 -> MF4 -> MF8
+       2. M8 -> M1(M4) -> MF2(M2) -> MF4(M1) (stop analysis here) -> MF8(MF2)
+       3. M1(M8) -> MF2(M4) -> MF4(M2) -> MF8(M1)
+*/
+static hash_map<class loop *, autovec_info> loop_autovec_infos;
+
+/* Collect all STMTs that are vectorized and compute their program points.
+   Note that we don't care about the STMTs that are not vectorized and
+   we only build the local graph (within a block) of program points.
+
+   Loop:
+     bb 2:
+       STMT 1 (be vectorized)      -- point 0
+       STMT 2 (not be vectorized)  -- ignored
+       STMT 3 (be vectorized)      -- point 1
+       STMT 4 (be vectorized)      -- point 2
+       STMT 5 (be vectorized)      -- point 3
+       ...
+     bb 3:
+       STMT 1 (be vectorized)      -- point 0
+       STMT 2 (be vectorized)      -- point 1
+       STMT 3 (not be vectorized)  -- ignored
+       STMT 4 (not be vectorized)  -- ignored
+       STMT 5 (be vectorized)      -- point 2
+       ...
+*/
+static void
+compute_local_program_points (
+  vec_info *vinfo,
+  hash_map<basic_block, vec<stmt_point>> &program_points_per_bb)
+{
+  if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
+    {
+      class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+      basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
+      unsigned int nbbs = loop->num_nodes;
+      gimple_stmt_iterator si;
+      unsigned int i;
+      /* Collect the stmts that are vectorized and mark their program point.  */
+      for (i = 0; i < nbbs; i++)
+        {
+          int point = 0;
+          basic_block bb = bbs[i];
+          vec<stmt_point> program_points = vNULL;
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "Compute local program points for bb %d:\n",
+                             bb->index);
+          for (si = gsi_start_bb (bbs[i]); !gsi_end_p (si); gsi_next (&si))
+            {
+              if (!(is_gimple_assign (gsi_stmt (si))
+                    || is_gimple_call (gsi_stmt (si))))
+                continue;
+              stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
+              if (STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info))
+                  != undef_vec_info_type)
+                {
+                  stmt_point info = {point, gsi_stmt (si)};
+                  program_points.safe_push (info);
+                  point++;
+                  if (dump_enabled_p ())
+                    dump_printf_loc (MSG_NOTE, vect_location,
+                                     "program point %d: %G", info.point,
+                                     gsi_stmt (si));
+                }
+            }
+          program_points_per_bb.put (bb, program_points);
+        }
+    }
+}
+
+/* Compute local live ranges of each vectorized variable.
+   Note that we only compute local live ranges (within a block) since
+   local live ranges information is accurate enough for us to determine
+   the LMUL/vectorization factor of the loop.
+
+   Loop:
+     bb 2:
+       STMT 1               -- point 0
+       STMT 2 (def SSA 1)   -- point 1
+       STMT 3 (use SSA 1)   -- point 2
+       STMT 4               -- point 3
+     bb 3:
+       STMT 1               -- point 0
+       STMT 2               -- point 1
+       STMT 3               -- point 2
+       STMT 4 (use SSA 2)   -- point 3
+
+   The live range of SSA 1 is [1, 3] in bb 2.
+   The live range of SSA 2 is [0, 4] in bb 3.  */
+static machine_mode
+compute_local_live_ranges (
+  const hash_map<basic_block, vec<stmt_point>> &program_points_per_bb,
+  hash_map<basic_block, hash_map<tree, pair>> &live_ranges_per_bb)
+{
+  machine_mode biggest_mode = QImode;
+  if (!program_points_per_bb.is_empty ())
+    {
+      auto_vec<tree> visited_vars;
+      unsigned int i;
+      for (hash_map<basic_block, vec<stmt_point>>::iterator iter
+           = program_points_per_bb.begin ();
+           iter != program_points_per_bb.end (); ++iter)
+        {
+          basic_block bb = (*iter).first;
+          vec<stmt_point> program_points = (*iter).second;
+          bool existed_p = false;
+          hash_map<tree, pair> *live_ranges
+            = &live_ranges_per_bb.get_or_insert (bb, &existed_p);
+          gcc_assert (!existed_p);
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "Compute local live ranges for bb %d:\n",
+                             bb->index);
+          for (const auto program_point : program_points)
+            {
+              unsigned int point = program_point.point;
+              gimple *stmt = program_point.stmt;
+              machine_mode mode = biggest_mode;
+              if (!gimple_store_p (stmt))
+                {
+                  tree lhs = gimple_get_lhs (stmt);
+                  mode = TYPE_MODE (TREE_TYPE (lhs));
+                  bool existed_p = false;
+                  pair &live_range
+                    = live_ranges->get_or_insert (lhs, &existed_p);
+                  gcc_assert (!existed_p);
+                  live_range = pair (point, point);
+                }
+              for (i = 0; i < gimple_num_args (stmt); i++)
+                {
+                  tree var = gimple_arg (stmt, i);
+                  if (is_gimple_reg (var) && !POINTER_TYPE_P (TREE_TYPE (var)))
+                    {
+                      mode = TYPE_MODE (TREE_TYPE (var));
+                      bool existed_p = false;
+                      pair &live_range
+                        = live_ranges->get_or_insert (var, &existed_p);
+                      if (existed_p)
+                        /* We will grow the live range for each use.  */
+                        live_range = pair (live_range.first, point);
+                      else
+                        /* We assume the variable is live from the start of
+                           this block.  */
+                        live_range = pair (0, point);
+                    }
+                }
+              if (GET_MODE_SIZE (mode).to_constant ()
+                  > GET_MODE_SIZE (biggest_mode).to_constant ())
+                biggest_mode = mode;
+            }
+          if (dump_enabled_p ())
+            for (hash_map<tree, pair>::iterator iter = live_ranges->begin ();
+                 iter != live_ranges->end (); ++iter)
+              dump_printf_loc (MSG_NOTE, vect_location,
+                               "%T: type = %T, start = %d, end = %d\n",
+                               (*iter).first, TREE_TYPE ((*iter).first),
+                               (*iter).second.first, (*iter).second.second);
+        }
+    }
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location, "Biggest mode = %s\n",
+                     GET_MODE_NAME (biggest_mode));
+  return biggest_mode;
+}
+
+/* Compute the number of vector registers needed for MODE, given
+   BIGGEST_MODE and LMUL.
+
+   E.g. If mode = SImode, biggest_mode = DImode, LMUL = M4, the size
+        ratio is 2, so the result is 4 / 2 = 2 registers.  */
+static unsigned int
+compute_nregs_for_mode (machine_mode mode, machine_mode biggest_mode, int lmul)
+{
+  unsigned int mode_size = GET_MODE_SIZE (mode).to_constant ();
+  unsigned int biggest_size = GET_MODE_SIZE (biggest_mode).to_constant ();
+  gcc_assert (biggest_size >= mode_size);
+  unsigned int ratio = biggest_size / mode_size;
+  return lmul / ratio;
+}
+
+/* This function helps to determine whether the current LMUL will cause
+   potential vector register (V_REG) spilling according to live range
+   information.
+
+   - First, compute how many variables are live at each program point
+     in each bb of the loop.
+   - Second, compute how many V_REGs are live at each program point
+     in each bb of the loop according to the BIGGEST_MODE and the
+     variable mode.
+   - Third, return the maximum number of live V_REGs in the loop.  */
+static unsigned int
+max_number_of_live_regs (const basic_block bb,
+                         const hash_map<tree, pair> &live_ranges,
+                         unsigned int max_point, machine_mode biggest_mode,
+                         int lmul)
+{
+  unsigned int max_nregs = 0;
+  unsigned int i;
+  unsigned int live_point = 0;
+  auto_vec<unsigned int> live_vars_vec;
+  live_vars_vec.safe_grow (max_point + 1, true);
+  for (i = 0; i < live_vars_vec.length (); ++i)
+    live_vars_vec[i] = 0;
+  for (hash_map<tree, pair>::iterator iter = live_ranges.begin ();
+       iter != live_ranges.end (); ++iter)
+    {
+      tree var = (*iter).first;
+      pair live_range = (*iter).second;
+      for (i = live_range.first; i <= live_range.second; i++)
+        {
+          machine_mode mode = TYPE_MODE (TREE_TYPE (var));
+          unsigned int nregs
+            = compute_nregs_for_mode (mode, biggest_mode, lmul);
+          live_vars_vec[i] += nregs;
+          if (live_vars_vec[i] > max_nregs)
+            max_nregs = live_vars_vec[i];
+        }
+    }
+
+  /* Collect user explicit RVV type.  */
+  auto_vec<basic_block> all_preds
+    = get_all_dominated_blocks (CDI_POST_DOMINATORS, bb);
+  for (i = 0; i < cfun->gimple_df->ssa_names->length (); i++)
+    {
+      tree t = ssa_name (i);
+      if (!t)
+        continue;
+      machine_mode mode = TYPE_MODE (TREE_TYPE (t));
+      if (!lookup_vector_type_attribute (TREE_TYPE (t))
+          && !riscv_v_ext_vls_mode_p (mode))
+        continue;
+
+      gimple *def = SSA_NAME_DEF_STMT (t);
+      if (gimple_bb (def) && !all_preds.contains (gimple_bb (def)))
+        continue;
+      use_operand_p use_p;
+      imm_use_iterator iterator;
+
+      FOR_EACH_IMM_USE_FAST (use_p, iterator, t)
+        {
+          if (!USE_STMT (use_p) || is_gimple_debug (USE_STMT (use_p))
+              || !dominated_by_p (CDI_POST_DOMINATORS, bb,
+                                  gimple_bb (USE_STMT (use_p))))
+            continue;
+
+          int regno_alignment = riscv_get_v_regno_alignment (mode);
+          max_nregs += regno_alignment;
+          if (dump_enabled_p ())
+            dump_printf_loc (
+              MSG_NOTE, vect_location,
+              "Explicit used SSA %T, vectype = %T, mode = %s, cause %d "
+              "V_REG live in bb %d at program point %d\n",
+              t, TREE_TYPE (t), GET_MODE_NAME (mode), regno_alignment,
+              bb->index, live_point);
+          break;
+        }
+    }
+
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location,
+                     "Maximum lmul = %d, %d number of live V_REG at program "
+                     "point %d for bb %d\n",
+                     lmul, max_nregs, live_point, bb->index);
+  return max_nregs;
+}
+
+/* Return the LMUL of the current analysis.  */
+static int
+get_current_lmul (class loop *loop)
+{
+  return loop_autovec_infos.get (loop)->current_lmul;
+}
+
+/* Update the live ranges according to PHI.
+
+   Loop:
+     bb 2:
+       STMT 1               -- point 0
+       STMT 2 (def SSA 1)   -- point 1
+       STMT 3 (use SSA 1)   -- point 2
+       STMT 4               -- point 3
+     bb 3:
+       SSA 2 = PHI<SSA 1>
+       STMT 1               -- point 0
+       STMT 2               -- point 1
+       STMT 3 (use SSA 2)   -- point 2
+       STMT 4               -- point 3
+
+   Before this function, the SSA 1 live range is [2, 3] in bb 2
+   and SSA 2 is [0, 3] in bb 3.
+
+   Then, after this function, we update SSA 1 live range in bb 2
+   into [2, 4] since SSA 1 is live out into bb 3.  */
+static void
+update_local_live_ranges (
+  vec_info *vinfo,
+  hash_map<basic_block, vec<stmt_point>> &program_points_per_bb,
+  hash_map<basic_block, hash_map<tree, pair>> &live_ranges_per_bb)
+{
+  if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
+    {
+      class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+      basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
+      unsigned int nbbs = loop->num_nodes;
+      unsigned int i, j;
+      gphi_iterator psi;
+      for (i = 0; i < nbbs; i++)
+        {
+          basic_block bb = bbs[i];
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "Update local program points for bb %d:\n",
+                             bb->index);
+          for (psi = gsi_start_phis (bbs[i]); !gsi_end_p (psi); gsi_next (&psi))
+            {
+              gphi *phi = psi.phi ();
+              stmt_vec_info stmt_info = vinfo->lookup_stmt (phi);
+              if (STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info))
+                  != undef_vec_info_type)
+                {
+                  for (j = 0; j < gimple_phi_num_args (phi); j++)
+                    {
+                      edge e = gimple_phi_arg_edge (phi, j);
+                      tree def = gimple_phi_arg_def (phi, j);
+                      auto *live_ranges = live_ranges_per_bb.get (e->src);
+                      if (!program_points_per_bb.get (e->src))
+                        continue;
+                      unsigned int max_point
+                        = (*program_points_per_bb.get (e->src)).length () - 1;
+                      auto *live_range = live_ranges->get (def);
+                      if (live_range)
+                        {
+                          unsigned int end = (*live_range).second;
+                          (*live_range).second = max_point;
+                          if (dump_enabled_p ())
+                            dump_printf_loc (
+                              MSG_NOTE, vect_location,
+                              "Update %T end point from %d to %d:\n", def, end,
+                              (*live_range).second);
+                        }
+                    }
+                }
+            }
+        }
+    }
+}
+
 costs::costs (vec_info *vinfo, bool costing_for_scalar)
   : vector_costs (vinfo, costing_for_scalar)
 {}
 
+/* Return true if the LMUL of the new COST model is preferred.  */
+bool
+costs::preferred_new_lmul_p (const vector_costs *uncast_other) const
+{
+  auto other = static_cast<const costs *> (uncast_other);
+  auto this_loop_vinfo = as_a<loop_vec_info> (this->m_vinfo);
+  auto other_loop_vinfo = as_a<loop_vec_info> (other->m_vinfo);
+  class loop *loop = LOOP_VINFO_LOOP (this_loop_vinfo);
+
+  if (!LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (this_loop_vinfo)
+      && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (other_loop_vinfo))
+    return false;
+
+  if (loop_autovec_infos.get (loop) && loop_autovec_infos.get (loop)->end_p)
+    return false;
+  else if (loop_autovec_infos.get (loop))
+    loop_autovec_infos.get (loop)->current_lmul
+      = loop_autovec_infos.get (loop)->current_lmul / 2;
+  else
+    {
+      int regno_alignment
+        = riscv_get_v_regno_alignment (other_loop_vinfo->vector_mode);
+      if (known_eq (LOOP_VINFO_SLP_UNROLLING_FACTOR (other_loop_vinfo), 1U))
+        regno_alignment = RVV_M8;
+      loop_autovec_infos.put (loop, {regno_alignment, regno_alignment, false});
+    }
+
+  int lmul = get_current_lmul (loop);
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location,
+                     "Comparing two main loops (%s at VF %d vs %s at VF %d)\n",
+                     GET_MODE_NAME (this_loop_vinfo->vector_mode),
+                     vect_vf_for_cost (this_loop_vinfo),
+                     GET_MODE_NAME (other_loop_vinfo->vector_mode),
+                     vect_vf_for_cost (other_loop_vinfo));
+
+  /* Compute local program points.
+     It's a fast and effective computation.  */
+  hash_map<basic_block, vec<stmt_point>> program_points_per_bb;
+  compute_local_program_points (other->m_vinfo, program_points_per_bb);
+
+  /* Compute local live ranges.  */
+  hash_map<basic_block, hash_map<tree, pair>> live_ranges_per_bb;
+  machine_mode biggest_mode
+    = compute_local_live_ranges (program_points_per_bb, live_ranges_per_bb);
+
+  /* Update live ranges according to PHI.  */
+  update_local_live_ranges (other->m_vinfo, program_points_per_bb,
+                            live_ranges_per_bb);
+
+  /* TODO: We calculate the maximum live vars based on the current STMT
+     sequence.  We can support live range shrinking in the future if it
+     gives us a big improvement.  */
+  if (!live_ranges_per_bb.is_empty ())
+    {
+      unsigned int max_nregs = 0;
+      for (hash_map<basic_block, hash_map<tree, pair>>::iterator iter
+           = live_ranges_per_bb.begin ();
+           iter != live_ranges_per_bb.end (); ++iter)
+        {
+          basic_block bb = (*iter).first;
+          unsigned int max_point
+            = (*program_points_per_bb.get (bb)).length () - 1;
+          if ((*iter).second.is_empty ())
+            continue;
+          /* We prefer larger LMUL unless it causes register spillings.  */
+          unsigned int nregs
+            = max_number_of_live_regs (bb, (*iter).second, max_point,
+                                       biggest_mode, lmul);
+          if (nregs > max_nregs)
+            max_nregs = nregs;
+          live_ranges_per_bb.empty ();
+        }
+      live_ranges_per_bb.empty ();
+      if (loop_autovec_infos.get (loop)->current_lmul == RVV_M1
+          || max_nregs <= V_REG_NUM)
+        loop_autovec_infos.get (loop)->end_p = true;
+      if (loop_autovec_infos.get (loop)->current_lmul > RVV_M1)
+        return max_nregs > V_REG_NUM;
+      return false;
+    }
+  if (!program_points_per_bb.is_empty ())
+    {
+      for (hash_map<basic_block, vec<stmt_point>>::iterator iter
+           = program_points_per_bb.begin ();
+           iter != program_points_per_bb.end (); ++iter)
+        {
+          vec<stmt_point> program_points = (*iter).second;
+          if (!program_points.is_empty ())
+            program_points.release ();
+        }
+      program_points_per_bb.empty ();
+    }
+  return lmul > RVV_M1;
+}
+
+bool
+costs::better_main_loop_than_p (const vector_costs *uncast_other) const
+{
+  auto other = static_cast<const costs *> (uncast_other);
+
+  if (!flag_vect_cost_model)
+    return vector_costs::better_main_loop_than_p (other);
+
+  if (riscv_autovec_lmul == RVV_DYNAMIC)
+    {
+      bool post_dom_available_p = dom_info_available_p (CDI_POST_DOMINATORS);
+      if (!post_dom_available_p)
+        calculate_dominance_info (CDI_POST_DOMINATORS);
+      bool preferred_p = preferred_new_lmul_p (uncast_other);
+      if (!post_dom_available_p)
+        free_dominance_info (CDI_POST_DOMINATORS);
+      return preferred_p;
+    }
+
+  return vector_costs::better_main_loop_than_p (other);
+}
+
 unsigned
 costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
                       stmt_vec_info stmt_info, slp_tree, tree vectype,
diff --git a/gcc/config/riscv/riscv-vector-costs.h b/gcc/config/riscv/riscv-vector-costs.h
index 57b1be01048..7b5814a4cff 100644
--- a/gcc/config/riscv/riscv-vector-costs.h
+++ b/gcc/config/riscv/riscv-vector-costs.h
@@ -23,6 +23,23 @@
 
 namespace riscv_vector {
 
+struct stmt_point
+{
+  /* Program point.  */
+  unsigned int point;
+  gimple *stmt;
+};
+
+/* Pair typedef used by live range: <start, end>.  */
+typedef std::pair<unsigned int, unsigned int> pair;
+
+struct autovec_info
+{
+  unsigned int initial_lmul;
+  unsigned int current_lmul;
+  bool end_p;
+};
+
 /* rvv-specific vector costs.  */
 class costs : public vector_costs
 {
@@ -31,12 +48,16 @@ class costs : public vector_costs
 public:
   costs (vec_info *, bool);
 
+  bool better_main_loop_than_p (const vector_costs *other) const override;
+
 private:
   unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
                               stmt_vec_info stmt_info, slp_tree node,
                               tree vectype, int misalign,
                               vect_cost_model_location where) override;
   void finish_cost (const vector_costs *) override;
+
+  bool preferred_new_lmul_p (const vector_costs *) const;
 };
 
 } // namespace riscv_vector
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index b1f80d1d87c..ec5d563859e 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -70,7 +70,8 @@ riscv-vsetvl.o: $(srcdir)/config/riscv/riscv-vsetvl.cc \
 riscv-vector-costs.o: $(srcdir)/config/riscv/riscv-vector-costs.cc \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TARGET_H) $(FUNCTION_H) \
   $(TREE_H) basic-block.h $(RTL_H) gimple.h targhooks.h cfgloop.h \
-  fold-const.h $(TM_P_H) tree-vectorizer.h \
+  fold-const.h
$(TM_P_H) tree-vectorizer.h gimple-iterator.h bitmap.h \ + ssa.h backend.h \ $(srcdir)/config/riscv/riscv-vector-costs.h $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ $(srcdir)/config/riscv/riscv-vector-costs.cc diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-mixed-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-mixed-1.c new file mode 100644 index 00000000000..fd9f38bc766 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-mixed-1.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3, + int32_t *__restrict d4, int32_t *__restrict d5, int n) +{ + for (int i = 0; i < n; i++) + a[i] = d5[i] + b[i]; + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + a5[i] = a[i] + a4[i]; + a[i] = a[i] + b5[i] + + a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i] + * c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i]; + } +} + +/* { dg-final { scan-assembler {e32,m2} } } */ +/* { dg-final { scan-assembler {e32,m8} } } */ +/* { 
dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 2 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-1.c new file mode 100644 index 00000000000..6c414bcd115 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-1.c @@ -0,0 +1,91 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, + int32_t *__restrict d2, + int32_t *__restrict d3, + int32_t *__restrict d4, + int32_t *__restrict d5, + int32_t *__restrict e, + int32_t *__restrict e2, + int32_t *__restrict e3, + int32_t *__restrict e4, + int32_t *__restrict e5, + int32_t *__restrict f, + int32_t *__restrict f2, + int32_t *__restrict f3, + int32_t *__restrict f4, + int32_t *__restrict f5, + int32_t *__restrict g, + int32_t *__restrict g2, + int32_t *__restrict g3, + int32_t *__restrict g4, + int32_t *__restrict g5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; 
+ + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + + e[i] = c2[i] + c2[i]; + e2[i] = c2[i] + d2[i]; + e3[i] = d3[i] + d3[i]; + e4[i] = c4[i] + a4[i]; + e5[i] = a[i] + a4[i]; + a5[i] = a[i] + a4[i]; + + f[i] = e2[i] + c2[i]; + f2[i] = e2[i] + d2[i]; + f3[i] = e3[i] + d3[i]; + f4[i] = e4[i] + a4[i]; + f5[i] = e[i] + a4[i]; + f5[i] = e5[i] + a4[i]; + + g[i] = f2[i] + c2[i]; + g2[i] = f2[i] + d2[i]; + g3[i] = f3[i] + d3[i]; + g4[i] = f4[i] + a4[i]; + g5[i] = f[i] + a4[i]; + g5[i] = f5[i] + a4[i]; + + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i] + * e[i] * e2[i] * e3[i] * e4[i] * e5[i] + * f[i] * f2[i] * f3[i] * f4[i] * f5[i] + * g[i] * g2[i] * g3[i] * g4[i] * g5[i]; + } +} + +/* { dg-final { scan-assembler {e32,m1} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 1" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-2.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-2.c new file mode 100644 index 00000000000..b77f3ff58ed --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-2.c @@ -0,0 +1,63 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fno-schedule-insns -fdump-tree-vect-details" } */ + +#include + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t 
*__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, + int32_t *__restrict d2, + int32_t *__restrict d3, + int32_t *__restrict d4, + int32_t *__restrict d5, + int32_t *__restrict e, + int32_t *__restrict e2, + int32_t *__restrict e3, + int32_t *__restrict e4, + int32_t *__restrict e5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + e[i] = a2[i] + c2[i]; + e2[i] = d2[i] + a2[i]; + e3[i] = d3[i] + a3[i]; + e4[i] = d4[i] + a4[i]; + e5[i] = a[i] + a4[i]; + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + a5[i] = a[i] + a4[i]; + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i] + * e[i] * e2[i] * e3[i] * e4[i] * e5[i]; + } +} + +/* FIXME: Choosing LMUL = 1 is not optimal, since LMUL = 2 would be possible if the instruction scheduler were taken into account.
*/ +/* { dg-final { scan-assembler {e32,m1} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 1" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-3.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-3.c new file mode 100644 index 00000000000..164930c9bba --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-3.c @@ -0,0 +1,91 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int8_t *__restrict a, int8_t *__restrict b, int8_t *__restrict c, + int8_t *__restrict a2, int8_t *__restrict b2, int8_t *__restrict c2, + int8_t *__restrict a3, int8_t *__restrict b3, int8_t *__restrict c3, + int8_t *__restrict a4, int8_t *__restrict b4, int8_t *__restrict c4, + int8_t *__restrict a5, int8_t *__restrict b5, int8_t *__restrict c5, + int8_t *__restrict d, + int8_t *__restrict d2, + int8_t *__restrict d3, + int8_t *__restrict d4, + int8_t *__restrict d5, + int8_t *__restrict e, + int8_t *__restrict e2, + int8_t *__restrict e3, + int8_t *__restrict e4, + int8_t *__restrict e5, + int8_t *__restrict f, + int8_t *__restrict f2, + int8_t *__restrict f3, + int8_t *__restrict f4, + int8_t *__restrict f5, + int8_t *__restrict g, + int8_t *__restrict g2, + int8_t *__restrict g3, + int8_t *__restrict g4, + int8_t *__restrict g5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + 
a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + + e[i] = c2[i] + c2[i]; + e2[i] = c2[i] + d2[i]; + e3[i] = d3[i] + d3[i]; + e4[i] = c4[i] + a4[i]; + e5[i] = a[i] + a4[i]; + a5[i] = a[i] + a4[i]; + + f[i] = e2[i] + c2[i]; + f2[i] = e2[i] + d2[i]; + f3[i] = e3[i] + d3[i]; + f4[i] = e4[i] + a4[i]; + f5[i] = e[i] + a4[i]; + f5[i] = e5[i] + a4[i]; + + g[i] = f2[i] + c2[i]; + g2[i] = f2[i] + d2[i]; + g3[i] = f3[i] + d3[i]; + g4[i] = f4[i] + a4[i]; + g5[i] = f[i] + a4[i]; + g5[i] = f5[i] + a4[i]; + + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i] + * e[i] * e2[i] * e3[i] * e4[i] * e5[i] + * f[i] * f2[i] * f3[i] * f4[i] * f5[i] + * g[i] * g2[i] * g3[i] * g4[i] * g5[i]; + } +} + +/* { dg-final { scan-assembler {e8,m1} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 1" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-4.c new file mode 100644 index 00000000000..8d80fbfe390 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-4.c @@ -0,0 +1,121 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict 
a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, + int32_t *__restrict d2, + int32_t *__restrict d3, + int32_t *__restrict d4, + int32_t *__restrict d5, + int32_t *__restrict e, + int32_t *__restrict e2, + int32_t *__restrict e3, + int32_t *__restrict e4, + int32_t *__restrict e5, + int32_t *__restrict f, + int32_t *__restrict f2, + int32_t *__restrict f3, + int32_t *__restrict f4, + int32_t *__restrict f5, + int32_t *__restrict g, + int32_t *__restrict g2, + int32_t *__restrict g3, + int32_t *__restrict g4, + int32_t *__restrict g5, + + int32_t *__restrict gg, + int32_t *__restrict gg2, + int32_t *__restrict gg3, + int32_t *__restrict gg4, + int32_t *__restrict gg5, + + int32_t *__restrict ggg, + int32_t *__restrict ggg2, + int32_t *__restrict ggg3, + int32_t *__restrict ggg4, + int32_t *__restrict ggg5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + + e[i] = c2[i] + c2[i]; + e2[i] = c2[i] + d2[i]; + e3[i] = d3[i] + d3[i]; + e4[i] = c4[i] + a4[i]; + e5[i] = a[i] + a4[i]; + a5[i] = a[i] + a4[i]; + + f[i] = e2[i] + c2[i]; + f2[i] = e2[i] + d2[i]; + f3[i] = e3[i] + d3[i]; + f4[i] = e4[i] + a4[i]; + f5[i] = e[i] + a4[i]; + f5[i] = e5[i] + a4[i]; + + g[i] = f2[i] + c2[i]; + g2[i] = f2[i] + d2[i]; + g3[i] = f3[i] + d3[i]; + g4[i] = f4[i] + a4[i]; + g5[i] = f[i] + a4[i]; + g5[i] = f5[i] + a4[i]; + + + gg[i] = f2[i] + c2[i]; + gg2[i] = f2[i] + d2[i]; + gg3[i] = f3[i] + d3[i]; + gg4[i] = f4[i] + a4[i]; + gg5[i] = f[i] + a4[i]; + gg5[i] = f5[i] + a4[i]; + + 
+ ggg[i] = f2[i] + c2[i]; + ggg2[i] = f2[i] + d2[i]; + ggg3[i] = f3[i] + d3[i]; + ggg4[i] = f4[i] + a4[i]; + ggg5[i] = f[i] + a4[i]; + ggg5[i] = f5[i] + a4[i]; + + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i] + * e[i] * e2[i] * e3[i] * e4[i] * e5[i] + * f[i] * f2[i] * f3[i] * f4[i] * f5[i] + * g[i] * g2[i] * g3[i] * g4[i] * g5[i] + * gg[i] * gg2[i] * gg3[i] * gg4[i] * gg5[i] + * ggg[i] * ggg2[i] * ggg3[i] * ggg4[i] * ggg5[i]; + } +} + +/* { dg-final { scan-assembler {e32,m1} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 1" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-5.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-5.c new file mode 100644 index 00000000000..7b4014ddaf6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-5.c @@ -0,0 +1,149 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, + int32_t *__restrict d2, + int32_t *__restrict d3, + int32_t *__restrict d4, + int32_t *__restrict d5, + int32_t *__restrict e, + int32_t *__restrict e2, + int32_t *__restrict e3, + int32_t *__restrict e4, + int32_t 
*__restrict e5, + int32_t *__restrict f, + int32_t *__restrict f2, + int32_t *__restrict f3, + int32_t *__restrict f4, + int32_t *__restrict f5, + int32_t *__restrict g, + int32_t *__restrict g2, + int32_t *__restrict g3, + int32_t *__restrict g4, + int32_t *__restrict g5, + + int32_t *__restrict gg, + int32_t *__restrict gg2, + int32_t *__restrict gg3, + int32_t *__restrict gg4, + int32_t *__restrict gg5, + + int32_t *__restrict ggg, + int32_t *__restrict ggg2, + int32_t *__restrict ggg3, + int32_t *__restrict ggg4, + int32_t *__restrict ggg5, + + int32_t *__restrict gggg, + int32_t *__restrict gggg2, + int32_t *__restrict gggg3, + int32_t *__restrict gggg4, + int32_t *__restrict gggg5, + + int32_t *__restrict ggggg, + int32_t *__restrict ggggg2, + int32_t *__restrict ggggg3, + int32_t *__restrict ggggg4, + int32_t *__restrict ggggg5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + + e[i] = c2[i] + c2[i]; + e2[i] = c2[i] + d2[i]; + e3[i] = d3[i] + d3[i]; + e4[i] = c4[i] + a4[i]; + e5[i] = a[i] + a4[i]; + a5[i] = a[i] + a4[i]; + + f[i] = e2[i] + c2[i]; + f2[i] = e2[i] + d2[i]; + f3[i] = e3[i] + d3[i]; + f4[i] = e4[i] + a4[i]; + f5[i] = e[i] + a4[i]; + f5[i] = e5[i] + a4[i]; + + g[i] = f2[i] + c2[i]; + g2[i] = f2[i] + d2[i]; + g3[i] = f3[i] + d3[i]; + g4[i] = f4[i] + a4[i]; + g5[i] = f[i] + a4[i]; + g5[i] = f5[i] + a4[i]; + + + gg[i] = f2[i] + c2[i]; + gg2[i] = f2[i] + d2[i]; + gg3[i] = f3[i] + d3[i]; + gg4[i] = f4[i] + a4[i]; + gg5[i] = f[i] + a4[i]; + gg5[i] = f5[i] + a4[i]; + + + ggg[i] = f2[i] + c2[i]; + ggg2[i] = f2[i] + d2[i]; + ggg3[i] = f3[i] 
+ d3[i]; + ggg4[i] = f4[i] + a4[i]; + ggg5[i] = f[i] + a4[i]; + ggg5[i] = f5[i] + a4[i]; + + gggg[i] = f2[i] + c2[i]; + gggg2[i] = f2[i] + d2[i]; + gggg3[i] = f3[i] + d3[i]; + gggg4[i] = f4[i] + a4[i]; + gggg5[i] = f[i] + a4[i]; + gggg5[i] = f5[i] + a4[i]; + + ggggg[i] = f2[i] + c2[i]; + ggggg2[i] = f2[i] + d2[i]; + ggggg3[i] = f3[i] + d3[i]; + ggggg4[i] = f4[i] + a4[i]; + ggggg5[i] = f[i] + a4[i]; + ggggg5[i] = f5[i] + a4[i]; + + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i] + * e[i] * e2[i] * e3[i] * e4[i] * e5[i] + * f[i] * f2[i] * f3[i] * f4[i] * f5[i] + * g[i] * g2[i] * g3[i] * g4[i] * g5[i] + * gg[i] * gg2[i] * gg3[i] * gg4[i] * gg5[i] + * ggg[i] * ggg2[i] * ggg3[i] * ggg4[i] * ggg5[i] + * gggg[i] * gggg2[i] * gggg3[i] * gggg4[i] * gggg5[i] + * ggggg[i] * ggggg2[i] * ggggg3[i] * ggggg4[i] * ggggg5[i]; + } +} + +/* { dg-final { scan-assembler {e32,m1} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 1" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-6.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-6.c new file mode 100644 index 00000000000..51d05f2bec9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-6.c @@ -0,0 +1,150 @@ + +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int8_t *__restrict a, int8_t *__restrict b, int8_t *__restrict c, + int8_t *__restrict a2, int8_t *__restrict b2, int8_t *__restrict c2, + int8_t *__restrict a3, int8_t *__restrict b3, int8_t *__restrict c3, + int8_t 
*__restrict a4, int8_t *__restrict b4, int8_t *__restrict c4, + int8_t *__restrict a5, int8_t *__restrict b5, int8_t *__restrict c5, + int8_t *__restrict d, + int8_t *__restrict d2, + int8_t *__restrict d3, + int8_t *__restrict d4, + int8_t *__restrict d5, + int8_t *__restrict e, + int8_t *__restrict e2, + int8_t *__restrict e3, + int8_t *__restrict e4, + int8_t *__restrict e5, + int8_t *__restrict f, + int8_t *__restrict f2, + int8_t *__restrict f3, + int8_t *__restrict f4, + int8_t *__restrict f5, + int8_t *__restrict g, + int8_t *__restrict g2, + int8_t *__restrict g3, + int8_t *__restrict g4, + int8_t *__restrict g5, + + int8_t *__restrict gg, + int8_t *__restrict gg2, + int8_t *__restrict gg3, + int8_t *__restrict gg4, + int8_t *__restrict gg5, + + int8_t *__restrict ggg, + int8_t *__restrict ggg2, + int8_t *__restrict ggg3, + int8_t *__restrict ggg4, + int8_t *__restrict ggg5, + + int8_t *__restrict gggg, + int8_t *__restrict gggg2, + int8_t *__restrict gggg3, + int8_t *__restrict gggg4, + int8_t *__restrict gggg5, + + int8_t *__restrict ggggg, + int8_t *__restrict ggggg2, + int8_t *__restrict ggggg3, + int8_t *__restrict ggggg4, + int8_t *__restrict ggggg5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + + e[i] = c2[i] + c2[i]; + e2[i] = c2[i] + d2[i]; + e3[i] = d3[i] + d3[i]; + e4[i] = c4[i] + a4[i]; + e5[i] = a[i] + a4[i]; + a5[i] = a[i] + a4[i]; + + f[i] = e2[i] + c2[i]; + f2[i] = e2[i] + d2[i]; + f3[i] = e3[i] + d3[i]; + f4[i] = e4[i] + a4[i]; + f5[i] = e[i] + a4[i]; + f5[i] = e5[i] + a4[i]; + + g[i] = f2[i] + c2[i]; + g2[i] = f2[i] + 
d2[i]; + g3[i] = f3[i] + d3[i]; + g4[i] = f4[i] + a4[i]; + g5[i] = f[i] + a4[i]; + g5[i] = f5[i] + a4[i]; + + + gg[i] = f2[i] + c2[i]; + gg2[i] = f2[i] + d2[i]; + gg3[i] = f3[i] + d3[i]; + gg4[i] = f4[i] + a4[i]; + gg5[i] = f[i] + a4[i]; + gg5[i] = f5[i] + a4[i]; + + + ggg[i] = f2[i] + c2[i]; + ggg2[i] = f2[i] + d2[i]; + ggg3[i] = f3[i] + d3[i]; + ggg4[i] = f4[i] + a4[i]; + ggg5[i] = f[i] + a4[i]; + ggg5[i] = f5[i] + a4[i]; + + gggg[i] = f2[i] + c2[i]; + gggg2[i] = f2[i] + d2[i]; + gggg3[i] = f3[i] + d3[i]; + gggg4[i] = f4[i] + a4[i]; + gggg5[i] = f[i] + a4[i]; + gggg5[i] = f5[i] + a4[i]; + + ggggg[i] = f2[i] + c2[i]; + ggggg2[i] = f2[i] + d2[i]; + ggggg3[i] = f3[i] + d3[i]; + ggggg4[i] = f4[i] + a4[i]; + ggggg5[i] = f[i] + a4[i]; + ggggg5[i] = f5[i] + a4[i]; + + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i] + * e[i] * e2[i] * e3[i] * e4[i] * e5[i] + * f[i] * f2[i] * f3[i] * f4[i] * f5[i] + * g[i] * g2[i] * g3[i] * g4[i] * g5[i] + * gg[i] * gg2[i] * gg3[i] * gg4[i] * gg5[i] + * ggg[i] * ggg2[i] * ggg3[i] * ggg4[i] * ggg5[i] + * gggg[i] * gggg2[i] * gggg3[i] * gggg4[i] * gggg5[i] + * ggggg[i] * ggggg2[i] * ggggg3[i] * ggggg4[i] * ggggg5[i]; + } +} + +/* { dg-final { scan-assembler {e8,m1} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 1" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c new file mode 100644 index 00000000000..dfd71414b62 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options 
"-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -Wno-psabi -fdump-tree-vect-details" } */ + +#include "riscv_vector.h" + +vint32m8_t +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3, + int32_t *__restrict d4, int32_t *__restrict d5, int n, vint32m8_t vector) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + a5[i] = a[i] + a4[i]; + a[i] = a[i] + b5[i] + + a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i] + * c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i]; + } + return vector; +} + +/* { dg-final { scan-assembler {e32,m1} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 1" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-1.c new file mode 100644 index 00000000000..ce83bb22324 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-1.c @@ 
-0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, + int32_t *__restrict d2, + int32_t *__restrict d3, + int32_t *__restrict d4, + int32_t *__restrict d5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + a5[i] = a[i] + a4[i]; + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i]; + } +} + +/* { dg-final { scan-assembler {e32,m2} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-2.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-2.c new file mode 100644 index 00000000000..a80b1b1556a --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-2.c @@ -0,0 +1,51 
@@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int8_t *__restrict a, int8_t *__restrict b, int8_t *__restrict c, + int8_t *__restrict a2, int8_t *__restrict b2, int8_t *__restrict c2, + int8_t *__restrict a3, int8_t *__restrict b3, int8_t *__restrict c3, + int8_t *__restrict a4, int8_t *__restrict b4, int8_t *__restrict c4, + int8_t *__restrict a5, int8_t *__restrict b5, int8_t *__restrict c5, + int8_t *__restrict d, + int8_t *__restrict d2, + int8_t *__restrict d3, + int8_t *__restrict d4, + int8_t *__restrict d5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + a5[i] = a[i] + a4[i]; + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i]; + } +} + +/* { dg-final { scan-assembler {e8,m2} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-3.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-3.c new file mode 100644 index 00000000000..ce83bb22324 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-3.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* 
{ dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, + int32_t *__restrict d2, + int32_t *__restrict d3, + int32_t *__restrict d4, + int32_t *__restrict d5, + int n) +{ + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + a5[i] = a[i] + a4[i]; + a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] + * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i] + * d[i] * d2[i] * d3[i] * d4[i] * d5[i]; + } +} + +/* { dg-final { scan-assembler {e32,m2} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c new file mode 100644 index 00000000000..9964f3fe8ba --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c @@ -0,0 +1,49 @@ +/* { dg-do compile } */ +/* { 
dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */ + +#include "riscv_vector.h" + +void +foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c, + int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, + int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3, + int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4, + int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5, + int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3, + int32_t *__restrict d4, int32_t *__restrict d5, int n) +{ + vint32m1_t v = __riscv_vle32_v_i32m1 (a, 32); + __riscv_vse32_v_i32m1 (c, v, 32); + for (int i = 0; i < n; i++) + { + a[i] = b[i] + c[i]; + b5[i] = b[i] + c[i]; + a2[i] = b2[i] + c2[i]; + a3[i] = b3[i] + c3[i]; + a4[i] = b4[i] + c4[i]; + a5[i] = a[i] + a4[i]; + d2[i] = a2[i] + c2[i]; + d3[i] = a3[i] + c3[i]; + d4[i] = a4[i] + c4[i]; + d5[i] = a[i] + a4[i]; + a[i] = a5[i] + b5[i] + a[i]; + + c2[i] = a[i] + c[i]; + c3[i] = b5[i] * a5[i]; + c4[i] = a2[i] * a3[i]; + c5[i] = b5[i] * a2[i]; + c[i] = a[i] + c3[i]; + c2[i] = a[i] + c4[i]; + a5[i] = a[i] + a4[i]; + a[i] = a[i] + b5[i] + + a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i] + * c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i]; + } +} + +/* { dg-final { scan-assembler {e32,m2} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-5.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-5.c new file mode 100644 index 00000000000..ab670bf0c7d --- /dev/null +++ 
b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-5.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+typedef int8_t v128qi __attribute__ ((vector_size (128)));
+
+v128qi global_v;
+
+v128qi
+foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c,
+     int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
+     int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
+     int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
+     int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
+     int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3,
+     int32_t *__restrict d4, int32_t *__restrict d5, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      a[i] = b[i] + c[i];
+      b5[i] = b[i] + c[i];
+      a2[i] = b2[i] + c2[i];
+      a3[i] = b3[i] + c3[i];
+      a4[i] = b4[i] + c4[i];
+      a5[i] = a[i] + a4[i];
+      d2[i] = a2[i] + c2[i];
+      d3[i] = a3[i] + c3[i];
+      d4[i] = a4[i] + c4[i];
+      d5[i] = a[i] + a4[i];
+      a[i] = a5[i] + b5[i] + a[i];
+
+      c2[i] = a[i] + c[i];
+      c3[i] = b5[i] * a5[i];
+      c4[i] = a2[i] * a3[i];
+      c5[i] = b5[i] * a2[i];
+      c[i] = a[i] + c3[i];
+      c2[i] = a[i] + c4[i];
+      a5[i] = a[i] + a4[i];
+      a[i] = a[i] + b5[i]
+             + a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i]
+               * c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i];
+    }
+  return global_v + 3;
+}
+
+/* { dg-final { scan-assembler {e32,m2} } } */
+/* { dg-final { scan-assembler {e8,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-6.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-6.c
new file mode 100644
index 00000000000..a01e32b9f8d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-6.c
@@ -0,0 +1,54 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+typedef int8_t v128qi __attribute__ ((vector_size (128)));
+
+v128qi global_v;
+
+v128qi
+foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c,
+     int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
+     int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
+     int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
+     int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
+     int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3,
+     int32_t *__restrict d4, int32_t *__restrict d5, int n)
+{
+  for (int i = 0; i < 128; i++)
+    b[i] = global_v[i] + 8;
+  for (int i = 0; i < n; i++)
+    {
+      a[i] = b[i] + c[i];
+      b5[i] = b[i] + c[i];
+      a2[i] = b2[i] + c2[i];
+      a3[i] = b3[i] + c3[i];
+      a4[i] = b4[i] + c4[i];
+      a5[i] = a[i] + a4[i];
+      d2[i] = a2[i] + c2[i];
+      d3[i] = a3[i] + c3[i];
+      d4[i] = a4[i] + c4[i];
+      d5[i] = a[i] + a4[i];
+      a[i] = a5[i] + b5[i] + a[i];
+
+      c2[i] = a[i] + c[i];
+      c3[i] = b5[i] * a5[i];
+      c4[i] = a2[i] * a3[i];
+      c5[i] = b5[i] * a2[i];
+      c[i] = a[i] + c3[i];
+      c2[i] = a[i] + c4[i];
+      a5[i] = a[i] + a4[i];
+      a[i] = a[i] + b5[i]
+             + a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i]
+               * c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i];
+    }
+  return global_v + 3;
+}
+
+/* { dg-final { scan-assembler {e32,m2} } } */
+/* { dg-final { scan-assembler {e8,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-1.c
new file mode 100644
index 00000000000..156ccc7f98e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c,
+     int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
+     int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
+     int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
+     int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
+     int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3,
+     int32_t *__restrict d4, int32_t *__restrict d5, int n, int m)
+{
+  for (int i = 0; i < n; i++)
+    {
+      a[i] = b[i] + c[i];
+      a2[i] = b2[i] + c2[i];
+      a3[i] = b3[i] + c3[i];
+      a4[i] = b4[i] + c4[i];
+      a5[i] = a[i] + a4[i];
+      d[i] = a[i] - a2[i];
+      d2[i] = a2[i] * a[i];
+      d3[i] = a3[i] * a2[i];
+      d4[i] = a2[i] * d2[i];
+      d5[i] = a[i] * a2[i] * a3[i] * a4[i] * d[i];
+    }
+}
+
+/* { dg-final { scan-assembler {e32,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-2.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-2.c
new file mode 100644
index 00000000000..4cacc039dcb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-2.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int8_t *__restrict a, int8_t *__restrict b, int8_t *__restrict c,
+     int8_t *__restrict a2, int8_t *__restrict b2, int8_t *__restrict c2,
+     int8_t *__restrict a3, int8_t *__restrict b3, int8_t *__restrict c3,
+     int8_t *__restrict a4, int8_t *__restrict b4, int8_t *__restrict c4,
+     int8_t *__restrict a5, int8_t *__restrict b5, int8_t *__restrict c5,
+     int8_t *__restrict d, int8_t *__restrict d2, int8_t *__restrict d3,
+     int8_t *__restrict d4, int8_t *__restrict d5, int n, int m)
+{
+  for (int i = 0; i < n; i++)
+    {
+      a[i] = b[i] + c[i];
+      a2[i] = b2[i] + c2[i];
+      a3[i] = b3[i] + c3[i];
+      a4[i] = b4[i] + c4[i];
+      a5[i] = a[i] + a4[i];
+      d[i] = a[i] - a2[i];
+      d2[i] = a2[i] * a[i];
+      d3[i] = a3[i] * a2[i];
+      d4[i] = a2[i] * d2[i];
+      d5[i] = a[i] * a2[i] * a3[i] * a4[i] * d[i];
+    }
+}
+
+/* { dg-final { scan-assembler {e8,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-3.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-3.c
new file mode 100644
index 00000000000..2308109f00c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-3.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void foo2 (int64_t *__restrict a,
+           int8_t *__restrict b,
+           int8_t *__restrict c,
+           int8_t *__restrict a2,
+           int8_t *__restrict b2,
+           int8_t *__restrict c2,
+           int8_t *__restrict a3,
+           int8_t *__restrict b3,
+           int8_t *__restrict c3,
+           int8_t *__restrict a4,
+           int8_t *__restrict b4,
+           int8_t *__restrict c4,
+           int64_t *__restrict a5,
+           int8_t *__restrict b5,
+           int8_t *__restrict c5,
+           int n)
+{
+  for (int i = 0; i < n; i++){
+    a[i] = b[i] + c[i];
+    b5[i] = b[i] + c[i];
+    a2[i] = b2[i] + c2[i];
+    a3[i] = b3[i] + c3[i];
+    a4[i] = b4[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a5[i] + b5[i]+ a[i];
+
+    a[i] = a[i] + c[i];
+    b5[i] = a[i] + c[i];
+    a2[i] = a[i] + c2[i];
+    a3[i] = a[i] + c3[i];
+    a4[i] = a[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a[i] + b5[i]+ a[i];
+  }
+}
+
+/* { dg-final { scan-assembler {e64,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c
new file mode 100644
index 00000000000..2a1521bffb9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void foo2 (int64_t *__restrict a,
+           int32_t *__restrict b,
+           int32_t *__restrict c,
+           int32_t *__restrict a2,
+           int32_t *__restrict b2,
+           int32_t *__restrict c2,
+           int32_t *__restrict a3,
+           int32_t *__restrict b3,
+           int32_t *__restrict c3,
+           int32_t *__restrict a4,
+           int32_t *__restrict b4,
+           int32_t *__restrict c4,
+           int64_t *__restrict a5,
+           int32_t *__restrict b5,
+           int32_t *__restrict c5,
+           int n)
+{
+  for (int i = 0; i < n; i++){
+    a[i] = b[i] + c[i];
+    b5[i] = b[i] + c[i];
+    a2[i] = b2[i] + c2[i];
+    a3[i] = b3[i] + c3[i];
+    a4[i] = b4[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a5[i] + b5[i]+ a[i];
+
+    a[i] = a[i] + c[i];
+    b5[i] = a[i] + c[i];
+    a2[i] = a[i] + c2[i];
+    a3[i] = a[i] + c3[i];
+    a4[i] = a[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a[i] + b5[i]+ a[i];
+  }
+}
+
+/* { dg-final { scan-assembler {e64,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c
new file mode 100644
index 00000000000..928a507a363
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void foo2 (int16_t *__restrict a,
+           int32_t *__restrict b,
+           int32_t *__restrict c,
+           int32_t *__restrict a2,
+           int32_t *__restrict b2,
+           int32_t *__restrict c2,
+           int32_t *__restrict a3,
+           int32_t *__restrict b3,
+           int32_t *__restrict c3,
+           int32_t *__restrict a4,
+           int32_t *__restrict b4,
+           int32_t *__restrict c4,
+           int16_t *__restrict a5,
+           int32_t *__restrict b5,
+           int32_t *__restrict c5,
+           int n)
+{
+  for (int i = 0; i < n; i++){
+    a[i] = b[i] + c[i];
+    b5[i] = b[i] + c[i];
+    a2[i] = b2[i] + c2[i];
+    a3[i] = b3[i] + c3[i];
+    a4[i] = b4[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a5[i] + b5[i]+ a[i];
+
+    a[i] = a[i] + c[i];
+    b5[i] = a[i] + c[i];
+    a2[i] = a[i] + c2[i];
+    a3[i] = a[i] + c3[i];
+    a4[i] = a[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a[i] + b5[i]+ a[i];
+  }
+}
+
+/* { dg-final { scan-assembler {e32,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-6.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-6.c
new file mode 100644
index 00000000000..f16cfb9fd08
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-6.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fselective-scheduling -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (uint8_t *restrict a, uint8_t *restrict b, int n)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      a[i * 8] = b[i * 8 + 7] + 1;
+      a[i * 8 + 1] = b[i * 8 + 6] + 2;
+      a[i * 8 + 2] = b[i * 8 + 5] + 3;
+      a[i * 8 + 3] = b[i * 8 + 4] + 4;
+      a[i * 8 + 4] = b[i * 8 + 3] + 5;
+      a[i * 8 + 5] = b[i * 8 + 2] + 6;
+      a[i * 8 + 6] = b[i * 8 + 1] + 7;
+      a[i * 8 + 7] = b[i * 8 + 0] + 8;
+    }
+}
+
+/* { dg-final { scan-assembler {e8,m4} } } */
+/* { dg-final { scan-assembler-times {csrr} 1 } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 8" "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-7.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-7.c
new file mode 100644
index 00000000000..e324380e27b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-7.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void foo2 (int8_t *__restrict a,
+           int64_t *__restrict b,
+           int64_t *__restrict c,
+           int64_t *__restrict a2,
+           int64_t *__restrict b2,
+           int64_t *__restrict c2,
+           int64_t *__restrict a3,
+           int64_t *__restrict b3,
+           int64_t *__restrict c3,
+           int64_t *__restrict a4,
+           int64_t *__restrict b4,
+           int64_t *__restrict c4,
+           int8_t *__restrict a5,
+           int64_t *__restrict b5,
+           int64_t *__restrict c5,
+           int n)
+{
+  for (int i = 0; i < n; i++){
+    a[i] = b[i] + c[i];
+    b5[i] = b[i] + c[i];
+    a2[i] = b2[i] + c2[i];
+    a3[i] = b3[i] + c3[i];
+    a4[i] = b4[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a5[i] + b5[i]+ a[i];
+
+    a[i] = a[i] + c[i];
+    b5[i] = a[i] + c[i];
+    a2[i] = a[i] + c2[i];
+    a3[i] = a[i] + c3[i];
+    a4[i] = a[i] + c4[i];
+    a5[i] = a[i] + a4[i];
+    a[i] = a[i] + b5[i]+ a[i];
+  }
+}
+
+/* { dg-final { scan-assembler {e64,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-8.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-8.c
new file mode 100644
index 00000000000..553f2aac0d6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-8.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fselective-scheduling -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (uint8_t *restrict a, uint8_t *restrict b, int n)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      a[i * 16] = b[i * 16 + 15] + 1;
+      a[i * 16 + 1] = b[i * 16 + 14] + 2;
+      a[i * 16 + 2] = b[i * 16 + 13] + 3;
+      a[i * 16 + 3] = b[i * 16 + 12] + 4;
+      a[i * 16 + 4] = b[i * 16 + 11] + 5;
+      a[i * 16 + 5] = b[i * 16 + 10] + 6;
+      a[i * 16 + 6] = b[i * 16 + 9] + 7;
+      a[i * 16 + 7] = b[i * 16 + 8] + 8;
+
+      a[i * 16 + 8] = b[i * 16 + 7] + 1;
+      a[i * 16 + 9] = b[i * 16 + 6] + 2;
+      a[i * 16 + 10] = b[i * 16 + 5] + 3;
+      a[i * 16 + 11] = b[i * 16 + 4] + 4;
+      a[i * 16 + 12] = b[i * 16 + 3] + 5;
+      a[i * 16 + 13] = b[i * 16 + 2] + 6;
+      a[i * 16 + 14] = b[i * 16 + 1] + 7;
+      a[i * 16 + 15] = b[i * 16 + 0] + 8;
+    }
+}
+
+/* { dg-final { scan-assembler {e8,m4} } } */
+/* { dg-final { scan-assembler-times {csrr} 1 } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 8" "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-1.c
new file mode 100644
index 00000000000..e2483004698
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int32_t *__restrict a, int32_t *__restrict b, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = a[i] + b[i];
+}
+
+/* { dg-final { scan-assembler {e32,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-10.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-10.c
new file mode 100644
index 00000000000..e65abb299a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-10.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+int
+foo (int *x, int n, int res)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      res += x[i * 2];
+      res += x[i * 2 + 1];
+    }
+  return res;
+}
+
+/* { dg-final { scan-assembler {e32,m8} } } */
+/* { dg-final { scan-assembler-times {csrr} 1 } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c
new file mode 100644
index 00000000000..a50265fc1ec
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int32_t *__restrict a, int16_t *__restrict b, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = a[i] + b[i];
+}
+
+/* { dg-final { scan-assembler {e32,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-3.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-3.c
new file mode 100644
index 00000000000..3e9751a22ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int8_t *__restrict a, int8_t *__restrict b, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = a[i] + b[i];
+}
+
+/* { dg-final { scan-assembler {e8,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-4.c
new file mode 100644
index 00000000000..3b2527aad5d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-4.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+#include
+
+void
+foo (size_t *__restrict a, size_t *__restrict b, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = a[i] + b[i];
+}
+
+/* { dg-final { scan-assembler {e64,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-5.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-5.c
new file mode 100644
index 00000000000..d63926fa56a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-5.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int8_t *__restrict a, int8_t *__restrict b, int n)
+{
+  for (int i = 0; i < n; i++){
+    a[i] = a[i] + b[i];
+    a[i] = a[i] + b[i];
+    a[i] = a[i] + b[i];
+    a[i] = a[i] + b[i];
+    a[i] = a[i] + b[i];
+    a[i] = a[i] + b[i];
+    a[i] = a[i] + b[i];
+  }
+}
+
+/* { dg-final { scan-assembler {e8,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-6.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-6.c
new file mode 100644
index 00000000000..5c816140950
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-6.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int8_t *__restrict a, int8_t *__restrict b, int8_t *__restrict a2,
+     int8_t *__restrict b2, int8_t *__restrict a3, int8_t *__restrict b3,
+     int8_t *__restrict a4, int8_t *__restrict b4, int8_t *__restrict a5,
+     int8_t *__restrict b5, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      a[i] = b[i] * a2[i] * b2[i] * a3[i] * b3[i] * a4[i] * b4[i] * a5[i] * b5[i];
+    }
+}
+
+/* { dg-final { scan-assembler {e8,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-7.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-7.c
new file mode 100644
index 00000000000..596608bb8d3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-7.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+void
+foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict a2,
+     int32_t *__restrict b2, int32_t *__restrict a3, int32_t *__restrict b3,
+     int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict a5,
+     int32_t *__restrict b5, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      a[i] = b[i] * a2[i] * b2[i] * a3[i] * b3[i] * a4[i] * b4[i] * a5[i] * b5[i];
+    }
+}
+
+/* { dg-final { scan-assembler {e32,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-8.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-8.c
new file mode 100644
index 00000000000..a859d976555
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-8.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+int8_t
+foo (int8_t *__restrict a, int8_t init, int n)
+{
+  for (int i = 0; i < n; i++)
+    init += a[i];
+  return init;
+}
+
+/* { dg-final { scan-assembler {e8,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c
new file mode 100644
index 00000000000..b965fd0373a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fdump-tree-vect-details" } */
+
+#include
+
+int64_t
+foo (int64_t *__restrict a, int64_t init, int n)
+{
+  for (int i = 0; i < n; i++)
+    init += a[i];
+  return init;
+}
+
+/* { dg-final { scan-assembler {e64,m8} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 4" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 2" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Maximum lmul = 1" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/rvv-costmodel-vect.exp b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/rvv-costmodel-vect.exp
new file mode 100644
index 00000000000..a3e8f50b73f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/rvv-costmodel-vect.exp
@@ -0,0 +1,52 @@
+# Copyright (C) 2023-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# .
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Exit immediately if this isn't a riscv target.
+if { ![istarget riscv*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+set gcc_march "rv64gcv_zvfh"
+set gcc_mabi "lp64d"
+if [istarget riscv32-*-*] then {
+  set gcc_march "rv32gcv_zvfh"
+  set gcc_mabi "ilp32d"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -mabi=$gcc_mabi -O3"
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/dynamic-lmul*.\[cS\]]] \
+        "-O3 -ftree-vectorize --param riscv-autovec-lmul=dynamic" $CFLAGS
+
+# All done.
+dg-finish
--
2.36.3