From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, rguenther@suse.de, kito.cheng@gmail.com, kito.cheng@sifive.com, Juzhe-Zhong
Subject: [PATCH] Middle-end: Fix bug of induction variable vectorization for RVV
Date: Wed, 8 Nov 2023 18:53:17 +0800
Message-Id: <20231108105317.1786716-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

The SELECT_VL result is not necessarily VF in non-final iterations.

The current GIMPLE IR is wrong:

# vect_vec_iv_.21_25 = PHI <_24(4), { 0, 1, 2, ... }(3)>
...
_24 = vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... };

After this patch, which is correct for SELECT_VL:

# vect_vec_iv_.8_22 = PHI <_21(4), { 0, 1, 2, ... }(3)>
...
_35 = .SELECT_VL (ivtmp_33, POLY_INT_CST [4, 4]);
_21 = vect_vec_iv_.8_22 + { POLY_INT_CST [4, 4], ... };

kito, could you give more explanation?

	PR middle/112438

gcc/ChangeLog:

	* tree-vect-loop.cc (vectorizable_induction): When
	LOOP_VINFO_USING_SELECT_VL_P, step the vector IV by the loop
	length produced by SELECT_VL instead of VF.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr112438.c: New test.
---
 .../gcc.target/riscv/rvv/autovec/pr112438.c   | 35 +++++++++++++++++
 gcc/tree-vect-loop.cc                         | 39 +++++++++++++++----
 2 files changed, 67 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
new file mode 100644
index 00000000000..b326d56a52c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+void
+foo (int n, int *__restrict in, int *__restrict out)
+{
+  for (int i = 0; i < n; i += 1)
+    {
+      out[i] = in[i] + i;
+    }
+}
+
+void
+foo2 (int n, float * __restrict in,
+float * __restrict out)
+{
+  for (int i = 0; i < n; i += 1)
+    {
+      out[i] = in[i] + i;
+    }
+}
+
+void
+foo3 (int n, float * __restrict in,
+float * __restrict out, float x)
+{
+  for (int i = 0; i < n; i += 1)
+    {
+      out[i] = in[i] + i* i;
+    }
+}
+
+/* We don't want to see vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... }.  */
+/* { dg-final { scan-tree-dump-not "\\+ \{ POLY_INT_CST" "optimized" } } */
+
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a544bc9b059..3e103946168 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10309,10 +10309,30 @@ vectorizable_induction (loop_vec_info loop_vinfo,
         new_name = step_expr;
       else
         {
+          gimple_seq seq = NULL;
+          if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
+            {
+              /* When we're using loop_len produced by SELECT_VL, the non-final
+                 iterations are not always processing VF elements.  So vectorize
+                 induction variable instead of
+
+                   _21 = vect_vec_iv_.6_22 + { VF, ... };
+
+                 We should generate:
+
+                   _35 = .SELECT_VL (ivtmp_33, VF);
+                   vect_cst__22 = [vec_duplicate_expr] _35;
+                   _21 = vect_vec_iv_.6_22 + vect_cst__22;  */
+              vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
+              tree len
+                = vect_get_loop_len (loop_vinfo, NULL, lens, 1, vectype, 0, 0);
+              expr = force_gimple_operand (fold_convert (TREE_TYPE (step_expr),
+                                                         unshare_expr (len)),
+                                           &seq, true, NULL_TREE);
+            }
           /* iv_loop is the loop to be vectorized. Generate:
               vec_step = [VF*S, VF*S, VF*S, VF*S]  */
-          gimple_seq seq = NULL;
-          if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
+          else if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
             {
               expr = build_int_cst (integer_type_node, vf);
               expr = gimple_build (&seq, FLOAT_EXPR, TREE_TYPE (step_expr), expr);
@@ -10323,8 +10343,13 @@ vectorizable_induction (loop_vec_info loop_vinfo,
                                    expr, step_expr);
           if (seq)
             {
-              new_bb = gsi_insert_seq_on_edge_immediate (pe, seq);
-              gcc_assert (!new_bb);
+              if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
+                gsi_insert_seq_before (&si, seq, GSI_SAME_STMT);
+              else
+                {
+                  new_bb = gsi_insert_seq_on_edge_immediate (pe, seq);
+                  gcc_assert (!new_bb);
+                }
             }
         }
 
@@ -10332,9 +10357,9 @@ vectorizable_induction (loop_vec_info loop_vinfo,
       gcc_assert (CONSTANT_CLASS_P (new_name)
                   || TREE_CODE (new_name) == SSA_NAME);
       new_vec = build_vector_from_val (step_vectype, t);
-      vec_step = vect_init_vector (loop_vinfo, stmt_info,
-                                   new_vec, step_vectype, NULL);
-
+      vec_step
+        = vect_init_vector (loop_vinfo, stmt_info, new_vec, step_vectype,
+                            LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) ? &si : NULL);
 
   /* Create the following def-use cycle:
      loop prolog:
-- 
2.36.3