From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, rguenther@suse.de, pan2.li@intel.com, Ju-Zhe Zhong
Subject: [PATCH V4] SCCVN: Add LEN_MASK_STORE and fix LEN_STORE
Date: Tue, 27 Jun 2023 14:47:37 +0800
Message-Id: <20230627064737.16257-1-juzhe.zhong@rivai.ai>

From: Ju-Zhe Zhong

Hi,
Richi.

I tried to understand your last email and to refactor the do-while loop
using VECTOR_CST_NELTS.

This patch works fine for LEN_MASK_STORE, and the compiler can CSE the
redundant store.  I have added a testcase to this patch to test VN for
LEN_MASK_STORE.

I am not sure whether I am on the same page as you.  Feel free to
correct me.

Thanks.

gcc/ChangeLog:

        * tree-ssa-sccvn.cc (vn_reference_lookup_3): Add LEN_MASK_STORE and fix LEN_STORE

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c: New test.

---
 .../rvv/autovec/partial/len_maskstore_vn-1.c  | 30 +++++++++++++++++++
 gcc/tree-ssa-sccvn.cc                         | 24 +++++++++++----
 2 files changed, 49 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c
new file mode 100644
index 00000000000..0b2d03693dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv_zvl256b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -fdump-tree-fre5" } */
+
+void __attribute__((noinline,noclone))
+foo (int *out, int *res)
+{
+  int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
+  int i;
+  for (i = 0; i < 16; ++i)
+    {
+      if (mask[i])
+        out[i] = i;
+    }
+  int o0 = out[0];
+  int o7 = out[7];
+  int o14 = out[14];
+  int o15 = out[15];
+  res[0] = o0;
+  res[2] = o7;
+  res[4] = o14;
+  res[6] = o15;
+}
+
+/* Vectorization produces .LEN_MASK_STORE, unrolling will unroll the two
+   vector iterations.  FRE5 after that should be able to CSE
+   out[7] and out[15], but leave out[0] and out[14] alone.  */
+/* { dg-final { scan-tree-dump " = o0_\[0-9\]+;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = 7;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = o14_\[0-9\]+;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = 15;" "fre5" } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 11061a374a2..242d82d6274 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -3304,6 +3304,16 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
                   if (!tree_fits_uhwi_p (len) || !tree_fits_shwi_p (bias))
                     return (void *)-1;
                   break;
+                case IFN_LEN_MASK_STORE:
+                  len = gimple_call_arg (call, 2);
+                  bias = gimple_call_arg (call, 5);
+                  if (!tree_fits_uhwi_p (len) || !tree_fits_shwi_p (bias))
+                    return (void *)-1;
+                  mask = gimple_call_arg (call, internal_fn_mask_index (fn));
+                  mask = vn_valueize (mask);
+                  if (TREE_CODE (mask) != VECTOR_CST)
+                    return (void *)-1;
+                  break;
                 default:
                   return (void *)-1;
                 }
@@ -3344,11 +3354,17 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
               tree vectype = TREE_TYPE (def_rhs);
               unsigned HOST_WIDE_INT elsz
                 = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (vectype)));
+              /* Initialize actual_len to UINT_MAX so that mask_idx < actual_len
+                 is always true for MASK_STORE.  */
+              unsigned actual_len = UINT_MAX;
+              if (len)
+                actual_len = tree_to_uhwi (len) + tree_to_shwi (bias);
+              unsigned nunits
+                = MIN (actual_len, VECTOR_CST_NELTS (mask).coeffs[0]);
               if (mask)
                 {
                   HOST_WIDE_INT start = 0, length = 0;
-                  unsigned mask_idx = 0;
-                  do
+                  for (unsigned mask_idx = 0; mask_idx < nunits; mask_idx++)
                     {
                       if (integer_zerop (VECTOR_CST_ELT (mask, mask_idx)))
                         {
@@ -3371,9 +3387,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
                         }
                       else
                         length += elsz;
-                      mask_idx++;
                     }
-                  while (known_lt (mask_idx, TYPE_VECTOR_SUBPARTS (vectype)));
                   if (length != 0)
                     {
                       pd.rhs_off = start;
@@ -3389,7 +3403,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
                 {
                   pd.offset = offset2i;
                   pd.size = (tree_to_uhwi (len)
-                             + -tree_to_shwi (bias)) * BITS_PER_UNIT;
+                             + tree_to_shwi (bias)) * BITS_PER_UNIT;
                   if (BYTES_BIG_ENDIAN)
                     pd.rhs_off = pd.size - tree_to_uhwi (TYPE_SIZE (vectype));
                   else
-- 
2.36.1
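
For readers following the loop change in vn_reference_lookup_3, the mask walk can be sketched outside of GCC roughly as below. This is only an illustration under stated assumptions, not the code in the patch: struct lane_run and collect_active_runs are hypothetical names, the mask is a plain int array instead of a VECTOR_CST, len is assumed to count lanes, and each recorded run stands in for one partial definition that the real code would push.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous run of stored bits.  Hypothetical stand-in for the
   partial definition pushed per run; this is not a GCC type.  */
struct lane_run
{
  unsigned long start_bits;
  unsigned long size_bits;
};

/* Hypothetical helper, not part of GCC.  Walk the first
   MIN (len + bias, nelts) lanes of a constant mask and record every
   maximal run of active (nonzero) lanes.  Returns the number of runs
   written to OUT, which is at most MAX_OUT.  */
static size_t
collect_active_runs (const int *mask, size_t nelts,
                     unsigned long len, long bias,
                     unsigned long elsz_bits,
                     struct lane_run *out, size_t max_out)
{
  assert (mask != NULL && out != NULL);

  /* Lanes at or beyond len + bias are never stored (assumption: len
     is measured in lanes here).  */
  long active = (long) len + bias;
  if (active < 0)
    active = 0;
  size_t nunits = (size_t) active < nelts ? (size_t) active : nelts;

  size_t nruns = 0;
  unsigned long start = 0, length = 0;
  for (size_t i = 0; i < nunits; i++)
    {
      if (mask[i] == 0)
        {
          /* An inactive lane closes the current run, if any.  */
          if (length != 0 && nruns < max_out)
            {
              out[nruns].start_bits = start;
              out[nruns].size_bits = length;
              nruns++;
            }
          start = (i + 1) * elsz_bits;
          length = 0;
        }
      else
        length += elsz_bits;
    }

  /* Flush a run that extends to the last lane considered.  */
  if (length != 0 && nruns < max_out)
    {
      out[nruns].start_bits = start;
      out[nruns].size_bits = length;
      nruns++;
    }
  return nruns;
}
```

With an 8-lane mask { 0, 1, 0, 1, 0, 1, 0, 1 }, len = 8, bias = 0 and 32-bit elements, the sketch records four 32-bit runs at bit offsets 32, 96, 160 and 224. Shrinking len cuts the walk short, which is the effect the MIN in the patch is meant to have.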