From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1130)
 id 99BAA385801F; Tue, 18 Jan 2022 12:20:09 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 99BAA385801F
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Richard Sandiford
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r12-6669] aarch64: Fix overly optimistic LDP/STP matching [PR104005]
X-Act-Checkin: gcc
X-Git-Author: Richard Sandiford
X-Git-Refname: refs/heads/master
X-Git-Oldrev: d21db05b6f44f8cb6df8da5af276df0c4bb3a6c9
X-Git-Newrev: 38ec23fafb167ddfe840d7bb22b3e943d8a7d29e
Message-Id: <20220118122009.99BAA385801F@sourceware.org>
Date: Tue, 18 Jan 2022 12:20:09 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 18 Jan 2022 12:20:09 -0000

https://gcc.gnu.org/g:38ec23fafb167ddfe840d7bb22b3e943d8a7d29e

commit r12-6669-g38ec23fafb167ddfe840d7bb22b3e943d8a7d29e
Author: Richard Sandiford
Date:   Tue Jan 18 12:20:00 2022 +0000

    aarch64: Fix overly optimistic LDP/STP matching [PR104005]

    In g:526e1639aa76b0a8496b0dc3a3ff2c450229544e I'd added support
    for finding more consecutive MEMs.  However, the check was too
    eager, in that it matched MEM_REFs with the same base address
    even if that base address was an arbitrary SSA name.  This can
    give wrong results if a MEM_REF from one loop iteration is
    compared with a MEM_REF from another (e.g. after rtl unrolling).

    In principle, we could still accept MEM_REFs based on the same
    incoming SSA name, but there seems to be no out-of-the-box API
    for doing that.  Adding a new one at this stage in GCC 12 doesn't
    feel like a good risk/reward trade-off.

    This patch therefore restricts the MEM_EXPR comparison to base
    decls only, excluding all MEM_REFs.
    It means we lose all the new STPs in the PR testcase but keep
    the ones in the original stp_1.c testcase.

    gcc/
            PR target/104005
            * config/aarch64/aarch64.cc (aarch64_check_consecutive_mems): When
            using MEM_EXPR, require the base to be a decl.

    gcc/testsuite/
            PR target/104005
            * gcc.target/aarch64/pr104005.c: New test.

Diff:
---
 gcc/config/aarch64/aarch64.cc               |  1 +
 gcc/testsuite/gcc.target/aarch64/pr104005.c | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index fdf0c9bd5b8..296145e6008 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24747,6 +24747,7 @@ aarch64_check_consecutive_mems (rtx *mem1, rtx *mem2, bool *reversed)
 							&expr_offset2);
 
   if (!expr_base1
       || !expr_base2
+      || !DECL_P (expr_base1)
       || !operand_equal_p (expr_base1, expr_base2, OEP_ADDRESS_OF))
     return false;
diff --git a/gcc/testsuite/gcc.target/aarch64/pr104005.c b/gcc/testsuite/gcc.target/aarch64/pr104005.c
new file mode 100644
index 00000000000..09dd81910eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr104005.c
@@ -0,0 +1,17 @@
+/* { dg-options "-O2 -funroll-loops" } */
+
+typedef int v2 __attribute__((vector_size(8)));
+
+void f(void) {
+  v2 v[1024];
+  v2 *ptr = v;
+  for (int i = 0; i < 512; ++i)
+    {
+      ptr[0][0] = 0;
+      asm volatile ("":::"memory");
+      ptr[0][1] = 1;
+      ptr += 2;
+    }
+}
+
+/* { dg-final { scan-assembler-not {\tstp\t} } } */
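
As an illustration of the failure mode described in the commit message, here is a minimal toy model in plain C.  It is only a sketch and is not GCC code: struct toy_mem, its fields and the toy_* functions are hypothetical names invented for this example, and the addresses are made-up numbers.  It assumes that, after unrolling, stores from different iterations still name the same pointer as their base even though that pointer has a different run-time value in each iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for what can be recovered from a MEM_EXPR.  Every name
   in this file is hypothetical; GCC represents this differently.  */
struct toy_mem
{
  const char *base;   /* Name of the base (a decl or an SSA-like pointer).  */
  bool base_is_decl;  /* True if BASE names a declaration such as an array.  */
  long offset;        /* Byte offset from the base.  */
  long size;          /* Access size in bytes.  */
  long address;       /* Made-up run-time address, used as ground truth.  */
};

/* Over-eager check: same base name and adjacent offsets.  */
static bool
toy_consecutive_any_base (const struct toy_mem *m1, const struct toy_mem *m2)
{
  return (strcmp (m1->base, m2->base) == 0
          && m1->offset + m1->size == m2->offset);
}

/* Restricted check: additionally require the base to be a declaration.  */
static bool
toy_consecutive_decl_only (const struct toy_mem *m1, const struct toy_mem *m2)
{
  return m1->base_is_decl && toy_consecutive_any_base (m1, m2);
}

/* Ground truth: are the two accesses really adjacent in memory?  */
static bool
toy_really_consecutive (const struct toy_mem *m1, const struct toy_mem *m2)
{
  return m1->address + m1->size == m2->address;
}

/* Walk through the two cases; returns 0 if every expectation holds.  */
int
toy_demo (void)
{
  /* ptr[0][0] from iteration k+1 and ptr[0][1] from iteration k.  Both
     name the base "ptr", but ptr advanced by 16 bytes in between.  */
  struct toy_mem next_lo = { "ptr", false, 0, 4, 1016 };
  struct toy_mem this_hi = { "ptr", false, 4, 4, 1004 };

  assert (toy_consecutive_any_base (&next_lo, &this_hi));   /* False match.  */
  assert (!toy_really_consecutive (&next_lo, &this_hi));
  assert (!toy_consecutive_decl_only (&next_lo, &this_hi)); /* Rejected.  */

  /* Two accesses to a declared array at fixed offsets remain safe.  */
  struct toy_mem v_lo = { "v", true, 0, 4, 1000 };
  struct toy_mem v_hi = { "v", true, 4, 4, 1004 };

  assert (toy_consecutive_decl_only (&v_lo, &v_hi));
  assert (toy_really_consecutive (&v_lo, &v_hi));
  return 0;
}
```

Calling toy_demo () from a test harness exercises both cases.  The restricted check gives up some valid pairs (two accesses through the same pointer within one iteration), which corresponds to the lost STPs mentioned in the commit message.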