From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id F3BB23858002 for ; Thu, 9 Mar 2023 12:10:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F3BB23858002 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 352934B3; Thu, 9 Mar 2023 04:11:36 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4230C3F67D; Thu, 9 Mar 2023 04:10:52 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org,segher@kernel.crashing.org, richard.sandiford@arm.com Cc: segher@kernel.crashing.org Subject: [PATCH v2 2/2] combine: Try harder to form zero_extends [PR106594] References: Date: Thu, 09 Mar 2023 12:10:51 +0000 In-Reply-To: (Richard Sandiford's message of "Thu, 09 Mar 2023 12:08:33 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-33.6 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_SHORT,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: g:c23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f uses nonzero_bits information to convert sign_extends into zero_extends. That change is semantically correct in itself, but for the testcase in the PR, it leads to a series of unfortunate events, as described below. We try to combine: Trying 24 -> 25: 24: r116:DI=sign_extend(r115:SI) REG_DEAD r115:SI 25: r117:SI=[r116:DI*0x4+r118:DI] REG_DEAD r116:DI REG_EQUAL [r116:DI*0x4+`constellation_64qam'] which previously succeeded, giving: (set (reg:SI 117 [ constellation_64qam[_5] ]) (mem/u:SI (plus:DI (mult:DI (sign_extend:DI (reg:SI 115)) (const_int 4 [0x4])) (reg/f:DI 118)) [1 constellation_64qam[_5]+0 S4 A32])) However, nonzero_bits can tell that only the low 6 bits of r115 can be nonzero. The commit above therefore converts the sign_extend to a zero_extend. Using the same nonzero_bits information, we then "expand" the zero_extend to: (and:DI (subreg:DI (reg:SI r115) 0) (const_int 63)) Substituting into the mult gives the unsimplified expression: (mult:DI (and:DI (subreg:DI (reg:SI r115) 0) (const_int 63)) (const_int 4)) The simplification rules for mult convert this to an ashift by 2. Then, this rule in simplify_shift_const_1: /* If we have (shift (logical)), move the logical to the outside to allow it to possibly combine with another logical and the shift to combine with another shift. This also canonicalizes to what a ZERO_EXTRACT looks like. Also, some machines have (and (shift)) insns. */ moves the shift inside the "and", so that the expression becomes: (and:DI (ashift:DI (subreg:DI (reg:SI r115) 0) (const_int 2)) (const_int 252)) We later recanonicalise to a mult (since this is an address): (and:DI (mult:DI (subreg:DI (reg:SI r115) 0) (const_int 4)) (const_int 252)) But we fail to transform this back to the natural substitution: (mult:DI (zero_extend:DI (reg:SI r115)) (const_int 4)) There are several other cases in which make_compound_operation needs to look more than one level down in order to complete a compound operation. For example: (a) the ashiftrt handling uses extract_left_shift to look through things like logic ops in order to find a partnering ashift operation (b) the "and" handling looks through subregs, xors and iors to find a partnerning lshiftrt This patch takes the same approach for mult. gcc/ PR rtl-optimization/106594 * combine.cc (make_compound_operation_and): Look through multiplications by a power of two. gcc/testsuite/ * gcc.target/aarch64/pr106594.c: New test. --- gcc/combine.cc | 17 +++++++++++++++++ gcc/testsuite/gcc.target/aarch64/pr106594.c | 20 ++++++++++++++++++++ 2 files changed, 37 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr106594.c diff --git a/gcc/combine.cc b/gcc/combine.cc index 7d446d02cb4..36d04ad6703 100644 --- a/gcc/combine.cc +++ b/gcc/combine.cc @@ -7996,6 +7996,23 @@ make_compound_operation_and (scalar_int_mode mode, rtx x, break; } + case MULT: + /* Recurse through a power of 2 multiplication (as can be found + in an address), using the relationship: + + (and (mult X 2**N1) N2) == (mult (and X (lshifrt N2 N1)) 2**N1). */ + if (CONST_INT_P (XEXP (x, 1)) + && pow2p_hwi (INTVAL (XEXP (x, 1)))) + { + int shift = exact_log2 (INTVAL (XEXP (x, 1))); + rtx sub = make_compound_operation_and (mode, XEXP (x, 0), + mask >> shift, in_code, + next_code); + if (sub) + return gen_rtx_MULT (mode, sub, XEXP (x, 1)); + } + break; + default: break; } diff --git a/gcc/testsuite/gcc.target/aarch64/pr106594.c b/gcc/testsuite/gcc.target/aarch64/pr106594.c new file mode 100644 index 00000000000..beda8e050a5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr106594.c @@ -0,0 +1,20 @@ +/* { dg-options "-O2" } */ + +extern const int constellation_64qam[64]; + +void foo(int nbits, + const char *p_src, + int *p_dst) { + + while (nbits > 0U) { + char first = *p_src++; + + char index1 = ((first & 0x3) << 4) | (first >> 4); + + *p_dst++ = constellation_64qam[index1]; + + nbits--; + } +} + +/* { dg-final { scan-assembler {ldr\tw[0-9]+, \[x[0-9]+, w[0-9]+, [su]xtw #?2\]} } } */ -- 2.25.1