From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by sourceware.org (Postfix) with ESMTP id 3796A38582B0 for ; Thu, 7 Jul 2022 17:32:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3796A38582B0 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=kernel.crashing.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kernel.crashing.org Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 267HVfOv010787; Thu, 7 Jul 2022 12:31:41 -0500 Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id 267HVf9v010786; Thu, 7 Jul 2022 12:31:41 -0500 X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f Date: Thu, 7 Jul 2022 12:31:40 -0500 From: Segher Boessenkool To: HAO CHEN GUI Cc: gcc-patches , David , "Kewen.Lin" , Peter Bergner Subject: Re: [PATCH v2] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453] Message-ID: <20220707173140.GY25951@gate.crashing.org> References: <368de06c-f6d6-e759-0f91-5df170687346@linux.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <368de06c-f6d6-e759-0f91-5df170687346@linux.ibm.com> User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-3.2 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2022 17:32:43 -0000 Hi! On Thu, Jul 07, 2022 at 04:30:50PM +0800, HAO CHEN GUI wrote: > This patch modifies the combine pattern after recog fails. With a helper It modifies combine itself, not just a pattern in the machine description. > - change_pseudo_and_mask, it converts a single pseudo to the pseudo AND with > a mask when the outer operator is IOR/XOR/PLUS and inner operator is ASHIFT > or AND. The conversion helps pattern to match rotate and mask insn on some > targets. > For test case rlwimi-2.c, current trunk fails on > "scan-assembler-times (?n)^\\s+[a-z]". It reports 21305 times. So my patch > reduces the total number of insns from 21305 to 21279. That is incorrect. You need to figure out what actually changed, and if that is wanted or not, and then write some explanation about that. > * config/rs6000/rs6000.md (plus_ior_xor): Removed. > (anonymous split pattern for plus_ior_xor): Removed. "Remove.", in both cases. Always use imperative in changelogs and commit messages and the like, not some passive tense. > +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is > + ASHIFT/AND, convert a pseudo to psuedo AND with a mask if its nonzero_bits s/psuedo/pseudo/ > + is less than its mode mask. The nonzero_bits in other pass doesn't return > + the same value as it does in combine pass. */ That isn't quite the problem. Later passes can return a mask for nonzero_bits (which means: bits that are not known to be zero) that is not a superset of what was known during combine. If you use nonzero_bits in the insn condition of a define_insn (or define_insn_and_split, same thing under the covers) you then can end up with an insns that is fine during combine, but no longer recog()ed later. > +static bool > +change_pseudo_and_mask (rtx pat) > +{ > + rtx src = SET_SRC (pat); > + if ((GET_CODE (src) == IOR > + || GET_CODE (src) == XOR > + || GET_CODE (src) == PLUS) > + && (((GET_CODE (XEXP (src, 0)) == ASHIFT > + || GET_CODE (XEXP (src, 0)) == AND) > + && REG_P (XEXP (src, 1))))) > + { > + rtx *reg = &XEXP (SET_SRC (pat), 1); Why the extra indirection? SUBST is a macro, it can take lvalues just fine :-) > + machine_mode mode = GET_MODE (*reg); > + unsigned HOST_WIDE_INT nonzero = nonzero_bits (*reg, mode); > + if (nonzero < GET_MODE_MASK (mode)) > + { > + int shift; > + > + if (GET_CODE (XEXP (src, 0)) == ASHIFT) > + shift = INTVAL (XEXP (XEXP (src, 0), 1)); > + else > + shift = ctz_hwi (INTVAL (XEXP (XEXP (src, 0), 1))); > + > + if (shift > 0 > + && ((HOST_WIDE_INT_1U << shift) - 1) >= nonzero) Too many parens. > + { > + unsigned HOST_WIDE_INT mask = (HOST_WIDE_INT_1U << shift) - 1; > + rtx x = gen_rtx_AND (mode, *reg, GEN_INT (mask)); > + SUBST (*reg, x); > + maybe_swap_commutative_operands (SET_SRC (pat)); > + return true; > + } > + } > + } > + return false; Broken indentation. > --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c > +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c > @@ -12,7 +12,7 @@ void rotins (unsigned int x) > b.y = (x<<12) | (x>>20); > } > > -/* { dg-final { scan-assembler-not {\mrlwinm} } } */ > +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */ Why? > +/* { dg-final { scan-assembler-times {\mrlwimi\M} 2 { target ilp32 } } } */ > +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 { target lp64 } } } */ Can this just be /* { dg-final { scan-assembler-times {\mrl[wd]imi\M} 2 } } */ or is it necessary to not want rlwimi on 64-bit? > --- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c > @@ -2,14 +2,14 @@ > /* { dg-options "-O2" } */ > > /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 14121 { target ilp32 } } } */ > -/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 20217 { target lp64 } } } */ > +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 21279 { target lp64 } } } */ You are saying there should be 21279 instructions generated by this test case. What makes you say that? Yes, we regressed some time ago, we generate too many insns in many cases, but that is *bad*. > /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } } } */ > -/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } } */ > +/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } } */ This needs an explanation (and then the 32-bit and 64-bit checks can be merged). This probably needs changes after 4306339798b6 (if it is still wanted?) Segher