Re: [PATCH] PR 91865: Avoid ZERO_EXTEND of ZERO_EXTEND in make_compound_operation.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Jeff Law <jeffreyalaw@gmail.com>
To: Roger Sayle <roger@nextmovesoftware.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] PR 91865: Avoid ZERO_EXTEND of ZERO_EXTEND in make_compound_operation.
Date: Mon, 16 Oct 2023 13:27:22 -0600	[thread overview]
Message-ID: <e78f2a27-401d-49f7-9454-dad9f3b835a3@gmail.com> (raw)
In-Reply-To: <007b01d9ff4c$ea0231b0$be069510$@nextmovesoftware.com>



On 10/15/23 03:49, Roger Sayle wrote:
> 
> Hi Jeff,
> Thanks for the speedy review(s).
> 
>> From: Jeff Law <jeffreyalaw@gmail.com>
>> Sent: 15 October 2023 00:03
>> To: Roger Sayle <roger@nextmovesoftware.com>; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH] PR 91865: Avoid ZERO_EXTEND of ZERO_EXTEND in
>> make_compound_operation.
>>
>> On 10/14/23 16:14, Roger Sayle wrote:
>>>
>>> This patch is my proposed solution to PR rtl-optimization/91865.
>>> Normally RTX simplification canonicalizes a ZERO_EXTEND of a
>>> ZERO_EXTEND to a single ZERO_EXTEND, but as shown in this PR it is
>>> possible for combine's make_compound_operation to unintentionally
>>> generate a non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is
>>> unlikely to be matched by the backend.
>>>
>>> For the new test case:
>>>
>>> const int table[2] = {1, 2};
>>> int foo (char i) { return table[i]; }
>>>
>>> compiling with -O2 -mlarge on msp430 we currently see:
>>>
>>> Trying 2 -> 7:
>>>       2: r25:HI=zero_extend(R12:QI)
>>>         REG_DEAD R12:QI
>>>       7: r28:PSI=sign_extend(r25:HI)#0
>>>         REG_DEAD r25:HI
>>> Failed to match this instruction:
>>> (set (reg:PSI 28 [ iD.1772 ])
>>>       (zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]))))
>>>
>>> which results in the following code:
>>>
>>> foo:    AND     #0xff, R12
>>>           RLAM.A #4, R12 { RRAM.A #4, R12
>>>           RLAM.A  #1, R12
>>>           MOVX.W  table(R12), R12
>>>           RETA
>>>
>>> With this patch, we now see:
>>>
>>> Trying 2 -> 7:
>>>       2: r25:HI=zero_extend(R12:QI)
>>>         REG_DEAD R12:QI
>>>       7: r28:PSI=sign_extend(r25:HI)#0
>>>         REG_DEAD r25:HI
>>> Successfully matched this instruction:
>>> (set (reg:PSI 28 [ iD.1772 ])
>>>       (zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ]))) allowing
>>> combination of insns 2 and 7 original costs 4 + 8 = 12 replacement
>>> cost 8
>>>
>>> foo:    MOV.B   R12, R12
>>>           RLAM.A  #1, R12
>>>           MOVX.W  table(R12), R12
>>>           RETA
>>>
>>>
>>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>>> and make -k check, both with and without --target_board=unix{-m32}
>>> with no new failures.  Ok for mainline?
>>>
>>> 2023-10-14  Roger Sayle  <roger@nextmovesoftware.com>
>>>
>>> gcc/ChangeLog
>>>           PR rtl-optimization/91865
>>>           * combine.cc (make_compound_operation): Avoid creating a
>>>           ZERO_EXTEND of a ZERO_EXTEND.
>>>
>>> gcc/testsuite/ChangeLog
>>>           PR rtl-optimization/91865
>>>           * gcc.target/msp430/pr91865.c: New test case.
>> Neither an ACK or NAK at this point.
>>
>> The bug report includes a patch from Segher which purports to fix this in simplify-
>> rtx.  Any thoughts on Segher's approach and whether or not it should be
>> considered?
>>
>> The BZ also indicates that removal of 2 patterns from msp430.md would solve this
>> too (though it may cause regressions elsewhere?).  Any thoughts on that approach
>> as well?
>>
> 
> Great questions.  I believe Segher's proposed patch (in comment #4) was an
> msp430-specific proof-of-concept workaround rather than intended to be fix.
> Eliminating a ZERO_EXTEND simply by changing the mode of a hard register
> is not a solution that'll work on many platforms (and therefore not really suitable
> for target-independent middle-end code in the RTL optimizers).
Thanks.  I didn't really look at Segher's patch, so thanks for digging 
into it.  Certainly just flipping the mode of the hard register isn't 
correct.


> 
> The underlying issue, which is applicable to all targets, is that combine.cc's
> make_compound_operation is expected to reverse the local transformations
> made by expand_compound_operation.  Hence, if an RTL expression is
> canonical going into expand_compound_operation, it is expected (hoped)
> to be canonical (and equivalent) coming out of make_compound_operation.
In theory, correct.


> 
> Hence, rather than be a MSP430 specific issue, no target should expect (or
> be expected to see) a ZERO_EXTEND of a ZERO_EXTEND, or a SIGN_EXTEND
> of a ZERO_EXTEND in the RTL stream.  Much like a binary operator with two
> CONST_INT operands, or a shift by zero, it's something the middle-end might
> reasonably be expected to clean-up. [Yeah, I know... 😊]
Agreed.



> 
>>> (set (reg:PSI 28 [ iD.1772 ])
>>>       (zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]))))
> 
> As a rule of thumb, if the missed optimization bug report includes combine's
> diagnostic "Failed to match this instruction:", things can be improved by adding
> a pattern (often a define_insn_and_split) that matches the shown RTL.
Yes, but we also need to ponder if that's the right way to fix any given 
problem.  Sometimes we're going to be better off simplifying elsewhere 
in the pipeline.  I think we can agree this is one of the cases where 
matching the RTL in the backend is not the desired approach.

Occasionally things like those two patterns show up for various reasons. 
  Hopefully they can be removed :-)  Though the first looks awful close 
to something I did for the mn102 (not to be confused with the mn103) 
eons ago.  Partial modes aren't exactly handled well.

> 
> In this case the perhaps reasonable assumption is that the above would/should
> (normally) match the backend's existing (zero_extend:PSI (reg:QI ...)) insn pattern.
> Or that's my understanding of why this PR is classified as rtl-optimization and
> not target.
I wouldn't put a lot of faith in the classification ;-)


> 
> Finally, my patch shouldn't block a (updated corrected) variant of Segher's patch
> or other solution to PR 91865.  The more (safe) solutions the merrier.
Generally agreed.

jeff

next prev parent reply	other threads:[~2023-10-16 19:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-14 22:14 Roger Sayle
2023-10-14 23:02 ` Jeff Law
2023-10-15  9:49   ` Roger Sayle
2023-10-16 19:27     ` Jeff Law [this message]
2023-10-17  8:43       ` Richard Biener
2023-10-19 15:20 ` Jeff Law
2023-10-25  9:21   ` [PATCH v2] " Roger Sayle
2023-10-25 16:26     ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e78f2a27-401d-49f7-9454-dad9f3b835a3@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=roger@nextmovesoftware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).