public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <jeffreyalaw@gmail.com>
To: gcc-patches@gcc.gnu.org, jlaw@ventanamicro.com,
	rdapp.gcc@gmail.com, richard.sandiford@arm.com
Subject: Re: [PATCH] Add a late-combine pass [PR106594]
Date: Sun, 7 Jan 2024 22:03:37 -0700	[thread overview]
Message-ID: <328721cb-13f6-402a-b967-a085a81393bd@gmail.com> (raw)
In-Reply-To: <mptil473cq9.fsf@arm.com>



On 1/5/24 10:35, Richard Sandiford wrote:
> Jeff Law <jeffreyalaw@gmail.com> writes:
>> On 10/24/23 12:49, Richard Sandiford wrote:
>>> This patch adds a combine pass that runs late in the pipeline.
>>> There are two instances: one between combine and split1, and one
>>> after postreload.
>> So have you done any investigation on cases caught by your new pass
>> between combine and split1 to characterize them?  In particular do they
>> point at solvable problems in combine?  Or do you forsee this subsuming
>> the old combiner pass at some point in the distant future?
> 
> Examples like the PR are the main motivation for the pre-RA pass.
> There we had an extension that could be combined into an address,
> but no longer was after GCC 13.
> 
> The PR itself could in principle be fixed in combine (various
> approaches were suggested, but not accepted).  But the same problem
> applies to multiple uses of extensions.  fwprop can't handle it because
> individual propagations are not a win in isolation.  And combine has
> a limit of combining 4 insns (with a maximum of 2 output insns, IIRC).
> So I don't think either of the existing passes scale to the general case.
Oh, that discussion :(


>
> 
>> rth and I sketched out an SSA based RTL combine at some point in the
>> deep past.  The key goal we were trying to achieve was combining across
>> blocks.  We didn't have a functioning RTL SSA form at the time, so it
>> never went to any implementation work.  It looks like yours would solve
>> the class of problems rth and I were considering.
> 
> Yeah, I do see some cases where combining across blocks helps.
> The case above is one example of that.  Another is:
Great.  The cases I think rth and I were looking at were inspired by a 
talk at one of the early Cauldrons -- whichever one we had in 
California.  Someone did a talk on cross-block combinations, mostly 
motivated by missed combinations on ARM.


> 
>>> The patch therefore enables the pass by default only on AArch64.
>>> However, I did test the patch with it enabled on x86_64-linux-gnu
>>> as well, which was useful for debugging.
>>>
>>> Bootstrapped & regression-tested on aarch64-linux-gnu and
>>> x86_64-linux-gnu (as posted, with no regressions, and with the
>>> pass enabled by default, with some gcc.target/i386 regressions).
>>> OK to install?
>> I'm going to adjust this slightly so that it's enabled across the board
>> and throw it into the tester tomorrow (tester is busy tonight).  Even if
>> we make it opt-in on a per-port basis, the alternate target testing does
>> seems to find stuff that needs fixing ;-)
> 
> Thanks!  As per our off-list discussion, the cris-elf failures showed
> up a bug in the handling of call arguments.  Here's an updated version
> with that fixed.
Perfect.  I'll spin that in the tester overnight.  Hopefully that fixes 
the other ports that failed to build (as opposed to the testsuite 
failures which I think you've covered as not an issue with the new 
combine pass, but which are instead port issues.




> 
> Yeah.  If I'd posted this earlier in stage 1 (rather than October),
> I might have tried teaching shorten_branches how to handle this.
> But it felt like it could be a bit of a rabbit hole at this stage.
> 
>> Nothing jumps out at horribly wrong.  You might want/need to reject
>> frame related insns in optimizable_set, though I guess if the dwarf2
>> writer isn't complaining, then we haven't mucked things up too bad.
> 
> Ah, yeah, I wondered about that.  But I suppose most prologue insns
> don't really create combination opportunities, since most of them
> either set up the stack (which we wouldn't combine anyway) or sink
> incoming data.  So if there cases where this is a problem, it might
> unfortunately be a case of "see what breaks".
The other issue that's been in the back of my mind is costing.  But I 
think the model here is combine without regards to cost.

Let's let the tester chew on the updated version overnight, see what 
else may pop out, but barring any big surprises, I think we're good to go.



jeff

  reply	other threads:[~2024-01-08  5:03 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-24 18:49 Richard Sandiford
2023-11-30 14:10 ` Ping: " Richard Sandiford
2023-12-11 15:23 ` Richard Sandiford
2023-12-11 16:18   ` Robin Dapp
2023-12-30 15:35 ` Ping^3: " Richard Sandiford
2024-01-01  3:11   ` YunQiang Su
2024-01-05 10:10     ` YunQiang Su
2023-12-30 18:13 ` Segher Boessenkool
2024-01-02  9:47   ` Richard Sandiford
2024-06-24 19:37     ` Segher Boessenkool
2024-06-25 10:31       ` Richard Biener
2024-06-25 17:22         ` YunQiang Su
2024-01-03  4:20 ` Jeff Law
2024-01-05 17:35   ` Richard Sandiford
2024-01-08  5:03     ` Jeff Law [this message]
2024-01-08 11:52       ` Richard Sandiford
2024-01-08 16:14         ` Jeff Law
2024-01-08 16:59           ` Richard Sandiford
2024-01-08 17:10             ` Jeff Law
2024-01-08 19:11               ` Richard Sandiford
2024-01-08 21:42                 ` Jeff Law
2024-01-10 13:01     ` Richard Sandiford
2024-01-10 13:35       ` Richard Biener
2024-01-10 16:27         ` Jeff Law
2024-01-10 16:40       ` Jeff Law
2024-06-21  4:50 ` Hongtao Liu
2024-06-27 16:49 ` nvptx vs. " Thomas Schwinge
2024-06-27 20:27   ` Thomas Schwinge
2024-06-27 21:20     ` Thomas Schwinge
2024-06-27 22:41       ` Thomas Schwinge
2024-06-28 14:01         ` Richard Sandiford
2024-06-28 16:48           ` Richard Sandiford
2024-07-01 11:55             ` Thomas Schwinge
2024-07-01 11:55         ` WIP Move 'pass_fast_rtl_dce' from 'pass_postreload' into 'pass_late_compilation' (was: nvptx vs. [PATCH] Add a late-combine pass [PR106594]) Thomas Schwinge
2024-06-28  6:07       ` nvptx vs. [PATCH] Add a late-combine pass [PR106594] Roger Sayle

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=328721cb-13f6-402a-b967-a085a81393bd@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jlaw@ventanamicro.com \
    --cc=rdapp.gcc@gmail.com \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).