From: Richard Biener <richard.guenther@gmail.com>
To: Jeff Law <jeffreyalaw@gmail.com>
Cc: Aldy Hernandez <aldyh@redhat.com>, GCC Mailing List <gcc@gcc.gnu.org>
Subject: Re: replacing the backwards threader and more
Date: Mon, 14 Jun 2021 08:40:01 +0200
Message-ID: <CAFiYyc24R2GjXTtPeN0jOQKF6jtQvijnFF1gktzDTMMPMrb1nw@mail.gmail.com>
In-Reply-To: <f127eac2-5d65-390a-ddea-3df1006fdd84@gmail.com>
On Mon, Jun 14, 2021 at 12:02 AM Jeff Law via Gcc <gcc@gcc.gnu.org> wrote:
>
>
>
> On 6/9/2021 5:48 AM, Aldy Hernandez wrote:
> > Hi Jeff. Hi folks.
> >
> > What started as a foray into severing the old (forward) threader's
> > dependency on evrp, turned into a rewrite of the backwards threader
> > code. I'd like to discuss the possibility of replacing the current
> > backwards threader with a new one that gets far more threads and can
> > potentially subsume all threaders in the future.
> >
> > I won't include code here, as it will just detract from the high level
> > discussion. But if it helps, I could post what I have, which just
> > needs some cleanups and porting to the latest trunk changes Andrew has
> > made.
> >
> > Currently the backwards threader works by traversing DEF chains
> > through PHIs leading to possible paths that start in a constant. When
> > such a path is found, it is checked to see if it is profitable, and if
> > so, the constant path is threaded. The current implementation is
> > rather limited since backwards paths must end in a constant. For
> > example, the backwards threader can't get any of the tests in
> > gcc.dg/tree-ssa/ssa-thread-14.c:
> >
> > if (a && b)
> > foo ();
> > if (!b && c)
> > bar ();
> >
> > etc.
> Right. And these kinds of cases are particularly interesting to capture
> -- not only do you remove the runtime test/compare, all the setup code
> usually dies as well. I can't remember who, but someone added some bits
> to detect these cases in DOM a while back and while the number of
> additional jumps threaded wasn't great, the overall impact was much
> better than we initially realized. Instead of allowing removal of a
> single compare/branch, it typically allowed removal of a chain of
> logicals that fed the conditional.
>
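To make the ssa-thread-14.c shape concrete, here is a minimal hand-written sketch of what threading that pair of conditionals amounts to. It is not compiler output; the names original, threaded, run and same_behaviour are hypothetical and exist only for this illustration. On the path where "a && b" holds, b is known to be true, so "!b && c" must be false and the second test can be skipped.

```cpp
#include <cassert>
#include <string>

// Records which functions were called, so two versions can be compared.
static std::string trace;
static void foo () { trace += 'f'; }
static void bar () { trace += 'b'; }

// Same shape as the quoted test: two conditionals that share 'b'.
static void original (bool a, bool b, bool c)
{
  if (a && b)
    foo ();
  if (!b && c)
    bar ();
}

// Hand-threaded equivalent (illustration only, not what GCC emits).
static void threaded (bool a, bool b, bool c)
{
  if (a && b)
    {
      foo ();
      // b is true on this path, so "!b && c" is false: jump past it.
      return;
    }
  if (!b && c)
    bar ();
}

// Run one version on one input and return its call trace.
static std::string run (void (*fn) (bool, bool, bool), bool a, bool b, bool c)
{
  trace.clear ();
  fn (a, b, c);
  return trace;
}

// Check all eight input combinations for identical behaviour.
static bool same_behaviour ()
{
  for (int i = 0; i < 8; i++)
    {
      bool a = (i & 1) != 0, b = (i & 2) != 0, c = (i & 4) != 0;
      if (run (original, a, b, c) != run (threaded, a, b, c))
        return false;
    }
  return true;
}
```

Calling same_behaviour () checks that both versions make the same calls for every combination of a, b and c; the threaded one simply never evaluates the second condition after foo () has run.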
> >
> > After my refactoring patches to the threading code, it is now possible
> > to drop in an alternate implementation that shares the profitability
> > code (is this path profitable?), the jump registry, and the actual
> > jump threading code. I have leveraged this to write a ranger-based
> > threader that gets every single thread the current code gets, plus
> > 90-130% more.
> Sweet.
>
> >
> > Here are the details from the branch, which should be very similar to
> > trunk. I'm presenting the branch numbers because they contain
> > Andrew's upcoming relational query which significantly juices up the
> > results.
> Yea, I'm not surprised that the relational query helps significantly
> here. And I'm not surprised that we can do much better with the
> backwards threader with a rewrite.
>
> Much of the ranger design was done with its use in the backwards jump
> threader in mind. Backwards threading is, IMHO, a much
> better way to think about the problem. The backwards threader also has
> a much stronger region copier -- so we don't have to live with the
> various limitations of the old jump threading approach.
>
>
>
> >
> > New threader:
> > ethread:65043 (+3.06%)
> > dom:32450 (-13.3%)
> > backwards threader:72482 (+89.6%)
> > vrp:40532 (-30.7%)
> > Total threaded: 210507 (+6.70%)
> >
> > This means that the new code gets 89.6% more jump threading
> > opportunities than the code I want to replace. In doing so, it
> > reduces the amount of DOM threading opportunities by 13.3% and by
> > 30.7% from the VRP jump threader. The total improvement across the
> > jump threading opportunities in the compiler is 6.70%.
> This looks good at first glance. It's worth noting that the backwards
> threader runs before the others, so, yea, as it captures more stuff I
> would expect DOM/VRP to capture fewer things. It would be interesting
> to know the breakdown of things caught by VRP1/VRP2 and how much of that
> is secondary opportunities that are only appearing because we've done a
> better job earlier.
>
> And just to be clear, I expect that we're going to leave some of those
> secondary opportunities on the table -- we just don't want it to be too
> many :-) When I last looked at this my sense was wiring the backwards
> threader and ranger together should be enough to subsume VRP1/VRP2 jump
> threading.
>
> >
> > However, these are pessimistic numbers...
> >
> > I have noticed that some of the threading opportunities that DOM and
> > VRP now get are not because they're smarter, but because they're
> > picking up opportunities that the new code exposes. I experimented
> > with running an iterative threader, and then seeing what VRP and DOM
> > could actually get. This is too expensive to do in real life, but it
> > at least shows what the effect of the new code is on DOM/VRP's abilities:
> >
> > Iterative threader:
> > ethread:65043 (+3.06%)
> > dom:31170 (-16.7%)
> > thread:86717 (+127%)
> > vrp:33851 (-42.2%)
> > Total threaded: 216781 (+9.90%)
> >
> > This means that the new code not only gets 127% more cases, but it
> > reduces the DOM and VRP opportunities considerably (16.7% and 42.2%
> > respectively). The end result is that we have the possibility of
> > getting almost 10% more jump threading opportunities in the entire
> > compilation run.
> >
> > (Note that the new code gets even more opportunities, but I'm only
> > reporting the profitable ones that made it all the way through to the
> > threader backend, and actually eliminated a branch.)
> Thanks for clarifying that. It was one of the questions that first
> popped into my mind.
>
> >
> > The overall compilation hit from this work is currently 1.38% as
> > measured by callgrind. We should be able to reduce this a bit, plus
> > we could get some of that back if we can replace the DOM and VRP
> > threaders (future work).
> Given how close we were to dropping the VRP threaders before, I would
> support dropping them at the same time. That gives you a bit more
> compile-time budget.
>
> >
> > My proposed implementation should be able to get any threading
> > opportunity, and will get more as range-ops and ranger improve.
> >
> > I can go into the details if necessary, but the gist of it is that we
> > leverage the import facility in the ranger to only look up paths that
> > have a direct repercussion in the conditional being threaded, thus
> > reducing the search space. This enhanced path discovery, plus an
> > engine to resolve conditionals based on knowledge from a CFG path, is
> > all that is needed to register new paths. There is no limit to how
> > far back we look, though in practice, we stop looking once a path is
> > too expensive to continue the search in a given direction.
> Right. That's one of the great things about Ranger -- it can
> dramatically reduce the search space.
>
> >
> > The solver API is simple:
> >
> > // This class is a thread path solver. Given a set of BBs indicating
> > // a path through the CFG, range_in_path() will return the range
> > // of an SSA as if the BBs in the path would have been executed in
> > // order.
> > //
> > // Note that the blocks are in reverse order, thus the exit block is
> > // path[0].
> >
> > class thread_solver : gori_compute
> > {
> >
> > public:
> > thread_solver (gimple_ranger &ranger);
> > virtual ~thread_solver ();
> > void set_path (const vec<basic_block> *, const bitmap_head *imports);
> > void range_in_path (irange &, tree name);
> > void range_in_path (irange &, gimple *);
> > ...
> > };
> >
> > Basically, as we're discovering paths, we ask the solver what the
> > value of the final conditional in a BB is in a given path. If it
> > resolves, we register the path.
> Exactly. Given a path, do we know enough to resolve the conditional at
> the end. If so register the path as a potential jump threading opportunity.
>
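The "given a path, can we resolve the final conditional" idea can be sketched with a toy model. This is not the thread_solver above and uses none of ranger's real API: toy_range, toy_block, verdict, resolve_gt and example_path are all hypothetical, a single integer name x is tracked, and each block only carries the constraint implied by the edge taken into it. Only the path[0]-is-the-exit convention is taken from the quoted comment.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Toy closed interval [lo, hi]; a stand-in for irange, not GCC's class.
struct toy_range
{
  long lo, hi;
  bool empty () const { return lo > hi; }
};

static const toy_range toy_full = { LONG_MIN, LONG_MAX };

static toy_range intersect (toy_range a, toy_range b)
{
  return { std::max (a.lo, b.lo), std::min (a.hi, b.hi) };
}

// One block on the path. on_entry is the constraint on x implied by the
// edge taken into this block from its predecessor on the path.
struct toy_block
{
  toy_range on_entry;
};

// Range of x as if the blocks had executed in order. The path is stored
// in reverse, so path[0] is the exit block; walk it from the back.
static toy_range range_in_path (const std::vector<toy_block> &path,
                                toy_range initial)
{
  toy_range r = initial;
  for (auto it = path.rbegin (); it != path.rend (); ++it)
    r = intersect (r, it->on_entry);
  return r;
}

enum class verdict { always_true, always_false, unknown };

// Try to resolve the exit conditional "x > k" from the path range.
static verdict resolve_gt (toy_range r, long k)
{
  if (r.empty ())
    return verdict::unknown;      // infeasible path, nothing to register
  if (r.lo > k)
    return verdict::always_true;
  if (r.hi <= k)
    return verdict::always_false;
  return verdict::unknown;
}

// Example path, exit block first: the exit is entered on "x <= 100",
// the middle block on "x > 10", and the entry block is unconstrained.
static std::vector<toy_block> example_path ()
{
  return { { { LONG_MIN, 100 } }, { { 11, LONG_MAX } }, { toy_full } };
}
```

With the example path, x is known to lie in [11, 100] at the exit, so a final test of "x > 5" resolves to true and the path would be worth registering, while "x > 50" stays unknown and the path would be dropped.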
> >
> > A follow-up project would be to analyze what DOM/VRP are actually
> > getting that we don't, because in theory with an enhanced ranger, we
> > should be able to get everything they do (minus some float stuff, and
> > some CSE things DOM does). However, IMO, this is good enough to at
> > least replace the current backwards threading code.
> I bet it's going to be tougher to remove DOM's threader. It knows how
> to do things like resolve memory references using temporary equivalences
> and such. But I bet it's enough to drop the VRP based threaders.
Yes. In fact I am wondering if adding threading to the non-iterating FRE
would make it possible to drop DOM, replacing it with instances of FRE.
Richard.