[RFC] Remove VRP threader passes in exchange for better threading pre-VRP.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Aldy Hernandez <aldyh@redhat.com>
To: Jeff Law <jeffreyalaw@gmail.com>,
	Richard Biener <richard.guenther@gmail.com>,
	GCC patches <gcc-patches@gcc.gnu.org>
Subject: [RFC] Remove VRP threader passes in exchange for better threading pre-VRP.
Date: Mon, 18 Oct 2021 15:41:17 +0200	[thread overview]
Message-ID: <20211018134117.410106-1-aldyh@redhat.com> (raw)

The jump threading bits seem to have stabilized.  The one or two open
PRs can be fixed by the pending loop threading restrictions to loop
rotation and loop headers.  With all the pieces in play, we can
finally explore altering the pipeline to reduce the jump threading
passes.

I know the jump threaders have become a confusing mess, and I have
added to this sphaghetti, but please bear with me, the goal is to
reduce the threaders to one code base.

As a quick birds-eye view, we have 2 engines:

1. The new backward threader using a path solver based on the ranger.
It can work in two modes-- a quick mode that assumes any SSA outside
the path is VARYING, and a fully resolving mode.

All the *.thread passes are running with this engine, but in quick
mode.

2. The old-old forward threader used by VRP and DOM.  The DOM client
uses its internal structures as well as evrp to resolve conditionals.
Whereas the VRP threader uses the engine in #1 in fully resolving
mode.

The VRP threaders are running with this engine, but using the solver
in #1.  That is, the VRP threaders use the old forward threader for
path discovery, but the new backward threader to solve candidate
paths (hybrid-threader).  This was always a stop-gap while we moved
everyone to #1.

The DOM threader is the last remaining threader with no changes
whatsoever from the previous release.  It uses the forward threader,
with all the evrp + DOM goo.

It doesn't matter if you're all confused, we're about to make things
simpler.

I've been experimenting with reducing the total number of threading
passes, and I'd like to see if there's consensus/stomach for altering
the pipeline.  Note, that the goal is to remove forward threader clients,
not the other way around.  So, we should prefer to remove a VRP threader
instance over a *.thread one immediately before VRP.

After some playing, it looks like if we enable fully-resolving mode in
the *.thread passes immediately preceeding VRP, we can remove the VRP
threading passes altogether, thus removing 2 threading passes (and
forward threading passes at that!).

The numbers look really good.  We get 6874 more jump threading passes
over my boostrap .ii files for a total 3.74% increase.  And we get that
while running marginally faster (0.19% faster, so noise).

The details are:

*** Mainline (with the loop rotation patch):
          ethread:64722
	  dom:31246
	  thread:73709
	  vrp-thread:14357
	  total:  184034

*** Removing all the VRP threaders.
         ethread:64722
         thread-full:76493
         dom:33648
         thread:16045
         total:  190908

Notice that not only do we get a lot more threads in thread-full
(resolving mode), but even DOM can get more jump threads.

This doesn't come without risks though.  The main issue is that we would
be removing one engine (forward threader), with another one (backward
threader).  But the good news is that (a) we've been using the new
backward threader for a while now (b) even the VRP threader in
mainline is using the backward threader solver.  So, all that would
really be changing would be the path discovery bits and custom copier
in the forward threader, with the backward threader bit and the
generic copier.

I personally don't think this is a big risk, because we've done all
the hard work already and it's all being stressed in one way or another.

The untested patch below is all that would need to happen, albeit with
copius changes to tests.

I'd like to see where we all stand on this before I start chugging away
at testing and other time consuming tasks.

Note, that all the relevant bits will still be tested in this release,
so I'm not gonna cry one way or another.  But it'd be nice to start
reducing passes, especially if we get a 3.74% increase in jump threads
for no time penalty.

Finally, even if we all agree, I think we should give us a week after the
loop rotation restrictions go in, because threading changes always cause
a party of unexpected things to happen.

Shoot!

diff --git a/gcc/passes.def b/gcc/passes.def
index c11c237f6d2..96fc230e780 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -210,9 +210,8 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_return_slot);
       NEXT_PASS (pass_fre, true /* may_iterate */);
       NEXT_PASS (pass_merge_phi);
-      NEXT_PASS (pass_thread_jumps);
+      NEXT_PASS (pass_thread_jumps_full);
       NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
-      NEXT_PASS (pass_vrp_threader);
       NEXT_PASS (pass_dse);
       NEXT_PASS (pass_dce);
       /* pass_stdarg is always run and at this point we execute
@@ -336,9 +335,9 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_thread_jumps);
       NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
       NEXT_PASS (pass_strlen);
-      NEXT_PASS (pass_thread_jumps);
+      NEXT_PASS (pass_thread_jumps_full);
       NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
-      NEXT_PASS (pass_vrp_threader);
+      /* ?? Is this still needed.  ?? */
       /* Threading can leave many const/copy propagations in the IL.
 	 Clean them up.  Instead of just copy_prop, we use ccp to
 	 compute alignment and nonzero bits.  */

gcc/ChangeLog:

	* passes.def: Replace pass_thread_jumps that happen before VRP
	with pass_thread_jumps_full.  Remove pass_vrp_threader passes.
---
 gcc/passes.def | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index c11c237f6d2..96fc230e780 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -210,9 +210,8 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_return_slot);
       NEXT_PASS (pass_fre, true /* may_iterate */);
       NEXT_PASS (pass_merge_phi);
-      NEXT_PASS (pass_thread_jumps);
+      NEXT_PASS (pass_thread_jumps_full);
       NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
-      NEXT_PASS (pass_vrp_threader);
       NEXT_PASS (pass_dse);
       NEXT_PASS (pass_dce);
       /* pass_stdarg is always run and at this point we execute
@@ -336,9 +335,9 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_thread_jumps);
       NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
       NEXT_PASS (pass_strlen);
-      NEXT_PASS (pass_thread_jumps);
+      NEXT_PASS (pass_thread_jumps_full);
       NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
-      NEXT_PASS (pass_vrp_threader);
+      /* ?? Is this still needed.  ?? */
       /* Threading can leave many const/copy propagations in the IL.
 	 Clean them up.  Instead of just copy_prop, we use ccp to
 	 compute alignment and nonzero bits.  */
-- 
2.31.1

next             reply	other threads:[~2021-10-18 13:41 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-18 13:41 Aldy Hernandez [this message]
2021-10-18 14:03 ` Aldy Hernandez
2021-10-19  6:52   ` Richard Biener
2021-10-19  7:33     ` Aldy Hernandez
2021-10-19  8:40       ` Richard Biener
2021-10-19  9:06         ` Aldy Hernandez
2021-10-19  9:16           ` Aldy Hernandez
2021-10-19 23:06       ` Jeff Law
2021-10-19 23:00   ` Jeff Law
2021-10-20  9:27     ` Aldy Hernandez
2021-10-20 12:32       ` Andrew MacLeod
2021-10-20 13:08         ` Aldy Hernandez
2021-10-20 17:55       ` Jeff Law
2021-10-19 22:58 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211018134117.410106-1-aldyh@redhat.com \
    --to=aldyh@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).