public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC] More jump threading restrictions in the presence of loops.
@ 2021-10-04  9:43 Aldy Hernandez
  2021-10-04 13:31 ` Jeff Law
  2021-10-06 10:22 ` Aldy Hernandez
  0 siblings, 2 replies; 27+ messages in thread
From: Aldy Hernandez @ 2021-10-04  9:43 UTC (permalink / raw)
  To: Jeff Law, Richard Biener, matz
  Cc: Andrew MacLeod, GCC patches, Aldy Hernandez

It is frustrating that virtually all the regressions with the hybrid
threader for VRP, have not been with the engine itself, but with the
independent restrictions we have agreed upon.

The following patch is a collection of discussions with Richi, Jeff,
and Michael Matz regarding jump threading limitations in the presence
of loops, that I hope can lead to further refinements.

As I have mentioned before, most of our threading tests are too
fragile, so in this patch I have distilled various restrictions into
gimple FE tests that I hope can help in maintaining the threader going
forward.  The goal is to have one test with no valid threads
whatsover, and one with exclusively one valid thread per function.
This should make it trivial to maintain this going forward.

I would like to request the relevant experts to not only examine the
patch, but review the tests in this patch, to make sure we agree upon
these restrictions.  I have distilled the smallest possible test for
each restriction and have annotated said tests to make reviewing easy.

Note that the test in ssa-thread-valid.c is a thread that Jeff has
suggested should be an exception to the path crossing loops
restriction, but I have not implemented it yet, because even if we
could loosen the restriction, it would violate Richi's restriction of
a path that rotates a loop.  Comments are highly welcome.

By the way, these restrictions trigger *a lot*.  We seem to be
rotating loops left and right.  So I wouldn't be surprised if this
requires (as usual) a lot of test tweaking.

Untested patch follows.

p.s. Note that I'm just facilitating the discussion.  I'm highly
dependent on the loop experts here ;-).

gcc/ChangeLog:

	* tree-ssa-threadupdate.c
	(jt_path_registry::cancel_invalid_paths): Tweak.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-thread-invalid.c: New test.
	* gcc.dg/tree-ssa/ssa-thread-valid.c: New test.
---
 .../gcc.dg/tree-ssa/ssa-thread-invalid.c      | 102 ++++++++++++++++++
 .../gcc.dg/tree-ssa/ssa-thread-valid.c        |  57 ++++++++++
 gcc/tree-ssa-threadupdate.c                   |  29 ++++-
 3 files changed, 186 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c
new file mode 100644
index 00000000000..bd56a62a4b4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c
@@ -0,0 +1,102 @@
+// { dg-do compile }
+// { dg-options "-O2 -fgimple -fdump-statistics" }
+//
+// This is a collection of seemingly threadble paths that should not be allowed.
+
+void foobar (int);
+
+// Possible thread from 2->4->3, but it would rotate the loop.
+void __GIMPLE (ssa)
+f1 ()
+{
+  int i;
+
+  // Pre-header.
+  __BB(2):
+  goto __BB4;
+
+  // Latch.
+  __BB(3):
+  foobar (i_1);
+  i_5 = i_1 + 1;
+  goto __BB4;
+
+  __BB(4,loop_header(1)):
+  i_1 = __PHI (__BB2: 0, __BB3: i_5);
+  if (i_1 != 101)
+    goto __BB3;
+  else
+    goto __BB5;
+
+  __BB(5):
+  return;
+
+}
+
+// Possible thread from 2->3->5 but threading through the empty latch
+// would create a non-empty latch.
+void __GIMPLE (ssa)
+f2 ()
+{
+  int i;
+
+  // Pre-header.
+  __BB(2):
+  goto __BB3;
+
+  __BB(3,loop_header(1)):
+  i_8 = __PHI (__BB5: i_5, __BB2: 0);
+  foobar (i_8);
+  i_5 = i_8 + 1;
+  if (i_5 != 256)
+    goto __BB5;
+  else
+    goto __BB4;
+
+  // Latch.
+  __BB(5):
+  goto __BB3;
+
+  __BB(4):
+  return;
+
+}
+
+// Possible thread from 3->5->6->3 but this would thread through the
+// header but not exit the loop.
+int __GIMPLE (ssa)
+f3 (int a)
+{
+  int i;
+
+  __BB(2):
+  goto __BB6;
+
+  __BB(3):
+  if (i_1 != 0)
+    goto __BB4;
+  else
+    goto __BB5;
+
+  __BB(4):
+  foobar (5);
+  goto __BB5;
+
+  // Latch.
+  __BB(5):
+  i_7 = i_1 + 1;
+  goto __BB6;
+
+  __BB(6,loop_header(1)):
+  i_1 = __PHI (__BB2: 1, __BB5: i_7);
+  if (i_1 <= 99)
+    goto __BB3;
+  else
+    goto __BB7;
+
+  __BB(7):
+  return;
+
+}
+
+// { dg-final { scan-tree-dump-not "Jumps threaded" "statistics" } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c
new file mode 100644
index 00000000000..e10e877d01a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c
@@ -0,0 +1,57 @@
+// { dg-do compile }
+// { dg-options "-O2 -fgimple -fdump-statistics" }
+//
+// This is a collection of threadable paths.  To simplify maintenance,
+// there should only be one threadable path per function.
+
+int global;
+
+// ** FIXME: This currently fails**
+//
+// There is a path crossing loops through 3->4->5.
+//
+// Jeff has stipulated that...
+//
+// This might be an exception since this goes from inside the loop to
+// outside the loop without entering another loop.  That is, we have
+// found an early exit from the loop that doesn't require us to go
+// through the latch.  In fact, the whole threaded path is logically
+// outside the loop.  So the primary effect is to reduce the number of
+// branches necessary when we exit the loop via this path.
+//
+// This has nice secondary effects.  For example, objects on the
+// threaded path will no longer necessarily be live throughout the
+// loop -- so we can get register allocation improvements.  The
+// threaded path can physically move outside the loop resulting in
+// better icache efficiency, etc.
+//
+// The question we'd have to answer is whether or not exposing this
+// alternate exit path mucks up other loop optimizations, but if we
+// restrict to after the loop optimizer is done, then that's a
+// non-issue.
+int __GIMPLE (ssa)
+f1 (int x)
+{
+  int a;
+
+  __BB(2):
+  a_4 = ~x_3(D);
+  goto __BB4;
+
+  // Latch.
+  __BB(3):
+  global = a_1;
+  goto __BB4;
+
+  __BB(4,loop_header(1)):
+  a_1 = __PHI (__BB2: a_4, __BB3: 0);
+  if (a_1 != 0)
+    goto __BB3;
+  else
+    goto __BB5;
+
+  __BB(5):
+  return;
+}
+
+// { dg-final { scan-tree-dump "Jumps threaded: 1" "statistics" { xfail *-*-* } } }
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index dcabfdb30d2..414ebabe895 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2766,10 +2766,12 @@ bool
 jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path)
 {
   gcc_checking_assert (!path.is_empty ());
-  edge taken_edge = path[path.length () - 1]->e;
-  loop_p loop = taken_edge->src->loop_father;
+  edge entry = path[0]->e;
+  edge exit = path[path.length () - 1]->e;
+  loop_p loop = exit->src->loop_father;
   bool seen_latch = false;
   bool path_crosses_loops = false;
+  bool path_crosses_loop_header = false;
 
   for (unsigned int i = 0; i < path.length (); i++)
     {
@@ -2793,6 +2795,14 @@ jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path)
 	  || e->dest->loop_father != loop)
 	path_crosses_loops = true;
 
+      // ?? Avoid threading through loop headers that remain in the
+      // loop, as such threadings tend to create sub-loops which
+      // _might_ be OK ??.
+      if (e->dest->loop_father->header == e->dest
+	  && !flow_loop_nested_p (exit->dest->loop_father,
+				  e->dest->loop_father))
+	path_crosses_loop_header = true;
+
       if (flag_checking && !m_backedge_threads)
 	gcc_assert ((path[i]->e->flags & EDGE_DFS_BACK) == 0);
     }
@@ -2811,6 +2821,21 @@ jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path)
       cancel_thread (&path, "Path crosses loops");
       return true;
     }
+  // The path should either start and end in the same loop or exit the
+  // loop it starts in but never enter a loop.  This also catches
+  // creating irreducible loops, not only rotation.
+  if (entry->src->loop_father != exit->dest->loop_father
+      && !flow_loop_nested_p (exit->src->loop_father,
+			      entry->dest->loop_father))
+    {
+      cancel_thread (&path, "Path rotates loop");
+      return true;
+    }
+  if (path_crosses_loop_header)
+    {
+      cancel_thread (&path, "Path crosses loop header but does not exit it");
+      return true;
+    }
   return false;
 }
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-10-18 10:14 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-04  9:43 [RFC] More jump threading restrictions in the presence of loops Aldy Hernandez
2021-10-04 13:31 ` Jeff Law
2021-10-04 13:36   ` Aldy Hernandez
2021-10-04 15:30     ` Jeff Law
2021-10-04 16:29       ` Michael Matz
2021-10-05 11:22         ` Richard Biener
2021-10-05 12:43           ` Michael Matz
2021-10-05 14:56           ` Jeff Law
2021-10-05 13:33         ` Aldy Hernandez
2021-10-05 15:10           ` Jeff Law
2021-10-05 16:08           ` Jeff Law
2021-10-05 16:22             ` Aldy Hernandez
2021-10-06 13:15           ` Andreas Schwab
2021-10-06 13:47             ` Aldy Hernandez
2021-10-06 15:03               ` Martin Sebor
2021-10-06 16:22                 ` Aldy Hernandez
2021-10-06 17:03                   ` Aldy Hernandez
2021-10-06 19:11                     ` Martin Sebor
2021-10-06 23:00                   ` Jeff Law
2021-10-06 23:06               ` Jeff Law
2021-10-07  8:15                 ` Aldy Hernandez
2021-10-07 13:35                   ` Jeff Law
2021-10-06 10:22 ` Aldy Hernandez
2021-10-14  9:25   ` Aldy Hernandez
2021-10-14 14:23     ` Jeff Law
2021-10-17  1:32   ` Jeff Law
2021-10-18 10:14     ` Aldy Hernandez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).