[gcc r12-3430] Improve LIM fill_always_executed

public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed

* [gcc r12-3430] Improve LIM fill_always_executed_in computation
@ 2021-09-09 10:56 Richard Biener
  0 siblings, 0 replies; only message in thread
From: Richard Biener @ 2021-09-09 10:56 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:013cfc648405a8a118d07436f103e4d70224fe00

commit r12-3430-g013cfc648405a8a118d07436f103e4d70224fe00
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Sep 9 11:50:20 2021 +0200

    Improve LIM fill_always_executed_in computation
    
    Currently the DOM walk over a loop body does not walk into not
    always executed subloops to avoid scalability issues since doing
    so makes the walk quadratic in the loop depth.  It turns out this
    is not an issue in practice and even with a loop depth of 1800
    this function is way off the radar.
    
    So the following patch removes the limitation, replacing it with
    a comment.
    
    2021-09-09  Richard Biener  <rguenther@suse.de>
    
            * tree-ssa-loop-im.c (fill_always_executed_in_1): Walk
            into all subloops.
    
            * gcc.dg/tree-ssa/ssa-lim-17.c: New testcase.

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c | 20 ++++++++++++++++++++
 gcc/tree-ssa-loop-im.c                     | 16 +++++++---------
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c
new file mode 100644
index 00000000000..1c840e554b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-lim2-details" } */
+
+volatile int flag, bar;
+double foo (double *valp)
+{
+  double sum = 0;
+  for (int i = 0; i < 256; ++i)
+    {
+      if (flag)
+	for (int j = 0; j < 256; ++j)
+	  bar = flag;
+      if (flag)
+        sum += 1.;
+      sum += *valp; // we should move the load of *valp out of the loop
+    }
+  return sum;
+}
+
+/* { dg-final { scan-tree-dump-times "Moving statement" 1 "lim2" } } */
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 5d6845478e7..4b187c2cdaf 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3074,15 +3074,13 @@ fill_always_executed_in_1 (class loop *loop, sbitmap contains_call)
 	    break;
 
 	  if (bb->loop_father->header == bb)
-	    {
-	      if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
-		break;
-
-	      /* In a loop that is always entered we may proceed anyway.
-		 But record that we entered it and stop once we leave it
-		 since it might not be finite.  */
-	      inn_loop = bb->loop_father;
-	    }
+	    /* Record that we enter into a subloop since it might not
+	       be finite.  */
+	    /* ???  Entering into a not always executed subloop makes
+	       fill_always_executed_in quadratic in loop depth since
+	       we walk those loops N times.  This is not a problem
+	       in practice though, see PR102253 for a worst-case testcase.  */
+	    inn_loop = bb->loop_father;
 
 	  /* Walk the body of LOOP sorted by dominance relation.  Additionally,
 	     if a basic block S dominates the latch, then only blocks dominated


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2021-09-09 10:56 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-09 10:56 [gcc r12-3430] Improve LIM fill_always_executed_in computation Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).