public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Allow loop header copying when first iteration condition is known.
@ 2021-11-10 18:20 Aldy Hernandez
  2021-11-10 20:42 ` Jeff Law
  0 siblings, 1 reply; 5+ messages in thread
From: Aldy Hernandez @ 2021-11-10 18:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC patches, Andrew MacLeod, Aldy Hernandez

As discussed in the PR, the loop header copying pass avoids doing so
when optimizing for size.  However, sometimes we can determine the
loop entry conditional statically for the first iteration of the loop.

This patch uses the path solver to determine the outgoing edge
out of preheader->header->xx.  If so, it allows header copying.  Doing
this in the loop optimizer saves us from doing gymnastics in the
threader which doesn't have the context to determine if a loop
transformation is profitable.

I am only returning true in entry_loop_condition_is_static for
a true conditional.  Technically a false conditional is also
provably static, but allowing any boolean value causes a regression
in gfortran.dg/vector_subscript_1.f90.

I would have preferred not passing around the query object, but the
layout of pass_ch and should_duplicate_loop_header_p make it a bit
awkward to get it right without an outright refactor to the
pass.

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

	PR tree-optimization/102906
	* tree-ssa-loop-ch.c (entry_loop_condition_is_static): New.
	(should_duplicate_loop_header_p): Call entry_loop_condition_is_static.
	(class ch_base): Add m_ranger and m_query.
	(ch_base::copy_headers): Pass m_query to
	entry_loop_condition_is_static.
	(pass_ch::execute): Allocate and deallocate m_ranger and
	m_query.
	(pass_ch_vect::execute): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr102906.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr102906.c | 17 ++++++++
 gcc/tree-ssa-loop-ch.c                   | 51 ++++++++++++++++++++----
 2 files changed, 60 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102906.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102906.c b/gcc/testsuite/gcc.dg/tree-ssa/pr102906.c
new file mode 100644
index 00000000000..1846f0b6dba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102906.c
@@ -0,0 +1,17 @@
+// { dg-do compile }
+// { dg-options "-Os -fdump-tree-ch-details" }
+
+extern unsigned int foo (int*) __attribute__((pure));
+
+unsigned int
+tr2 (int array[], int n)
+{
+  unsigned int sum = 0;
+  int x;
+  if (n > 0)
+    for (x = 0; x < n; x++)
+      sum += foo (&array[x]);
+  return sum;
+}
+
+// { dg-final { scan-tree-dump-not "Not duplicating.*optimizing for size" "ch2" } }
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index ffb0aa85118..c7d86d751d4 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -35,30 +35,52 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-sccvn.h"
 #include "tree-phinodes.h"
 #include "ssa-iterators.h"
+#include "value-range.h"
+#include "gimple-range.h"
+#include "gimple-range-path.h"
 
 /* Duplicates headers of loops if they are small enough, so that the statements
    in the loop body are always executed when the loop is entered.  This
    increases effectiveness of code motion optimizations, and reduces the need
    for loop preconditioning.  */
 
+/* Return true if the condition on the first iteration of the loop can
+   be statically determined.  */
+
+static bool
+entry_loop_condition_is_static (class loop *l, path_range_query *query)
+{
+  edge e = loop_preheader_edge (l);
+  gcond *last = safe_dyn_cast <gcond *> (last_stmt (e->dest));
+
+  if (!last
+      || !irange::supports_type_p (TREE_TYPE (gimple_cond_lhs (last))))
+    return false;
+
+  int_range<2> r;
+  query->compute_ranges (e);
+  query->range_of_stmt (r, last);
+  return r == int_range<2> (boolean_true_node, boolean_true_node);
+}
+
 /* Check whether we should duplicate HEADER of LOOP.  At most *LIMIT
    instructions should be duplicated, limit is decreased by the actual
    amount.  */
 
 static bool
 should_duplicate_loop_header_p (basic_block header, class loop *loop,
-				int *limit)
+				int *limit, path_range_query *query)
 {
   gimple_stmt_iterator bsi;
 
   gcc_assert (!header->aux);
 
-  /* Loop header copying usually increases size of the code.  This used not to
-     be true, since quite often it is possible to verify that the condition is
-     satisfied in the first iteration and therefore to eliminate it.  Jump
-     threading handles these cases now.  */
+  /* Avoid loop header copying when optimizing for size unless we can
+     determine that the loop condition is static in the first
+     iteration.  */
   if (optimize_loop_for_size_p (loop)
-      && !loop->force_vectorize)
+      && !loop->force_vectorize
+      && !entry_loop_condition_is_static (loop, query))
     {
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file,
@@ -267,6 +289,9 @@ class ch_base : public gimple_opt_pass
 
   /* Return true to copy headers of LOOP or false to skip.  */
   virtual bool process_loop_p (class loop *loop) = 0;
+
+  gimple_ranger *m_ranger = NULL;
+  path_range_query *m_query = NULL;
 };
 
 const pass_data pass_data_ch =
@@ -389,7 +414,8 @@ ch_base::copy_headers (function *fun)
 
       exit = NULL;
       n_bbs = 0;
-      while (should_duplicate_loop_header_p (header, loop, &remaining_limit))
+      while (should_duplicate_loop_header_p (header, loop, &remaining_limit,
+					     m_query))
 	{
 	  /* Find a successor of header that is inside a loop; i.e. the new
 	     header after the condition is copied.  */
@@ -526,9 +552,13 @@ pass_ch::execute (function *fun)
   loop_optimizer_init (LOOPS_HAVE_PREHEADERS
 		       | LOOPS_HAVE_SIMPLE_LATCHES
 		       | LOOPS_HAVE_RECORDED_EXITS);
+  m_ranger = new gimple_ranger;
+  m_query = new path_range_query (*m_ranger, /*resolve=*/true);
 
   unsigned int res = copy_headers (fun);
 
+  delete m_query;
+  delete m_ranger;
   loop_optimizer_finalize ();
   return res;
 }
@@ -540,7 +570,12 @@ pass_ch::execute (function *fun)
 unsigned int
 pass_ch_vect::execute (function *fun)
 {
-  return copy_headers (fun);
+  m_ranger = new gimple_ranger;
+  m_query = new path_range_query (*m_ranger, /*resolve=*/true);
+  unsigned int res = copy_headers (fun);
+  delete m_query;
+  delete m_ranger;
+  return res;
 }
 
 /* Apply header copying according to a very simple test of do-while shape.  */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Allow loop header copying when first iteration condition is known.
  2021-11-10 18:20 [PATCH] Allow loop header copying when first iteration condition is known Aldy Hernandez
@ 2021-11-10 20:42 ` Jeff Law
  2021-11-11  7:30   ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Law @ 2021-11-10 20:42 UTC (permalink / raw)
  To: Aldy Hernandez, Richard Biener; +Cc: GCC patches



On 11/10/2021 11:20 AM, Aldy Hernandez via Gcc-patches wrote:
> As discussed in the PR, the loop header copying pass avoids doing so
> when optimizing for size.  However, sometimes we can determine the
> loop entry conditional statically for the first iteration of the loop.
>
> This patch uses the path solver to determine the outgoing edge
> out of preheader->header->xx.  If so, it allows header copying.  Doing
> this in the loop optimizer saves us from doing gymnastics in the
> threader which doesn't have the context to determine if a loop
> transformation is profitable.
>
> I am only returning true in entry_loop_condition_is_static for
> a true conditional.  Technically a false conditional is also
> provably static, but allowing any boolean value causes a regression
> in gfortran.dg/vector_subscript_1.f90.
>
> I would have preferred not passing around the query object, but the
> layout of pass_ch and should_duplicate_loop_header_p make it a bit
> awkward to get it right without an outright refactor to the
> pass.
>
> Tested on x86-64 Linux.
>
> OK?
>
> gcc/ChangeLog:
>
> 	PR tree-optimization/102906
> 	* tree-ssa-loop-ch.c (entry_loop_condition_is_static): New.
> 	(should_duplicate_loop_header_p): Call entry_loop_condition_is_static.
> 	(class ch_base): Add m_ranger and m_query.
> 	(ch_base::copy_headers): Pass m_query to
> 	entry_loop_condition_is_static.
> 	(pass_ch::execute): Allocate and deallocate m_ranger and
> 	m_query.
> 	(pass_ch_vect::execute): Same.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/tree-ssa/pr102906.c: New test.
OK.  It also makes a nice little example of how to use a Ranger within 
an existing pass.

Jeff


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Allow loop header copying when first iteration condition is known.
  2021-11-10 20:42 ` Jeff Law
@ 2021-11-11  7:30   ` Richard Biener
  2021-11-11 10:33     ` Aldy Hernandez
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2021-11-11  7:30 UTC (permalink / raw)
  To: Jeff Law; +Cc: Aldy Hernandez, GCC patches

On Wed, Nov 10, 2021 at 9:42 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 11/10/2021 11:20 AM, Aldy Hernandez via Gcc-patches wrote:
> > As discussed in the PR, the loop header copying pass avoids doing so
> > when optimizing for size.  However, sometimes we can determine the
> > loop entry conditional statically for the first iteration of the loop.
> >
> > This patch uses the path solver to determine the outgoing edge
> > out of preheader->header->xx.  If so, it allows header copying.  Doing
> > this in the loop optimizer saves us from doing gymnastics in the
> > threader which doesn't have the context to determine if a loop
> > transformation is profitable.
> >
> > I am only returning true in entry_loop_condition_is_static for
> > a true conditional.  Technically a false conditional is also
> > provably static, but allowing any boolean value causes a regression
> > in gfortran.dg/vector_subscript_1.f90.
> >
> > I would have preferred not passing around the query object, but the
> > layout of pass_ch and should_duplicate_loop_header_p make it a bit
> > awkward to get it right without an outright refactor to the
> > pass.
> >
> > Tested on x86-64 Linux.
> >
> > OK?
> >
> > gcc/ChangeLog:
> >
> >       PR tree-optimization/102906
> >       * tree-ssa-loop-ch.c (entry_loop_condition_is_static): New.
> >       (should_duplicate_loop_header_p): Call entry_loop_condition_is_static.
> >       (class ch_base): Add m_ranger and m_query.
> >       (ch_base::copy_headers): Pass m_query to
> >       entry_loop_condition_is_static.
> >       (pass_ch::execute): Allocate and deallocate m_ranger and
> >       m_query.
> >       (pass_ch_vect::execute): Same.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.dg/tree-ssa/pr102906.c: New test.
> OK.  It also makes a nice little example of how to use a Ranger within
> an existing pass.

Note if you just test for the condition to be true it will only catch 50%
of the desired cases since we have no idea whether the 'true' edge
is the edge existing the loop or the edge remaining in the loop.
For loop header copying we like to resolve statically to the edge
remaining in the loop, so you want

extract_true_false_edges_from_block (gimple_bb (last), &true_e, &false_e);

/* If neither edge is the exit edge this is not a case we'd like to
   special-case.  */
if (!loop_exit_edge_p (l, true_e) && !loop_exit_edge_p (l, false_e))
 return false;

tree desired_static_value;
if (loop_exit_edge_p (l, true_e))
 desired_static_value = boolean_false_node;
else
  desired_static_value = boolean_true_node;

and test for desired_static_value.

Richard.

> Jeff
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Allow loop header copying when first iteration condition is known.
  2021-11-11  7:30   ` Richard Biener
@ 2021-11-11 10:33     ` Aldy Hernandez
  2021-11-11 10:58       ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Aldy Hernandez @ 2021-11-11 10:33 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, GCC patches

[-- Attachment #1: Type: text/plain, Size: 3172 bytes --]

On Thu, Nov 11, 2021 at 8:30 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Nov 10, 2021 at 9:42 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
> >
> >
> >
> > On 11/10/2021 11:20 AM, Aldy Hernandez via Gcc-patches wrote:
> > > As discussed in the PR, the loop header copying pass avoids doing so
> > > when optimizing for size.  However, sometimes we can determine the
> > > loop entry conditional statically for the first iteration of the loop.
> > >
> > > This patch uses the path solver to determine the outgoing edge
> > > out of preheader->header->xx.  If so, it allows header copying.  Doing
> > > this in the loop optimizer saves us from doing gymnastics in the
> > > threader which doesn't have the context to determine if a loop
> > > transformation is profitable.
> > >
> > > I am only returning true in entry_loop_condition_is_static for
> > > a true conditional.  Technically a false conditional is also
> > > provably static, but allowing any boolean value causes a regression
> > > in gfortran.dg/vector_subscript_1.f90.
> > >
> > > I would have preferred not passing around the query object, but the
> > > layout of pass_ch and should_duplicate_loop_header_p make it a bit
> > > awkward to get it right without an outright refactor to the
> > > pass.
> > >
> > > Tested on x86-64 Linux.
> > >
> > > OK?
> > >
> > > gcc/ChangeLog:
> > >
> > >       PR tree-optimization/102906
> > >       * tree-ssa-loop-ch.c (entry_loop_condition_is_static): New.
> > >       (should_duplicate_loop_header_p): Call entry_loop_condition_is_static.
> > >       (class ch_base): Add m_ranger and m_query.
> > >       (ch_base::copy_headers): Pass m_query to
> > >       entry_loop_condition_is_static.
> > >       (pass_ch::execute): Allocate and deallocate m_ranger and
> > >       m_query.
> > >       (pass_ch_vect::execute): Same.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >       * gcc.dg/tree-ssa/pr102906.c: New test.
> > OK.  It also makes a nice little example of how to use a Ranger within
> > an existing pass.
>
> Note if you just test for the condition to be true it will only catch 50%
> of the desired cases since we have no idea whether the 'true' edge
> is the edge existing the loop or the edge remaining in the loop.
> For loop header copying we like to resolve statically to the edge
> remaining in the loop, so you want

Ahh, I figured there was some block shuffling needed.

I was cautious not to touch much because of the
gfortran.dg/vector_subscript_1.f90 regression, but now I see that the
test fails for all optimization levels except -Os.  With this fix we
properly fail for all levels.  I assume this is expected ;-).

>
> extract_true_false_edges_from_block (gimple_bb (last), &true_e, &false_e);
>
> /* If neither edge is the exit edge this is not a case we'd like to
>    special-case.  */
> if (!loop_exit_edge_p (l, true_e) && !loop_exit_edge_p (l, false_e))
>  return false;
>
> tree desired_static_value;
> if (loop_exit_edge_p (l, true_e))
>  desired_static_value = boolean_false_node;
> else
>   desired_static_value = boolean_true_node;
>
> and test for desired_static_value.

Thanks for the code!

OK pending tests?

[-- Attachment #2: 0001-Resolve-entry-loop-condition-for-the-edge-remaining-.patch --]
[-- Type: text/x-patch, Size: 1835 bytes --]

From 9609cff278d3ddea9f74b805b395d5c0293a126c Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <aldyh@redhat.com>
Date: Thu, 11 Nov 2021 11:27:07 +0100
Subject: [PATCH] Resolve entry loop condition for the edge remaining in the
 loop.

There is a known failure for gfortran.dg/vector_subscript_1.f90.  It
was previously failing for all optimization levels except -Os.
Getting the loop header copying right, now makes it fail for all
levels :-).

Co-authored-by: Richard Biener <rguenther@suse.de>

gcc/ChangeLog:

	* tree-ssa-loop-ch.c (entry_loop_condition_is_static): Resolve
	statically to the edge remaining in the loop.
---
 gcc/tree-ssa-loop-ch.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index c7d86d751d4..af3401f112c 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -57,10 +57,24 @@ entry_loop_condition_is_static (class loop *l, path_range_query *query)
       || !irange::supports_type_p (TREE_TYPE (gimple_cond_lhs (last))))
     return false;
 
+  edge true_e, false_e;
+  extract_true_false_edges_from_block (e->dest, &true_e, &false_e);
+
+  /* If neither edge is the exit edge, this is not a case we'd like to
+     special-case.  */
+  if (!loop_exit_edge_p (l, true_e) && !loop_exit_edge_p (l, false_e))
+    return false;
+
+  tree desired_static_value;
+  if (loop_exit_edge_p (l, true_e))
+    desired_static_value = boolean_false_node;
+  else
+    desired_static_value = boolean_true_node;
+
   int_range<2> r;
   query->compute_ranges (e);
   query->range_of_stmt (r, last);
-  return r == int_range<2> (boolean_true_node, boolean_true_node);
+  return r == int_range<2> (desired_static_value, desired_static_value);
 }
 
 /* Check whether we should duplicate HEADER of LOOP.  At most *LIMIT
-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Allow loop header copying when first iteration condition is known.
  2021-11-11 10:33     ` Aldy Hernandez
@ 2021-11-11 10:58       ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2021-11-11 10:58 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: Jeff Law, GCC patches

On Thu, Nov 11, 2021 at 11:33 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>
> On Thu, Nov 11, 2021 at 8:30 AM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Nov 10, 2021 at 9:42 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
> > >
> > >
> > >
> > > On 11/10/2021 11:20 AM, Aldy Hernandez via Gcc-patches wrote:
> > > > As discussed in the PR, the loop header copying pass avoids doing so
> > > > when optimizing for size.  However, sometimes we can determine the
> > > > loop entry conditional statically for the first iteration of the loop.
> > > >
> > > > This patch uses the path solver to determine the outgoing edge
> > > > out of preheader->header->xx.  If so, it allows header copying.  Doing
> > > > this in the loop optimizer saves us from doing gymnastics in the
> > > > threader which doesn't have the context to determine if a loop
> > > > transformation is profitable.
> > > >
> > > > I am only returning true in entry_loop_condition_is_static for
> > > > a true conditional.  Technically a false conditional is also
> > > > provably static, but allowing any boolean value causes a regression
> > > > in gfortran.dg/vector_subscript_1.f90.
> > > >
> > > > I would have preferred not passing around the query object, but the
> > > > layout of pass_ch and should_duplicate_loop_header_p make it a bit
> > > > awkward to get it right without an outright refactor to the
> > > > pass.
> > > >
> > > > Tested on x86-64 Linux.
> > > >
> > > > OK?
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >       PR tree-optimization/102906
> > > >       * tree-ssa-loop-ch.c (entry_loop_condition_is_static): New.
> > > >       (should_duplicate_loop_header_p): Call entry_loop_condition_is_static.
> > > >       (class ch_base): Add m_ranger and m_query.
> > > >       (ch_base::copy_headers): Pass m_query to
> > > >       entry_loop_condition_is_static.
> > > >       (pass_ch::execute): Allocate and deallocate m_ranger and
> > > >       m_query.
> > > >       (pass_ch_vect::execute): Same.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >       * gcc.dg/tree-ssa/pr102906.c: New test.
> > > OK.  It also makes a nice little example of how to use a Ranger within
> > > an existing pass.
> >
> > Note if you just test for the condition to be true it will only catch 50%
> > of the desired cases since we have no idea whether the 'true' edge
> > is the edge existing the loop or the edge remaining in the loop.
> > For loop header copying we like to resolve statically to the edge
> > remaining in the loop, so you want
>
> Ahh, I figured there was some block shuffling needed.
>
> I was cautious not to touch much because of the
> gfortran.dg/vector_subscript_1.f90 regression, but now I see that the
> test fails for all optimization levels except -Os.  With this fix we
> properly fail for all levels.  I assume this is expected ;-).
>
> >
> > extract_true_false_edges_from_block (gimple_bb (last), &true_e, &false_e);
> >
> > /* If neither edge is the exit edge this is not a case we'd like to
> >    special-case.  */
> > if (!loop_exit_edge_p (l, true_e) && !loop_exit_edge_p (l, false_e))
> >  return false;
> >
> > tree desired_static_value;
> > if (loop_exit_edge_p (l, true_e))
> >  desired_static_value = boolean_false_node;
> > else
> >   desired_static_value = boolean_true_node;
> >
> > and test for desired_static_value.
>
> Thanks for the code!
>
> OK pending tests?

OK, thanks!
Richard.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-11-11 10:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-10 18:20 [PATCH] Allow loop header copying when first iteration condition is known Aldy Hernandez
2021-11-10 20:42 ` Jeff Law
2021-11-11  7:30   ` Richard Biener
2021-11-11 10:33     ` Aldy Hernandez
2021-11-11 10:58       ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).