public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
@ 2019-05-17  4:17 Feng Xue OS
  2019-05-17 16:47 ` Jeff Law
  2020-04-01 13:36 ` [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++ Richard Biener
  0 siblings, 2 replies; 45+ messages in thread
From: Feng Xue OS @ 2019-05-17  4:17 UTC (permalink / raw)
  To: gcc-patches; +Cc: Feng Xue OS

This patch is meant to give user a way to optimize away those empty loops which are impossible to be recognized by compiler, such as C++ STL container-based loop,

    void f (std::map<int, int> &m)
    {
        for (auto it = m.begin (); it != m.end (); ++it);
    }
 
An option "-ffinite-loop" is added to tell compiler about finiteness of loops so that compiler can apply the optimization.

Feng

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d8bed3a..c55f2e6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2019-05-16  Feng Xue  <fxue@os.amperecomputing.com>
+
+        PR tree-optimization/89713
+        * doc/invoke.texi (-ffinite-loop): Document new option.
+        * common.opt (-ffinite-loop): New option.
+        * passes.def: Replace pass_cd_dce with pass_cd_dce2.
+        * tree-pass.h (pass_cd_dce2): New declaration.
+        * tree-ssa-dce.c (loop_has_true_exits): New function.
+        (find_obviously_necessary_stmts): Add aggressive_loop_removal
+        parameter.
+        (perform_tree_ssa_dce, tree_ssa_cd_dce): Likewise.
+        (class pass_cd_dce): Add new member aggressive_loop_removal.
+        (pass_cd_dce::pass_cd_dce: Add alr parameter.
+        (make_pass_cd_dce2): New function.
+
 2019-05-16  Jakub Jelinek  <jakub@redhat.com>
 
 	PR c++/90484
diff --git a/gcc/common.opt b/gcc/common.opt
index d342c4f..e98a34d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization
+Assume loops are finite if can not be analytically determined.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5e3e887..9a3882c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -9501,6 +9502,20 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Allow the compiler to assume that if finiteness of a loop can not be
+analytically determined, the loop must be finite. With the assumption, some
+aggressive transformation could be possible, such as removal of this kind
+of empty loop by dead code elimination (DCE).
+
+This option is not turned on by any @option{-O} option since it might result
+in incorrect behaviour for programs that contain seemly finite, but actually
+infinite loop.
+
+The default is @option{-fno-finite-loop}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/passes.def b/gcc/passes.def
index ad2efab..b84ee34 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -322,7 +322,7 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_copy_prop);
       NEXT_PASS (pass_warn_restrict);
       NEXT_PASS (pass_dse);
-      NEXT_PASS (pass_cd_dce);
+      NEXT_PASS (pass_cd_dce2);
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt, false /* early_p */);
       NEXT_PASS (pass_fold_builtins);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 3a0b380..2392bc5 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -395,6 +395,7 @@ extern gimple_opt_pass *make_pass_build_ealias (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_dominator (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_dce (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_cd_dce (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_cd_dce2 (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_call_cdce (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_merge_phi (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_thread_jumps (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..d4659df 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -356,6 +356,27 @@ mark_control_dependent_edges_necessary (basic_block bb, bool ignore_self)
     bitmap_set_bit (visited_control_parents, bb->index);
 }
 
+/* Check whether a loop has any non-EH exit. */
+
+static bool
+loop_has_true_exits (const struct loop *loop)
+{
+  vec<edge> exits = get_loop_exit_edges (loop);
+  bool found = false;
+  edge e;
+  unsigned i;
+
+  FOR_EACH_VEC_ELT (exits, i, e)
+    if (!(e->flags & EDGE_EH))
+      {
+        found = true;
+        break;
+      }
+
+  exits.release ();
+  return found;
+}
+
 
 /* Find obviously necessary statements.  These are things like most function
    calls, and stores to file level variables.
@@ -365,7 +386,7 @@ mark_control_dependent_edges_necessary (basic_block bb, bool ignore_self)
    dependence analysis.  */
 
 static void
-find_obviously_necessary_stmts (bool aggressive)
+find_obviously_necessary_stmts (bool aggressive, bool aggressive_loop_removal)
 {
   basic_block bb;
   gimple_stmt_iterator gsi;
@@ -417,7 +438,8 @@ find_obviously_necessary_stmts (bool aggressive)
 	  }
 
       FOR_EACH_LOOP (loop, 0)
-	if (!finite_loop_p (loop))
+	if (!finite_loop_p (loop)
+	    && (!aggressive_loop_removal || !loop_has_true_exits (loop)))
 	  {
 	    if (dump_file)
 	      fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num);
@@ -1532,6 +1554,8 @@ tree_dce_done (bool aggressive)
 /* Main routine to eliminate dead code.
 
    AGGRESSIVE controls the aggressiveness of the algorithm.
+   If aggressive mode is on, AGGRESSIVE_LOOP_REMOVAL enables more aggressive
+   removal of empty loop whose finiteness can not be statically determined.
    In conservative mode, we ignore control dependence and simply declare
    all but the most trivially dead branches necessary.  This mode is fast.
    In aggressive mode, control dependences are taken into account, which
@@ -1544,7 +1568,7 @@ tree_dce_done (bool aggressive)
 	  start experimenting with pass ordering.  */
 
 static unsigned int
-perform_tree_ssa_dce (bool aggressive)
+perform_tree_ssa_dce (bool aggressive, bool aggressive_loop_removal = false)
 {
   bool something_changed = 0;
 
@@ -1576,7 +1600,7 @@ perform_tree_ssa_dce (bool aggressive)
       mark_dfs_back_edges ();
     }
 
-  find_obviously_necessary_stmts (aggressive);
+  find_obviously_necessary_stmts (aggressive, aggressive_loop_removal);
 
   if (aggressive && ! in_loop_pipeline)
     {
@@ -1631,9 +1655,10 @@ tree_ssa_dce (void)
 }
 
 static unsigned int
-tree_ssa_cd_dce (void)
+tree_ssa_cd_dce (bool aggressive_loop_removal)
 {
-  return perform_tree_ssa_dce (/*aggressive=*/optimize >= 2);
+  return perform_tree_ssa_dce (/*aggressive=*/optimize >= 2,
+                               aggressive_loop_removal && flag_finite_loop);
 }
 
 namespace {
@@ -1690,15 +1715,20 @@ const pass_data pass_data_cd_dce =
 
 class pass_cd_dce : public gimple_opt_pass
 {
+  bool aggressive_loop_removal;
 public:
-  pass_cd_dce (gcc::context *ctxt)
+  pass_cd_dce (gcc::context *ctxt, bool alr = false)
     : gimple_opt_pass (pass_data_cd_dce, ctxt)
+    , aggressive_loop_removal (alr)
   {}
 
   /* opt_pass methods: */
   opt_pass * clone () { return new pass_cd_dce (m_ctxt); }
   virtual bool gate (function *) { return flag_tree_dce != 0; }
-  virtual unsigned int execute (function *) { return tree_ssa_cd_dce (); }
+  virtual unsigned int execute (function *)
+  {
+    return tree_ssa_cd_dce (aggressive_loop_removal);
+  }
 
 }; // class pass_cd_dce
 
@@ -1710,6 +1740,12 @@ make_pass_cd_dce (gcc::context *ctxt)
   return new pass_cd_dce (ctxt);
 }
 
+gimple_opt_pass *
+make_pass_cd_dce2 (gcc::context *ctxt)
+{
+  return new pass_cd_dce (ctxt, true);
+}
+
 
 /* A cheap DCE interface.  WORKLIST is a list of possibly dead stmts and
    is consumed by this function.  The function has linear complexity in

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-17  4:17 [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713) Feng Xue OS
@ 2019-05-17 16:47 ` Jeff Law
  2019-05-17 18:50   ` Richard Biener
  2020-04-01 13:36 ` [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++ Richard Biener
  1 sibling, 1 reply; 45+ messages in thread
From: Jeff Law @ 2019-05-17 16:47 UTC (permalink / raw)
  To: Feng Xue OS, gcc-patches

On 5/16/19 10:17 PM, Feng Xue OS wrote:
> This patch is meant to give user a way to optimize away those empty loops which are impossible to be recognized by compiler, such as C++ STL container-based loop,
> 
>     void f (std::map<int, int> &m)
>     {
>         for (auto it = m.begin (); it != m.end (); ++it);
>     }
>  
> An option "-ffinite-loop" is added to tell compiler about finiteness of loops so that compiler can apply the optimization.
> 
> Feng
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index d8bed3a..c55f2e6 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2019-05-16  Feng Xue  <fxue@os.amperecomputing.com>
> +
> +        PR tree-optimization/89713
> +        * doc/invoke.texi (-ffinite-loop): Document new option.
> +        * common.opt (-ffinite-loop): New option.
> +        * passes.def: Replace pass_cd_dce with pass_cd_dce2.
> +        * tree-pass.h (pass_cd_dce2): New declaration.
> +        * tree-ssa-dce.c (loop_has_true_exits): New function.
> +        (find_obviously_necessary_stmts): Add aggressive_loop_removal
> +        parameter.
> +        (perform_tree_ssa_dce, tree_ssa_cd_dce): Likewise.
> +        (class pass_cd_dce): Add new member aggressive_loop_removal.
> +        (pass_cd_dce::pass_cd_dce: Add alr parameter.
> +        (make_pass_cd_dce2): New function.
I'm not a big fan of this patch.  I'd rather be looking at either
improving our analysis or adding attributes to the loops to help the
analysis bits prove termination.

Jeff

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-17 16:47 ` Jeff Law
@ 2019-05-17 18:50   ` Richard Biener
  2019-05-18 14:00     ` Marc Glisse
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-05-17 18:50 UTC (permalink / raw)
  To: gcc-patches, Jeff Law, Feng Xue OS, gcc-patches

On May 17, 2019 6:47:00 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 5/16/19 10:17 PM, Feng Xue OS wrote:
>> This patch is meant to give user a way to optimize away those empty
>loops which are impossible to be recognized by compiler, such as C++
>STL container-based loop,
>> 
>>     void f (std::map<int, int> &m)
>>     {
>>         for (auto it = m.begin (); it != m.end (); ++it);
>>     }
>>  
>> An option "-ffinite-loop" is added to tell compiler about finiteness
>of loops so that compiler can apply the optimization.
>> 
>> Feng
>> 
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index d8bed3a..c55f2e6 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,18 @@
>> +2019-05-16  Feng Xue  <fxue@os.amperecomputing.com>
>> +
>> +        PR tree-optimization/89713
>> +        * doc/invoke.texi (-ffinite-loop): Document new option.
>> +        * common.opt (-ffinite-loop): New option.
>> +        * passes.def: Replace pass_cd_dce with pass_cd_dce2.
>> +        * tree-pass.h (pass_cd_dce2): New declaration.
>> +        * tree-ssa-dce.c (loop_has_true_exits): New function.
>> +        (find_obviously_necessary_stmts): Add
>aggressive_loop_removal
>> +        parameter.
>> +        (perform_tree_ssa_dce, tree_ssa_cd_dce): Likewise.
>> +        (class pass_cd_dce): Add new member aggressive_loop_removal.
>> +        (pass_cd_dce::pass_cd_dce: Add alr parameter.
>> +        (make_pass_cd_dce2): New function.
>I'm not a big fan of this patch.  I'd rather be looking at either
>improving our analysis or adding attributes to the loops to help the
>analysis bits prove termination.

And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC. 

Richard. 

>Jeff

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-17 18:50   ` Richard Biener
@ 2019-05-18 14:00     ` Marc Glisse
  2019-05-20  7:50       ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Marc Glisse @ 2019-05-18 14:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jeff Law, Feng Xue OS

(@Feng Xue, it is better to include testcases in your patches)

>> I'm not a big fan of this patch.  I'd rather be looking at either
>> improving our analysis

Better analysis cannot hurt.

>> or adding attributes to the loops to help the analysis bits prove 
>> termination.

There may not be a loop in the source code, it may be a recursive function 
call that gcc turned into a loop. Plus, I know that it applies to all 
loops in my program.

We could have a function to compute the length of a linked list:
struct A { A*p; };
unsigned l(A*a){return a?l(a->p)+1:0;}

and because of other optimizations, it turns out that we do not actually 
use this length
void g(A*a){l(a);}

wouldn't it be nice if gcc could remove all that useless code, instead of 
keeping a useless loop on the off chance that it might be infinite?

> And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.

IIUC that was slightly different: "This option tells the loop optimizer to 
assume that loop indices do not overflow, and that loops with nontrivial 
exit condition are not infinite."

The assumption on indices looks unsafe indeed if it applied to unsigned 
indices in non-empty loops.

But the C++ standard went to the trouble of banning infinite loops without 
side effects specifically to enable DCE on this type of code... Actually, 
an infinite loop with a trivial exit condition looks like a great 
opportunity to use __builtin_unreachable() to me ;-) (I have been wanting 
a -fmain-does-return -fno-funny-business for years, since I looked at 
replacing some malloc with stack allocations, but that's all out of scope 
for this patch)

Why exactly are we trying so hard to preserve no-side-effect, infinite 
loops? What are they good for? Note that reading an atomic or volatile 
variable counts as a side effect for this purpose. It looks like some kind 
of busy waiting, but without checking a flag, so it can only be stopped by 
some external action (say a signal), so if the OS has any notion of sleep 
for a thread, blocking would be better. Or maybe it is running through a 
circular list, ensuring that it stays in RAM? Anyway it seems specific 
enough that that strange code should be the one needing an annotation.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-18 14:00     ` Marc Glisse
@ 2019-05-20  7:50       ` Richard Biener
  2019-05-20  8:27         ` Feng Xue OS
  2019-05-20 13:04         ` [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713) Marc Glisse
  0 siblings, 2 replies; 45+ messages in thread
From: Richard Biener @ 2019-05-20  7:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeff Law, Feng Xue OS

On Sat, May 18, 2019 at 4:00 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> (@Feng Xue, it is better to include testcases in your patches)
>
> >> I'm not a big fan of this patch.  I'd rather be looking at either
> >> improving our analysis
>
> Better analysis cannot hurt.
>
> >> or adding attributes to the loops to help the analysis bits prove
> >> termination.
>
> There may not be a loop in the source code, it may be a recursive function
> call that gcc turned into a loop. Plus, I know that it applies to all
> loops in my program.
>
> We could have a function to compute the length of a linked list:
> struct A { A*p; };
> unsigned l(A*a){return a?l(a->p)+1:0;}
>
> and because of other optimizations, it turns out that we do not actually
> use this length
> void g(A*a){l(a);}
>
> wouldn't it be nice if gcc could remove all that useless code, instead of
> keeping a useless loop on the off chance that it might be infinite?
>
> > And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.
>
> IIUC that was slightly different: "This option tells the loop optimizer to
> assume that loop indices do not overflow, and that loops with nontrivial
> exit condition are not infinite."
>
> The assumption on indices looks unsafe indeed if it applied to unsigned
> indices in non-empty loops.

The question is of couse what a "nontrivial exit condition" is.  Indeed
the general handling of unsigned wrapping was what made the option
useless in practice.

But we thoroughly need to specify "nontrivial exit condition", if going
as far as non-constant exit condition, that is, only do {} while (1) is required
to be detected as infinite then this breaks down to unsigned wrapping IVs
not be infinite.  Otherwise it requires the compiler to be able to correctly
analyze all unsigned IVs which I know we do not at the moment (SCEV
has limits).

So - any suggestion as to how define "nontrivial exit condition"?

> But the C++ standard went to the trouble of banning infinite loops without
> side effects specifically to enable DCE on this type of code... Actually,
> an infinite loop with a trivial exit condition looks like a great
> opportunity to use __builtin_unreachable() to me ;-) (I have been wanting
> a -fmain-does-return -fno-funny-business for years, since I looked at
> replacing some malloc with stack allocations, but that's all out of scope
> for this patch)
>
> Why exactly are we trying so hard to preserve no-side-effect, infinite
> loops? What are they good for? Note that reading an atomic or volatile
> variable counts as a side effect for this purpose. It looks like some kind
> of busy waiting, but without checking a flag, so it can only be stopped by
> some external action (say a signal), so if the OS has any notion of sleep
> for a thread, blocking would be better. Or maybe it is running through a
> circular list, ensuring that it stays in RAM? Anyway it seems specific
> enough that that strange code should be the one needing an annotation.

I guess we preserve them because we have to?

I suppose we could add a flag that allows us to elide
loops with no side-effect and non-constant exit condition
(so only preserve do{}while (1))?

Not sure where that would fit best though - certainly not
in niter / IV analysis but in CD-DCE then?

Richard.

> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20  7:50       ` Richard Biener
@ 2019-05-20  8:27         ` Feng Xue OS
  2019-05-20  9:19           ` Richard Biener
  2019-05-20 13:04         ` [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713) Marc Glisse
  1 sibling, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-05-20  8:27 UTC (permalink / raw)
  To: Richard Biener, gcc-patches; +Cc: Jeff Law


>>
>> IIUC that was slightly different: "This option tells the loop optimizer to
>> assume that loop indices do not overflow, and that loops with nontrivial
>> exit condition are not infinite."
>>
>> The assumption on indices looks unsafe indeed if it applied to unsigned
>> indices in non-empty loops.

> The question is of couse what a "nontrivial exit condition" is.  Indeed
> the general handling of unsigned wrapping was what made the option
> useless in practice.

> But we thoroughly need to specify "nontrivial exit condition", if going
> as far as non-constant exit condition, that is, only do {} while (1) is required
> to be detected as infinite then this breaks down to unsigned wrapping IVs
> not be infinite.  Otherwise it requires the compiler to be able to correctly
> analyze all unsigned IVs which I know we do not at the moment (SCEV
> has limits).

> So - any suggestion as to how define "nontrivial exit condition"?

>>
>> Why exactly are we trying so hard to preserve no-side-effect, infinite
>> loops? What are they good for? Note that reading an atomic or volatile
>> variable counts as a side effect for this purpose. It looks like some kind
>> of busy waiting, but without checking a flag, so it can only be stopped by
>> some external action (say a signal), so if the OS has any notion of sleep
>> for a thread, blocking would be better. Or maybe it is running through a
>> circular list, ensuring that it stays in RAM? Anyway it seems specific
>> enough that that strange code should be the one needing an annotation.

> I guess we preserve them because we have to?

> I suppose we could add a flag that allows us to elide
> loops with no side-effect and non-constant exit condition
> (so only preserve do{}while (1))?

> Not sure where that would fit best though - certainly not
> in niter / IV analysis but in CD-DCE then?

Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.

Moreover, in CD-DCE, if a loop contains side-effect statements, w/o finiteness assertion does not trigger loop removal.

Feng

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20  8:27         ` Feng Xue OS
@ 2019-05-20  9:19           ` Richard Biener
  2019-05-20  9:48             ` Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-05-20  9:19 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: gcc-patches, Jeff Law

On Mon, May 20, 2019 at 10:27 AM Feng Xue OS
<fxue@os.amperecomputing.com> wrote:
>
>
> >>
> >> IIUC that was slightly different: "This option tells the loop optimizer to
> >> assume that loop indices do not overflow, and that loops with nontrivial
> >> exit condition are not infinite."
> >>
> >> The assumption on indices looks unsafe indeed if it applied to unsigned
> >> indices in non-empty loops.
>
> > The question is of couse what a "nontrivial exit condition" is.  Indeed
> > the general handling of unsigned wrapping was what made the option
> > useless in practice.
>
> > But we thoroughly need to specify "nontrivial exit condition", if going
> > as far as non-constant exit condition, that is, only do {} while (1) is required
> > to be detected as infinite then this breaks down to unsigned wrapping IVs
> > not be infinite.  Otherwise it requires the compiler to be able to correctly
> > analyze all unsigned IVs which I know we do not at the moment (SCEV
> > has limits).
>
> > So - any suggestion as to how define "nontrivial exit condition"?
>
> >>
> >> Why exactly are we trying so hard to preserve no-side-effect, infinite
> >> loops? What are they good for? Note that reading an atomic or volatile
> >> variable counts as a side effect for this purpose. It looks like some kind
> >> of busy waiting, but without checking a flag, so it can only be stopped by
> >> some external action (say a signal), so if the OS has any notion of sleep
> >> for a thread, blocking would be better. Or maybe it is running through a
> >> circular list, ensuring that it stays in RAM? Anyway it seems specific
> >> enough that that strange code should be the one needing an annotation.
>
> > I guess we preserve them because we have to?
>
> > I suppose we could add a flag that allows us to elide
> > loops with no side-effect and non-constant exit condition
> > (so only preserve do{}while (1))?
>
> > Not sure where that would fit best though - certainly not
> > in niter / IV analysis but in CD-DCE then?
>
> Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.

Is that so?  The early pipeline contains a CD-DCE pass as well.  Note we also
have pure/const discovery affected by this.

> Moreover, in CD-DCE, if a loop contains side-effect statements, w/o finiteness assertion does not trigger loop removal.

That's of course true.

Now we still need to define "non-trivial exit condition" and a way to
actually test for that.

Richard.

> Feng
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20  9:19           ` Richard Biener
@ 2019-05-20  9:48             ` Feng Xue OS
  2019-05-20 11:54               ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-05-20  9:48 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jeff Law

>> Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.

> Is that so?  The early pipeline contains a CD-DCE pass as well.  Note we also
> have pure/const discovery affected by this.

I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.

>> Now we still need to define "non-trivial exit condition" and a way to
actually test for that.

Still not very clear on "non-trival exit condition". I think if a loop contains non-EH exit, it is terminable.

Feng


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20  9:48             ` Feng Xue OS
@ 2019-05-20 11:54               ` Richard Biener
  2019-05-20 14:00                 ` Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-05-20 11:54 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: gcc-patches, Jeff Law

On Mon, May 20, 2019 at 11:48 AM Feng Xue OS
<fxue@os.amperecomputing.com> wrote:
>
> >> Now finiteness assertion is only used in a very late CD-DCE, which is located after all loop optimizations are done. And we can even place it as late as just before RTL-expansion. This might be safe enough to let hidden infinite loops exposed.
>
>
> > Is that so?  The early pipeline contains a CD-DCE pass as well.  Note we also
> > have pure/const discovery affected by this.
>
> I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.

Oh, but why not generally do this?

> >> Now we still need to define "non-trivial exit condition" and a way to
> actually test for that.
>
> Still not very clear on "non-trival exit condition". I think if a loop contains non-EH exit, it is terminable.

Possibly, but that's not terminology for user-facing documentation ;)
I'd even say if a loop contains an exit
it is terminable (remove the non-EH qualification).

Richard.

> Feng
>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20  7:50       ` Richard Biener
  2019-05-20  8:27         ` Feng Xue OS
@ 2019-05-20 13:04         ` Marc Glisse
  2019-05-20 13:26           ` Richard Biener
  1 sibling, 1 reply; 45+ messages in thread
From: Marc Glisse @ 2019-05-20 13:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jeff Law, Feng Xue OS

On Mon, 20 May 2019, Richard Biener wrote:

> On Sat, May 18, 2019 at 4:00 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>>
>> (@Feng Xue, it is better to include testcases in your patches)
>>
>>>> I'm not a big fan of this patch.  I'd rather be looking at either
>>>> improving our analysis
>>
>> Better analysis cannot hurt.
>>
>>>> or adding attributes to the loops to help the analysis bits prove
>>>> termination.
>>
>> There may not be a loop in the source code, it may be a recursive function
>> call that gcc turned into a loop. Plus, I know that it applies to all
>> loops in my program.
>>
>> We could have a function to compute the length of a linked list:
>> struct A { A*p; };
>> unsigned l(A*a){return a?l(a->p)+1:0;}
>>
>> and because of other optimizations, it turns out that we do not actually
>> use this length
>> void g(A*a){l(a);}
>>
>> wouldn't it be nice if gcc could remove all that useless code, instead of
>> keeping a useless loop on the off chance that it might be infinite?
>>
>>> And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.
>>
>> IIUC that was slightly different: "This option tells the loop optimizer to
>> assume that loop indices do not overflow, and that loops with nontrivial
>> exit condition are not infinite."
>>
>> The assumption on indices looks unsafe indeed if it applied to unsigned
>> indices in non-empty loops.
>
> The question is of couse what a "nontrivial exit condition" is.  Indeed
> the general handling of unsigned wrapping was what made the option
> useless in practice.
>
> But we thoroughly need to specify "nontrivial exit condition", if going
> as far as non-constant exit condition, that is, only do {} while (1) is required
> to be detected as infinite then this breaks down to unsigned wrapping IVs
> not be infinite.  Otherwise it requires the compiler to be able to correctly
> analyze all unsigned IVs which I know we do not at the moment (SCEV
> has limits).

We also want to handle pointer-chasing loops (lists, trees), not 
specifically unsigned IV.

> So - any suggestion as to how define "nontrivial exit condition"?
>
>> But the C++ standard went to the trouble of banning infinite loops without
>> side effects specifically to enable DCE on this type of code... Actually,
>> an infinite loop with a trivial exit condition looks like a great
>> opportunity to use __builtin_unreachable() to me ;-) (I have been wanting
>> a -fmain-does-return -fno-funny-business for years, since I looked at
>> replacing some malloc with stack allocations, but that's all out of scope
>> for this patch)
>>
>> Why exactly are we trying so hard to preserve no-side-effect, infinite
>> loops? What are they good for? Note that reading an atomic or volatile
>> variable counts as a side effect for this purpose. It looks like some kind
>> of busy waiting, but without checking a flag, so it can only be stopped by
>> some external action (say a signal), so if the OS has any notion of sleep
>> for a thread, blocking would be better. Or maybe it is running through a
>> circular list, ensuring that it stays in RAM? Anyway it seems specific
>> enough that that strange code should be the one needing an annotation.
>
> I guess we preserve them because we have to?
>
> I suppose we could add a flag that allows us to elide
> loops with no side-effect and non-constant exit condition
> (so only preserve do{}while (1))?

The C++ standard says that do{}while(1) is __builtin_unreachable(), we 
don't have to preserve it. There is no mention of anything like a 
"nontrivial exit condition". Other languages may have a different opinion 
though, so it would probably need a flag indeed... But I am curious what 
the point of such a loop is.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20 13:04         ` [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713) Marc Glisse
@ 2019-05-20 13:26           ` Richard Biener
  2019-05-20 14:49             ` Michael Matz
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-05-20 13:26 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jeff Law, Feng Xue OS

On Mon, May 20, 2019 at 3:04 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> On Mon, 20 May 2019, Richard Biener wrote:
>
> > On Sat, May 18, 2019 at 4:00 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> >>
> >> (@Feng Xue, it is better to include testcases in your patches)
> >>
> >>>> I'm not a big fan of this patch.  I'd rather be looking at either
> >>>> improving our analysis
> >>
> >> Better analysis cannot hurt.
> >>
> >>>> or adding attributes to the loops to help the analysis bits prove
> >>>> termination.
> >>
> >> There may not be a loop in the source code, it may be a recursive function
> >> call that gcc turned into a loop. Plus, I know that it applies to all
> >> loops in my program.
> >>
> >> We could have a function to compute the length of a linked list:
> >> struct A { A*p; };
> >> unsigned l(A*a){return a?l(a->p)+1:0;}
> >>
> >> and because of other optimizations, it turns out that we do not actually
> >> use this length
> >> void g(A*a){l(a);}
> >>
> >> wouldn't it be nice if gcc could remove all that useless code, instead of
> >> keeping a useless loop on the off chance that it might be infinite?
> >>
> >>> And we had sth similar in the past and ended up removing it. -funsafe-loop-optimizations IIRC.
> >>
> >> IIUC that was slightly different: "This option tells the loop optimizer to
> >> assume that loop indices do not overflow, and that loops with nontrivial
> >> exit condition are not infinite."
> >>
> >> The assumption on indices looks unsafe indeed if it applied to unsigned
> >> indices in non-empty loops.
> >
> > The question is of couse what a "nontrivial exit condition" is.  Indeed
> > the general handling of unsigned wrapping was what made the option
> > useless in practice.
> >
> > But we thoroughly need to specify "nontrivial exit condition", if going
> > as far as non-constant exit condition, that is, only do {} while (1) is required
> > to be detected as infinite then this breaks down to unsigned wrapping IVs
> > not be infinite.  Otherwise it requires the compiler to be able to correctly
> > analyze all unsigned IVs which I know we do not at the moment (SCEV
> > has limits).
>
> We also want to handle pointer-chasing loops (lists, trees), not
> specifically unsigned IV.
>
> > So - any suggestion as to how define "nontrivial exit condition"?
> >
> >> But the C++ standard went to the trouble of banning infinite loops without
> >> side effects specifically to enable DCE on this type of code... Actually,
> >> an infinite loop with a trivial exit condition looks like a great
> >> opportunity to use __builtin_unreachable() to me ;-) (I have been wanting
> >> a -fmain-does-return -fno-funny-business for years, since I looked at
> >> replacing some malloc with stack allocations, but that's all out of scope
> >> for this patch)
> >>
> >> Why exactly are we trying so hard to preserve no-side-effect, infinite
> >> loops? What are they good for? Note that reading an atomic or volatile
> >> variable counts as a side effect for this purpose. It looks like some kind
> >> of busy waiting, but without checking a flag, so it can only be stopped by
> >> some external action (say a signal), so if the OS has any notion of sleep
> >> for a thread, blocking would be better. Or maybe it is running through a
> >> circular list, ensuring that it stays in RAM? Anyway it seems specific
> >> enough that that strange code should be the one needing an annotation.
> >
> > I guess we preserve them because we have to?
> >
> > I suppose we could add a flag that allows us to elide
> > loops with no side-effect and non-constant exit condition
> > (so only preserve do{}while (1))?
>
> The C++ standard says that do{}while(1) is __builtin_unreachable(), we
> don't have to preserve it. There is no mention of anything like a
> "nontrivial exit condition". Other languages may have a different opinion
> though, so it would probably need a flag indeed... But I am curious what
> the point of such a loop is.

busy wait until wakeup by signal or interrupt.

Richard.

> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20 11:54               ` Richard Biener
@ 2019-05-20 14:00                 ` Feng Xue OS
  2019-05-20 14:04                   ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-05-20 14:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jeff Law

>> I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.

> Oh, but why not generally do this?

It is nature that real finiteness of loop should override assumed one. As you said, CD-DCE could be invocated as an early pass. At that time, we might make mistake deduction on finiteness for a loop whose bound is in form of variable, but could be transformed to constant by later optimizations. So it is not "safe" to generally do this in CD-DCE. We need one late and specific pass.

Feng

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20 14:00                 ` Feng Xue OS
@ 2019-05-20 14:04                   ` Richard Biener
  2019-05-20 14:51                     ` Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-05-20 14:04 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: gcc-patches, Jeff Law

On Mon, May 20, 2019 at 4:00 PM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> >> I specialized a CD-DCE pass, named CD-DCE2, and only in this pass, loop removal based on assumed finteness is performed. Please check the patch.
>
>
> > Oh, but why not generally do this?
>
> It is nature that real finiteness of loop should override assumed one. As you said, CD-DCE could be invocated as an early pass. At that time, we might make mistake deduction on finiteness for a loop whose bound is in form of variable, but could be transformed to constant by later optimizations. So it is not "safe" to generally do this in CD-DCE. We need one late and specific pass.

I don't see how it is safe in a late pass when it is not safe in an
earlier one.  Optimization is imperfect - we could fail to remove
an "obvious" never taken exit and still have a loop that appears to be
finite according to our definition.  The only way
to define it would be if there was, at any point, an exit from the
loop (and there it _may_ be exclude EH edges) then
the loop is assumed to be finite.

Richard.

> Feng

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20 13:26           ` Richard Biener
@ 2019-05-20 14:49             ` Michael Matz
  2019-05-21  8:06               ` Marc Glisse
  0 siblings, 1 reply; 45+ messages in thread
From: Michael Matz @ 2019-05-20 14:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jeff Law, Feng Xue OS

Hi,

On Mon, 20 May 2019, Richard Biener wrote:

> > The C++ standard says that do{}while(1) is __builtin_unreachable(), we 
> > don't have to preserve it. There is no mention of anything like a 
> > "nontrivial exit condition". Other languages may have a different 
> > opinion though, so it would probably need a flag indeed... But I am 
> > curious what the point of such a loop is.
> 
> busy wait until wakeup by signal or interrupt.

I'd actually turn it around from what C++ says.  If the user wrote, as 
is, "do{}while(1);" or "while(1);" or "for(;;);" then we can assume 
something funky going on and not remove the loop.  For any other loop we 
assume that they are finite.  I.e. we mark loops as to-be-preserved (which 
we set on a few known patterns), and just remove all other loops when they 
contain no observable side effects after optimization.

And of course we'd still have to determine what acceptable side effects 
are.  E.g. in a pointer chasing loop containing no body, is the 
segfault when the pointer chain is not in fact circular, a side effect we 
should retain, or should we be allowed to remove the loop?  I'd say we 
should remove the loop, of course.

(And yes, I've always found our obsession with preserving infinite loops, 
outside of the above "obvious" cases, overly anal as well)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20 14:04                   ` Richard Biener
@ 2019-05-20 14:51                     ` Feng Xue OS
  2019-05-21 10:12                       ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-05-20 14:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jeff Law

> I don't see how it is safe in a late pass when it is not safe in an

> earlier one.  Optimization is imperfect - we could fail to remove
> an "obvious" never taken exit and still have a loop that appears to be
> finite according to our definition.

Yes. it is. This is somewhat similar to strict-alias option/loop dep pragma.
Compiler tries to do something based on hint you tell it, but does not ensure correctness.

> The only way
> to define it would be if there was, at any point, an exit from the
> loop (and there it _may_ be exclude EH edges) then
> the loop is assumed to be finite.

No catch your point. If we treat an infinite loop as finite, it's bad because the loop might be removed.

Suppose we have a function:

    void foo(int bound)
     { for (int i = 0; i <= bound; i++); }

 In an early CD-DCE pass, "bound" is represented as a variable, and loop has a exit, so it is assumed to finite, and is removed.

But in a late pass, this function is inlined into another one, and "bound" has value of INT_MAX, this loop is infinite, and here we can know it should not be removed.

This is why I suggest doing the optimization as late as possible.

Feng

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20 14:49             ` Michael Matz
@ 2019-05-21  8:06               ` Marc Glisse
  0 siblings, 0 replies; 45+ messages in thread
From: Marc Glisse @ 2019-05-21  8:06 UTC (permalink / raw)
  To: Michael Matz; +Cc: Richard Biener, GCC Patches, Jeff Law, Feng Xue OS

On Mon, 20 May 2019, Michael Matz wrote:

> On Mon, 20 May 2019, Richard Biener wrote:
>
>>> The C++ standard says that do{}while(1) is __builtin_unreachable(), we
>>> don't have to preserve it. There is no mention of anything like a
>>> "nontrivial exit condition". Other languages may have a different
>>> opinion though, so it would probably need a flag indeed... But I am
>>> curious what the point of such a loop is.
>>
>> busy wait until wakeup by signal or interrupt.
>
> I'd actually turn it around from what C++ says.  If the user wrote, as
> is, "do{}while(1);" or "while(1);" or "for(;;);" then we can assume
> something funky going on and not remove the loop.  For any other loop we
> assume that they are finite.  I.e. we mark loops as to-be-preserved (which
> we set on a few known patterns), and just remove all other loops when they
> contain no observable side effects after optimization.

Seems sensible, although marking the trivial infinite loops in gimple 
seems simpler than doing it in the front-ends, and a good enough 
approximation unless we are willing to replace some other infinite loops 
with unreachable (or trap).

> And of course we'd still have to determine what acceptable side effects
> are.  E.g. in a pointer chasing loop containing no body, is the
> segfault when the pointer chain is not in fact circular, a side effect we
> should retain, or should we be allowed to remove the loop?  I'd say we
> should remove the loop, of course.

That may depend on flags like -fnon-call-exceptions (maybe not the right 
one) I guess, although I would also want the removal to happen in as many 
cases as possible. We do usually remove memory reads if the value read is 
unused.

> (And yes, I've always found our obsession with preserving infinite loops,
> outside of the above "obvious" cases, overly anal as well)

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-20 14:51                     ` Feng Xue OS
@ 2019-05-21 10:12                       ` Richard Biener
  2019-05-21 14:24                         ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-05-21 10:12 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: gcc-patches, Jeff Law

On Mon, May 20, 2019 at 4:51 PM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> > I don't see how it is safe in a late pass when it is not safe in an
>
> > earlier one.  Optimization is imperfect - we could fail to remove
> > an "obvious" never taken exit and still have a loop that appears to be
> > finite according to our definition.
>
> Yes. it is. This is somewhat similar to strict-alias option/loop dep pragma.
> Compiler tries to do something based on hint you tell it, but does not ensure correctness.
>
> > The only way
> > to define it would be if there was, at any point, an exit from the
> > loop (and there it _may_ be exclude EH edges) then
> > the loop is assumed to be finite.
>
> No catch your point. If we treat an infinite loop as finite, it's bad because the loop might be removed.
>
> Suppose we have a function:
>
>     void foo(int bound)
>      { for (int i = 0; i <= bound; i++); }
>
>  In an early CD-DCE pass, "bound" is represented as a variable, and loop has a exit, so it is assumed to finite, and is removed.
>
> But in a late pass, this function is inlined into another one, and "bound" has value of INT_MAX, this loop is infinite, and here we can know it should not be removed.

But if "bound" is always INT_MAX but that's not visible to the
compiler we will still remove the
loop so I see no difference with removing it always.

> This is why I suggest doing the optimization as late as possible.

But this will defeat the purpose of allowing followup optimizations.

IMHO the only "sensible" thing is to do

Index: gcc/tree-ssa-dce.c
===================================================================
--- gcc/tree-ssa-dce.c  (revision 271415)
+++ gcc/tree-ssa-dce.c  (working copy)
@@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
          }

       FOR_EACH_LOOP (loop, 0)
-       if (!finite_loop_p (loop))
+       if (!loop_has_exit_edges (loop))
          {
            if (dump_file)
              fprintf (dump_file, "cannot prove finiteness of loop
%i\n", loop->num);

that also has the obvious advantage that we don't need to replace the loop
with a trap() but have a place to forward control flow to.  The loop in the
following testcase is then successfully removed:

int main(int argc, char **argv)
{
  unsigned i = argc;
  while (i+=2);
  return 0;
}

Likewise is the loop

void **q;
int main(int argc, char **argv)
{
  void **p = q;
  while (p = (void **)*p);
  return 0;
}

(that's the pointer-chasing).  Not with -fnon-call-exceptions
-fexceptions though.

Richard.

> Feng
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-21 10:12                       ` Richard Biener
@ 2019-05-21 14:24                         ` Richard Biener
  2019-05-22 13:44                           ` Michael Matz
  2019-05-24  9:15                           ` [PATCH V2] " Feng Xue OS
  0 siblings, 2 replies; 45+ messages in thread
From: Richard Biener @ 2019-05-21 14:24 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: gcc-patches, Jeff Law

On Tue, May 21, 2019 at 12:12 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, May 20, 2019 at 4:51 PM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
> >
> > > I don't see how it is safe in a late pass when it is not safe in an
> >
> > > earlier one.  Optimization is imperfect - we could fail to remove
> > > an "obvious" never taken exit and still have a loop that appears to be
> > > finite according to our definition.
> >
> > Yes. it is. This is somewhat similar to strict-alias option/loop dep pragma.
> > Compiler tries to do something based on hint you tell it, but does not ensure correctness.
> >
> > > The only way
> > > to define it would be if there was, at any point, an exit from the
> > > loop (and there it _may_ be exclude EH edges) then
> > > the loop is assumed to be finite.
> >
> > No catch your point. If we treat an infinite loop as finite, it's bad because the loop might be removed.
> >
> > Suppose we have a function:
> >
> >     void foo(int bound)
> >      { for (int i = 0; i <= bound; i++); }
> >
> >  In an early CD-DCE pass, "bound" is represented as a variable, and loop has a exit, so it is assumed to finite, and is removed.
> >
> > But in a late pass, this function is inlined into another one, and "bound" has value of INT_MAX, this loop is infinite, and here we can know it should not be removed.
>
> But if "bound" is always INT_MAX but that's not visible to the
> compiler we will still remove the
> loop so I see no difference with removing it always.
>
> > This is why I suggest doing the optimization as late as possible.
>
> But this will defeat the purpose of allowing followup optimizations.
>
> IMHO the only "sensible" thing is to do
>
> Index: gcc/tree-ssa-dce.c
> ===================================================================
> --- gcc/tree-ssa-dce.c  (revision 271415)
> +++ gcc/tree-ssa-dce.c  (working copy)
> @@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
>           }
>
>        FOR_EACH_LOOP (loop, 0)
> -       if (!finite_loop_p (loop))
> +       if (!loop_has_exit_edges (loop))
>           {
>             if (dump_file)
>               fprintf (dump_file, "cannot prove finiteness of loop
> %i\n", loop->num);

Bootstrapped / tested on x86_64-unknown-linux-gnu.  Fallout:

FAIL: gcc.dg/loop-unswitch-1.c scan-tree-dump unswitch ";; Unswitching loop"
FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
match heuristics: 2.20%" 3
FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
match heuristics: 5.50%" 1
FAIL: gcc.dg/uninit-28-gimple.c  (test for bogus messages, line 9)
FAIL: gcc.dg/graphite/scop-19.c scan-tree-dump-times graphite "number
of SCoPs: 0" 2
...
UNRESOLVED: gcc.dg/tree-ssa/20040211-1.c scan-tree-dump cddce2 "if "
FAIL: gcc.dg/tree-ssa/loop-10.c scan-tree-dump-times optimized "if " 3
FAIL: gcc.dg/tree-ssa/pr84648.c scan-tree-dump-times cddce1 "Found
loop 1 to be finite: upper bound found" 1
FAIL: gcc.dg/tree-ssa/split-path-6.c scan-tree-dump-times split-paths
"Duplicating join block" 3
FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread2 "FSM"
FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread3 "FSM"

I didn't look if the testcases are sensible for loop removal (or what
actually happens).

Richard.

> that also has the obvious advantage that we don't need to replace the loop
> with a trap() but have a place to forward control flow to.  The loop in the
> following testcase is then successfully removed:
>
> int main(int argc, char **argv)
> {
>   unsigned i = argc;
>   while (i+=2);
>   return 0;
> }
>
> Likewise is the loop
>
> void **q;
> int main(int argc, char **argv)
> {
>   void **p = q;
>   while (p = (void **)*p);
>   return 0;
> }
>
> (that's the pointer-chasing).  Not with -fnon-call-exceptions
> -fexceptions though.
>
> Richard.
>
> > Feng
> >

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-21 14:24                         ` Richard Biener
@ 2019-05-22 13:44                           ` Michael Matz
  2019-05-24 16:02                             ` [PATCH V3] " Feng Xue OS
  2019-05-24  9:15                           ` [PATCH V2] " Feng Xue OS
  1 sibling, 1 reply; 45+ messages in thread
From: Michael Matz @ 2019-05-22 13:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: Feng Xue OS, gcc-patches, Jeff Law

Hi,

On Tue, 21 May 2019, Richard Biener wrote:

> > Index: gcc/tree-ssa-dce.c
> > ===================================================================
> > --- gcc/tree-ssa-dce.c  (revision 271415)
> > +++ gcc/tree-ssa-dce.c  (working copy)
> > @@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool agg
> >           }
> >
> >        FOR_EACH_LOOP (loop, 0)
> > -       if (!finite_loop_p (loop))
> > +       if (!loop_has_exit_edges (loop))
> >           {
> >             if (dump_file)
> >               fprintf (dump_file, "cannot prove finiteness of loop
> > %i\n", loop->num);
> 
> Bootstrapped / tested on x86_64-unknown-linux-gnu.  Fallout:
> 
> FAIL: gcc.dg/loop-unswitch-1.c scan-tree-dump unswitch ";; Unswitching loop"
> FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
> match heuristics: 2.20%" 3
> FAIL: gcc.dg/predict-9.c scan-tree-dump-times profile_estimate "first
> match heuristics: 5.50%" 1

These contain infinite loops without other sideeffects.

> FAIL: gcc.dg/graphite/scop-19.c scan-tree-dump-times graphite "number
> of SCoPs: 0" 2

Loop without sideeffects.

> ...
> UNRESOLVED: gcc.dg/tree-ssa/20040211-1.c scan-tree-dump cddce2 "if "

why unresolved?  Anyway, conditionally infinite loop, but without side 
effects.

> FAIL: gcc.dg/tree-ssa/loop-10.c scan-tree-dump-times optimized "if " 3

Probably removes one more loop, which is conditionally infinite, but no 
side-effects.

> FAIL: gcc.dg/tree-ssa/pr84648.c scan-tree-dump-times cddce1 "Found
> loop 1 to be finite: upper bound found" 1

finite, no side-effect.

> FAIL: gcc.dg/tree-ssa/split-path-6.c scan-tree-dump-times split-paths
> "Duplicating join block" 3

AFAICS all loops therein contain side-effects, though the one in 
lookharder() only an invalid one, which might be optimized away, so that 
might be it.  But this would need to be looked at.

> FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread2 "FSM"
> FAIL: gcc.dg/tree-ssa/ssa-thread-12.c scan-tree-dump thread3 "FSM"

If everything is properly inlined, this contains two nested 
side-effect-free loops.

> FAIL: gcc.dg/uninit-28-gimple.c  (test for bogus messages, line 9)

But this one doesn't contain a loop at all!?

> I didn't look if the testcases are sensible for loop removal (or what
> actually happens).

IMHO most testcases above (perhaps except gcc.dg/uninit-28-gimple.c and 
tree-ssa/split-path-6.c) are sensible for loop removal if mere memory 
accesses and possible infiniteness don't count as side-effects.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH V2] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-21 14:24                         ` Richard Biener
  2019-05-22 13:44                           ` Michael Matz
@ 2019-05-24  9:15                           ` Feng Xue OS
  2019-05-29 11:16                             ` Richard Biener
  1 sibling, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-05-24  9:15 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jeff Law

This version is based on the proposal of Richard. And fix a bug on OpenACC loop when this opt is turned on.
Also add some test cases

Feng
-----

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9f0f889..d1c1e3a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-05-23  Feng Xue  <fxue@os.amperecomputing.com>
+
+ PR tree-optimization/89713
+ * doc/invoke.texi (-ffinite-loop): Document new option.
+ * common.opt (-ffinite-loop): New option.
+ * tree-ssa-dce.c (loop_has_true_exits): New function.
+ (mark_stmt_if_obviously_necessary): Mark IFN_GOACC_LOOP
+ calls as neccessary.
+ (find_obviously_necessary_stmts): Use flag to control
+ removal of a loop with assumed finiteness.
+ (eliminate_unnecessary_stmts): Do not delete dead result
+ of IFN_GOACC_LOOP calls.
+
 2019-05-22  David Malcolm  <dmalcolm@redhat.com>

  PR c++/90462
diff --git a/gcc/common.opt b/gcc/common.opt
index d342c4f..e98a34d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.

+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization
+Assume loops are finite if can not be analytically determined.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register> Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6c89843..caa0852 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -9501,6 +9502,20 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.

+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Allow the compiler to assume that if finiteness of a loop can not be
+analytically determined, the loop must be finite. With the assumption, some
+aggressive transformation could be possible, such as removal of this kind
+of empty loop by dead code elimination (DCE).
+
+This option is not turned on by any @option{-O} option since it might result
+in incorrect behaviour for programs that contain seemly finite, but actually
+infinite loop.
+
+The default is @option{-fno-finite-loop}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..e374155
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..ffca49c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
+
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..bb143a3 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -245,6 +245,18 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
      mark_stmt_necessary (stmt, true);
      return;
    }
+        /* IFN_GOACC_LOOP calls are neccessary in that they are used to
+           represent parameter (i.e. step, bound) of a lowered OpenACC
+           partitioned loop. But this kind of partitioned loop might not
+           survive from aggressive loop removal for it has loop exit and
+           is assumed to be finite. Therefore, we need to explictly mark
+           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
+        if (gimple_call_internal_p (stmt)
+            && gimple_call_internal_fn (stmt) == IFN_GOACC_LOOP)
+          {
+            mark_stmt_necessary (stmt, true);
+            return;
+          }
  if (!gimple_call_lhs (stmt))
    return;
  break;
@@ -357,6 +369,28 @@ mark_control_dependent_edges_necessary (basic_block bb, bool ignore_self)
 }


+/* Check whether a loop has any non-EH exit. */
+
+static bool
+loop_has_true_exits (const struct loop *loop)
+{
+  vec<edge> exits = get_loop_exit_edges (loop);
+  bool found = false;
+  edge e;
+  unsigned i;
+
+  FOR_EACH_VEC_ELT (exits, i, e)
+    if (!(e->flags & EDGE_EH))
+      {
+        found = true;
+        break;
+      }
+
+  exits.release ();
+  return found;
+}
+
+
 /* Find obviously necessary statements.  These are things like most function
    calls, and stores to file level variables.

@@ -417,8 +451,15 @@ find_obviously_necessary_stmts (bool aggressive)
    }

       FOR_EACH_LOOP (loop, 0)
- if (!finite_loop_p (loop))
+        if (!finite_loop_p (loop))
    {
+            if (flag_finite_loop && loop_has_true_exits (loop))
+              {
+                if (dump_file)
+                  fprintf (dump_file, "assume loop %i to be finite\n", loop->num);
+                continue;
+              }
+
      if (dump_file)
        fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num);
      mark_control_dependent_edges_necessary (loop->latch, false);
@@ -1313,7 +1354,12 @@ eliminate_unnecessary_stmts (void)
    && DECL_FUNCTION_CODE (call) != BUILT_IN_MALLOC
    && DECL_FUNCTION_CODE (call) != BUILT_IN_CALLOC
    && !ALLOCA_FUNCTION_CODE_P
-       (DECL_FUNCTION_CODE (call)))))
+       (DECL_FUNCTION_CODE (call))))
+   /* Avoid doing so for OpenACC abstraction calls
+      (IFN_GOACC_LOOP), because later pass that lowers those
+      calls need to access lhs of calls. */
+   && (!gimple_call_internal_p (stmt)
+       || gimple_call_internal_fn (stmt) != IFN_GOACC_LOOP))
  {
    something_changed = true;
    if (dump_file && (dump_flags & TDF_DETAILS))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..845268b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile }  */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+ ;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+ ;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH V3] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-22 13:44                           ` Michael Matz
@ 2019-05-24 16:02                             ` Feng Xue OS
  0 siblings, 0 replies; 45+ messages in thread
From: Feng Xue OS @ 2019-05-24 16:02 UTC (permalink / raw)
  To: Michael Matz, Richard Biener; +Cc: gcc-patches, Jeff Law

Doing aggressive loop removal at early CD-DCE might break OpenACC transformation. To the issue, this version 3 gave a somewhat simpler solution different from previous version 2. It uses a flag to postpone loop removal to late CD-DCE pass.

Feng
---------

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9f0f889..f18abdc 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2019-05-23  Feng Xue  <fxue@os.amperecomputing.com>
+
+	PR tree-optimization/89713
+	* doc/invoke.texi (-ffinite-loop): Document new option.
+	* common.opt (-ffinite-loop): New option.
+	* tree-ssa-dce.c (loop_has_true_exits): New function.
+	(find_obviously_necessary_stmts): Use flag to control
+	removal of a loop with assumed finiteness. 
+	* tree-ssa-loop.c (tree_ssa_loop_done): Update
+	flag_finite_loop.
+
 2019-05-22  David Malcolm  <dmalcolm@redhat.com>
 
 	PR c++/90462
diff --git a/gcc/common.opt b/gcc/common.opt
index d342c4f..e98a34d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization
+Assume loops are finite if can not be analytically determined.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6c89843..caa0852 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -9501,6 +9502,20 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Allow the compiler to assume that if finiteness of a loop can not be
+analytically determined, the loop must be finite. With the assumption, some
+aggressive transformation could be possible, such as removal of this kind
+of empty loop by dead code elimination (DCE).
+
+This option is not turned on by any @option{-O} option since it might result
+in incorrect behaviour for programs that contain seemly finite, but actually
+infinite loop.
+
+The default is @option{-fno-finite-loop}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..f6564a2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce3 -ffinite-loop" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }  
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce3"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..d9b9fd2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce3 -ffinite-loop" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce3"} } */
+
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..25f722f 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -357,6 +357,28 @@ mark_control_dependent_edges_necessary (basic_block bb, bool ignore_self)
 }
 
 
+/* Check whether a loop has any non-EH exit. */
+
+static bool
+loop_has_true_exits (const struct loop *loop)
+{
+  vec<edge> exits = get_loop_exit_edges (loop);
+  bool found = false;
+  edge e;
+  unsigned i;
+
+  FOR_EACH_VEC_ELT (exits, i, e)
+    if (!(e->flags & EDGE_EH))
+      {
+        found = true;
+        break;
+      }
+
+  exits.release ();
+  return found;
+}
+
+
 /* Find obviously necessary statements.  These are things like most function
    calls, and stores to file level variables.
 
@@ -419,6 +441,25 @@ find_obviously_necessary_stmts (bool aggressive)
       FOR_EACH_LOOP (loop, 0)
 	if (!finite_loop_p (loop))
 	  {
+	    /* If done at early CD-DCE pass, aggressive loop removal using
+	       assumed finiteness could break OpenACC transformation. It might
+	       incorrectly change or even remove IFN_GOACC_LOOP calls, which
+	       represent parameters (i.e. step, bound) of an abstract OpenACC
+	       partitioned loop, therefore are required by OpenACC loop
+	       offload lowering. Here we need a way to postpone aggressive
+	       loop removal to late pass.
+
+	       When flag_finite_loop is 1, it means the optimization is
+	       enabled, but time is not ready. Once ready, flag_finite_loop
+	       will be updated with larger integer somewhere, and loop
+	       removal is truely allowed. */
+	    if (flag_finite_loop > 1 && loop_has_true_exits (loop))
+	      {
+		if (dump_file)
+		  fprintf (dump_file, "assume loop %i to be finite\n", loop->num);
+		continue;
+	      }
+
 	    if (dump_file)
 	      fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num);
 	    mark_control_dependent_edges_necessary (loop->latch, false);
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 1ac6cee..4175f9e 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -527,6 +527,11 @@ make_pass_iv_optimize (gcc::context *ctxt)
 static unsigned int
 tree_ssa_loop_done (void)
 {
+  /* After all loop optimizations are done, increase flag_finite_loop to
+     truely allow aggressive loop removal in CD-DCE pass. */
+  if (flag_finite_loop)
+    flag_finite_loop++;
+
   free_numbers_of_iterations_estimates (cfun);
   scev_finalize ();
   loop_optimizer_finalize ();
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..65a34d8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile }  */
+/* { dg-options "-O2 -fdump-tree-cddce3 -ffinite-loop" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+	;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+	;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce3"} } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V2] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-24  9:15                           ` [PATCH V2] " Feng Xue OS
@ 2019-05-29 11:16                             ` Richard Biener
  2019-06-04  6:49                               ` [PATCH V4] " Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-05-29 11:16 UTC (permalink / raw)
  To: Feng Xue OS, Thomas Schwinge; +Cc: gcc-patches, Jeff Law

On Fri, May 24, 2019 at 11:15 AM Feng Xue OS
<fxue@os.amperecomputing.com> wrote:
>
> This version is based on the proposal of Richard. And fix a bug on OpenACC loop when this opt is turned on.
> Also add some test cases

Comments below.

> Feng
> -----
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 9f0f889..d1c1e3a 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,16 @@
> +2019-05-23  Feng Xue  <fxue@os.amperecomputing.com>
> +
> + PR tree-optimization/89713
> + * doc/invoke.texi (-ffinite-loop): Document new option.
> + * common.opt (-ffinite-loop): New option.
> + * tree-ssa-dce.c (loop_has_true_exits): New function.
> + (mark_stmt_if_obviously_necessary): Mark IFN_GOACC_LOOP
> + calls as neccessary.
> + (find_obviously_necessary_stmts): Use flag to control
> + removal of a loop with assumed finiteness.
> + (eliminate_unnecessary_stmts): Do not delete dead result
> + of IFN_GOACC_LOOP calls.
> +
>  2019-05-22  David Malcolm  <dmalcolm@redhat.com>
>
>   PR c++/90462
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d342c4f..e98a34d 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1437,6 +1437,10 @@ ffinite-math-only
>  Common Report Var(flag_finite_math_only) Optimization SetByCombined
>  Assume no NaNs or infinities are generated.
>
> +ffinite-loop

I think it should be -ffinite-loops (plural)

> +Common Report Var(flag_finite_loop) Optimization
> +Assume loops are finite if can not be analytically determined.

This is a too vague description.  I'd rather write
'Assume that loops with an exit will terminate and not loop indefinitely.'

> +
>  ffixed-
>  Common Joined RejectNegative Var(common_deferred_options) Defer
>  -ffixed-<register> Mark <register> as being unavailable to the compiler.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 6c89843..caa0852 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fdevirtualize-at-ltrans  -fdse @gol
>  -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
>  -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
> +-ffinite-loop @gol
>  -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
>  -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
>  -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
> @@ -9501,6 +9502,20 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
>  enabled by default at @option{-O2} and higher if @option{-Os} is not also
>  specified.
>
> +@item -ffinite-loop
> +@opindex ffinite-loop
> +@opindex fno-finite-loop
> +Allow the compiler to assume that if finiteness of a loop can not be
> +analytically determined, the loop must be finite. With the assumption, some
> +aggressive transformation could be possible, such as removal of this kind
> +of empty loop by dead code elimination (DCE).

"Assume that a loop with an exit will eventually take the exit and not loop
indefinitely.  This allows the compiler to remove loops that otherwise have
no side-effects, not considering eventual endless looping as such."

> +This option is not turned on by any @option{-O} option since it might result
> +in incorrect behaviour for programs that contain seemly finite, but actually
> +infinite loop.

I think we should turn this option on by default, document that and note
that some languages (C++) say loops terminate.

> +
> +The default is @option{-fno-finite-loop}.
> +
>  @item -ftree-dominator-opts
>  @opindex ftree-dominator-opts
>  Perform a variety of simple scalar cleanups (constant/copy
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> new file mode 100644
> index 0000000..e374155
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
> +
> +#include <string>
> +#include <vector>
> +#include <list>
> +#include <set>
> +#include <map>
> +
> +using namespace std;
> +
> +int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
> +{
> +  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
> +    it->length();
> +
> +  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
> +    it->length();
> +
> +  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
> +    it->first + it->second.length();
> +
> +  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
> +    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
> +      {
> +        it0->length();
> +        it1->length();
> +      }
> +
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> new file mode 100644
> index 0000000..ffca49c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
> +
> +typedef struct list {
> +    char pad[15];
> +    struct list *next;
> +} list;
> +
> +int data;
> +
> +list *head, *tail;
> +
> +int __attribute__((pure)) pfn (int);
> +
> +int foo (unsigned u, int s)
> +{
> +  unsigned i;
> +  list *p;
> +  int j;
> +
> +  for (i = 0; i < u; i += 2)
> +    ;
> +
> +  for (p = head; p; p = p->next)
> +    ;
> +
> +  for (j = data; j & s; j = pfn (j + 3))
> +    ;
> +
> +  for (p = head; p != tail; p = p->next)
> +    for (j = data + 1; j > s; j = pfn (j + 2))
> +      ;
> +
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
> +
> diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
> index 2478219..bb143a3 100644
> --- a/gcc/tree-ssa-dce.c
> +++ b/gcc/tree-ssa-dce.c
> @@ -245,6 +245,18 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
>       mark_stmt_necessary (stmt, true);
>       return;
>     }
> +        /* IFN_GOACC_LOOP calls are neccessary in that they are used to
> +           represent parameter (i.e. step, bound) of a lowered OpenACC
> +           partitioned loop. But this kind of partitioned loop might not
> +           survive from aggressive loop removal for it has loop exit and
> +           is assumed to be finite. Therefore, we need to explictly mark
> +           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
> +        if (gimple_call_internal_p (stmt)
> +            && gimple_call_internal_fn (stmt) == IFN_GOACC_LOOP)
> +          {
> +            mark_stmt_necessary (stmt, true);
> +            return;
> +          }

Thomas - this looks like an artifact of "bad" testcases like the cited one?

>   if (!gimple_call_lhs (stmt))
>     return;
>   break;
> @@ -357,6 +369,28 @@ mark_control_dependent_edges_necessary (basic_block bb, bool ignore_self)
>  }
>
>
> +/* Check whether a loop has any non-EH exit. */
> +
> +static bool
> +loop_has_true_exits (const struct loop *loop)
> +{
> +  vec<edge> exits = get_loop_exit_edges (loop);
> +  bool found = false;
> +  edge e;
> +  unsigned i;
> +
> +  FOR_EACH_VEC_ELT (exits, i, e)
> +    if (!(e->flags & EDGE_EH))
> +      {
> +        found = true;
> +        break;
> +      }
> +
> +  exits.release ();
> +  return found;
> +}
> +
> +
>  /* Find obviously necessary statements.  These are things like most function
>     calls, and stores to file level variables.
>
> @@ -417,8 +451,15 @@ find_obviously_necessary_stmts (bool aggressive)
>     }
>
>        FOR_EACH_LOOP (loop, 0)
> - if (!finite_loop_p (loop))
> +        if (!finite_loop_p (loop))
>     {
> +            if (flag_finite_loop && loop_has_true_exits (loop))

finite_loop_p is a predicate that should honor flag_finite_loop so please
adjust that instead.  Simply inline loop_has_true_exits there at the
end with a comment explaining it.

> +              {
> +                if (dump_file)
> +                  fprintf (dump_file, "assume loop %i to be finite\n", loop->num);
> +                continue;
> +              }
> +
>       if (dump_file)
>         fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num);
>       mark_control_dependent_edges_necessary (loop->latch, false);
> @@ -1313,7 +1354,12 @@ eliminate_unnecessary_stmts (void)
>     && DECL_FUNCTION_CODE (call) != BUILT_IN_MALLOC
>     && DECL_FUNCTION_CODE (call) != BUILT_IN_CALLOC
>     && !ALLOCA_FUNCTION_CODE_P
> -       (DECL_FUNCTION_CODE (call)))))
> +       (DECL_FUNCTION_CODE (call))))
> +   /* Avoid doing so for OpenACC abstraction calls
> +      (IFN_GOACC_LOOP), because later pass that lowers those
> +      calls need to access lhs of calls. */
> +   && (!gimple_call_internal_p (stmt)
> +       || gimple_call_internal_fn (stmt) != IFN_GOACC_LOOP))

You can use gimple_call_internal_p (stmt, IFN_GOACC_LOOP)

Thomas?  This part looks OK to me but it seems lowering could deal with this
as well?

Thanks for working on this.

Richard.

>   {
>     something_changed = true;
>     if (dump_file && (dump_flags & TDF_DETAILS))
> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
> new file mode 100644
> index 0000000..845268b
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
> @@ -0,0 +1,31 @@
> +/* { dg-do compile }  */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
> +
> +int
> +f1 (void)
> +{
> +  int i, j;
> +
> +#pragma acc parallel loop tile(2,3)
> +  for (i = 1; i < 10; i++)
> +    for (j = 1; j < 10; j++)
> +      for (;;)
> + ;
> +
> +  return i + j;
> +}
> +
> +int
> +f2 (void)
> +{
> +  int i, j, k;
> +
> +#pragma acc parallel loop tile(2,3)
> +  for (i = 1; i < 10; i++)
> +    for (j = 1; j < 10; j++)
> +      for (k = 1; k < 10; k++)
> + ;
> +
> +  return i + j;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH V4] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-05-29 11:16                             ` Richard Biener
@ 2019-06-04  6:49                               ` Feng Xue OS
  2019-06-04  8:24                                 ` Marc Glisse
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-06-04  6:49 UTC (permalink / raw)
  To: Richard Biener, Thomas Schwinge; +Cc: gcc-patches, Jeff Law


> I think we should turn this option on by default, document that and note
> that some languages (C++) say loops terminate.

To enable this option at -O2 is not very suitable, seems to be more aggressive. Better to turn it on at -O3.

>> +   /* Avoid doing so for OpenACC abstraction calls
>> +      (IFN_GOACC_LOOP), because later pass that lowers those
>> +      calls need to access lhs of calls. */
>> +   && (!gimple_call_internal_p (stmt)
>> +       || gimple_call_internal_fn (stmt) != IFN_GOACC_LOOP))

> You can use gimple_call_internal_p (stmt, IFN_GOACC_LOOP)

> Thomas?  This part looks OK to me but it seems lowering could deal with this
> as well?

I remove the change here, and fix the problem in oacc lowering.

Feng

----
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37aab79..1ad2a6d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
+
+	PR tree-optimization/89713
+	* doc/invoke.texi (-ffinite-loop): Document new option.
+	* common.opt (-ffinite-loop): New option.
+	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
+	IFN_GOACC_LOOP calls as necessary.
+	* tree-ssa-loop-niter.c (finite_loop): Assume loop with an exit is
+	finite.
+	* omp-offload.c (oacc_xform_loop): Skip lowering if return value of
+	IFN_GOACC_LOOP call is not used.
+	* toplev.c (process_options): Enable -ffinite-loop by default at -O3.
+
 2019-06-04  Alan Modra  <amodra@gmail.com>
 
 	PR target/90689
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e72fd0..66a1ff2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization Init(-1)
+Assume that loops with an exit will terminate and not loop indefinitely.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 91c9bb8..0a36b6c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -8327,6 +8328,7 @@ by @option{-O2} and also turns on the following optimization flags:
 -ftree-loop-distribute-patterns @gol
 -ftree-loop-distribution @gol
 -ftree-loop-vectorize @gol
+-ffinite-loop @gol
 -ftree-partial-pre @gol
 -ftree-slp-vectorize @gol
 -funswitch-loops @gol
@@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Assume that a loop with an exit will eventually take the exit and not loop
+indefinitely.  This allows the compiler to remove loops that otherwise have
+no side-effects, not considering eventual endless looping as such.
+
+This option is enabled by default at @option{-O3}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 97ae47b..369122f 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
   tree chunk_size = NULL_TREE;
   unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
   tree lhs = gimple_call_lhs (call);
-  tree type = TREE_TYPE (lhs);
+  tree type = NULL_TREE;
   tree diff_type = TREE_TYPE (range);
   tree r = NULL_TREE;
   gimple_seq seq = NULL;
@@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
   unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
   unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
 
+  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used. */
+  if (!lhs)
+    {
+      gsi_replace_with_seq (&gsi, seq, true);
+      return;
+    }
+
+  type = TREE_TYPE (lhs);
+ 
 #ifdef ACCEL_COMPILER
   chunk_size = gimple_call_arg (call, 4);
   if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..e374155
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }  
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..ffca49c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
+
diff --git a/gcc/toplev.c b/gcc/toplev.c
index d300ac2..1c82ac4 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1708,6 +1708,10 @@ process_options (void)
       flag_prefetch_loop_arrays = 0;
     }
 
+  /* Enable -ffinite-loop with -O3 optimization level. */
+  if (flag_finite_loop == -1)
+    flag_finite_loop = optimize >= 3;
+
   /* The presence of IEEE signaling NaNs, implies all math can trap.  */
   if (flag_signaling_nans)
     flag_trapping_math = 1;
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..179605e 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
 	    mark_stmt_necessary (stmt, true);
 	    return;
 	  }
+        /* IFN_GOACC_LOOP calls are necessary in that they are used to
+           represent parameter (i.e. step, bound) of a lowered OpenACC
+           partitioned loop. But this kind of partitioned loop might not
+           survive from aggressive loop removal for it has loop exit and
+           is assumed to be finite. Therefore, we need to explicitly mark
+           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
+        if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
+          {
+            mark_stmt_necessary (stmt, true);
+            return;
+          }
 	if (!gimple_call_lhs (stmt))
 	  return;
 	break;
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 470b6a2..c25cb1d 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
 		 loop->num);
       return true;
     }
+
+  if (flag_finite_loop)
+    {
+      unsigned i;
+      vec<edge> exits = get_loop_exit_edges (loop);
+      edge ex;
+
+      /* If the loop has any non-EH exit, we can assume it will terminate. */
+      FOR_EACH_VEC_ELT (exits, i, ex)
+	if (!(ex->flags & EDGE_EH))
+	  {
+	    exits.release ();
+	    if (dump_file)
+	      fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
+		       "and -ffinite-loop is on.\n", loop->num);
+	    return true;
+          }
+
+      exits.release ();
+    }
+
   return false;
 }
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..845268b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile }  */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+	;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+	;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */



    

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V4] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-06-04  6:49                               ` [PATCH V4] " Feng Xue OS
@ 2019-06-04  8:24                                 ` Marc Glisse
  2019-06-04 15:16                                   ` [PATCH V5] " Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Marc Glisse @ 2019-06-04  8:24 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: Richard Biener, Thomas Schwinge, gcc-patches, Jeff Law

On Tue, 4 Jun 2019, Feng Xue OS wrote:

>>  I think we should turn this option on by default, document that and note
>>  that some languages (C++) say loops terminate.
>
> To enable this option at -O2 is not very suitable, seems to be more aggressive. Better to turn it on at -O3.

Why wouldn't it be suitable for -O2? Normally, not suitable for -O2 could 
be because it is expensive (in compile time), because it increases the 
code size a lot, because it doesn't always actually improve the running 
time, etc. I don't see any of that here. There isn't supposed to be a 
semantic difference between -O2 and -O3. Do you consider it "dangerous" in 
a similar sense as -fstrict-aliasing? We enable that by default at -O2.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH V5] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-06-04  8:24                                 ` Marc Glisse
@ 2019-06-04 15:16                                   ` Feng Xue OS
  2019-06-04 15:24                                     ` [PATCH V6] " Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-06-04 15:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, Thomas Schwinge, Jeff Law

> Why wouldn't it be suitable for -O2? Normally, not suitable for -O2 could 
> be because it is expensive (in compile time), because it increases the 
> code size a lot, because it doesn't always actually improve the running 
> time, etc. I don't see any of that here. There isn't supposed to be a 
> semantic difference between -O2 and -O3. Do you consider it "dangerous" in 
> a similar sense as -fstrict-aliasing? We enable that by default at -O2.

Yes. I did have such concern. Now I changed that to enable the option at -O2.

Feng
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37aab79..4fdc5c8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
+
+	PR tree-optimization/89713
+	* doc/invoke.texi (-ffinite-loop): Document new option.
+	* common.opt (-ffinite-loop): New option.
+	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
+	IFN_GOACC_LOOP calls as necessary.
+	* tree-ssa-loop-niter.c (finite_loop): Assume loop with an exit is
+	finite.
+	* omp-offload.c (oacc_xform_loop): Skip lowering if return value of
+	IFN_GOACC_LOOP call is not used.
+	* opts.c (default_options_table): Enable -ffinite-loop at -O2+.
+
 2019-06-04  Alan Modra  <amodra@gmail.com>
 
 	PR target/90689
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e72fd0..f570815 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization
+Assume that loops with an exit will terminate and not loop indefinitely.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 91c9bb8..8d3259d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -8316,7 +8317,8 @@ Optimize yet more.  @option{-O3} turns on all optimizations specified
 by @option{-O2} and also turns on the following optimization flags:
 
 @c Please keep the following list alphabetized!
-@gccoptlist{-fgcse-after-reload @gol
+@gccoptlist{-ffinite-loop @gol
+-fgcse-after-reload @gol
 -finline-functions @gol
 -fipa-cp-clone
 -floop-interchange @gol
@@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Assume that a loop with an exit will eventually take the exit and not loop
+indefinitely.  This allows the compiler to remove loops that otherwise have
+no side-effects, not considering eventual endless looping as such.
+
+This option is enabled by default at @option{-O3}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 97ae47b..369122f 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
   tree chunk_size = NULL_TREE;
   unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
   tree lhs = gimple_call_lhs (call);
-  tree type = TREE_TYPE (lhs);
+  tree type = NULL_TREE;
   tree diff_type = TREE_TYPE (range);
   tree r = NULL_TREE;
   gimple_seq seq = NULL;
@@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
   unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
   unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
 
+  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used. */
+  if (!lhs)
+    {
+      gsi_replace_with_seq (&gsi, seq, true);
+      return;
+    }
+
+  type = TREE_TYPE (lhs);
+ 
 #ifdef ACCEL_COMPILER
   chunk_size = gimple_call_arg (call, 4);
   if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 64f94ac..0db9dda 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ffinite_loop, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..e374155
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }  
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+
diff --git a/gcc/testsuite/gcc.dg/const-1.c b/gcc/testsuite/gcc.dg/const-1.c
index a5b2b16..13e3451 100644
--- a/gcc/testsuite/gcc.dg/const-1.c
+++ b/gcc/testsuite/gcc.dg/const-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target nonpic } } */
-/* { dg-options "-O2 -Wsuggest-attribute=const" } */
+/* { dg-options "-O2 -Wsuggest-attribute=const -fno-finite-loop" } */
 
 extern int extern_const(int a) __attribute__ ((const));
 
diff --git a/gcc/testsuite/gcc.dg/graphite/graphite.exp b/gcc/testsuite/gcc.dg/graphite/graphite.exp
index ea61446..b294b9c 100644
--- a/gcc/testsuite/gcc.dg/graphite/graphite.exp
+++ b/gcc/testsuite/gcc.dg/graphite/graphite.exp
@@ -56,7 +56,7 @@ set vect_files        [lsort [glob -nocomplain $srcdir/$subdir/vect-*.c ] ]
 
 # Tests to be compiled.
 set dg-do-what-default compile
-dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all"
+dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all -fno-finite-loop"
 dg-runtest $id_files          "" "-O2 -fgraphite-identity -ffast-math -fdump-tree-graphite-details"
 
 # Tests to be run.
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
index f6fc41d..735eeef 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
@@ -1,6 +1,6 @@
 /* For PR rtl-optimization/27735  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fno-finite-loop" } */
 
 void set_color(void);
 void xml_colorize_line(unsigned int *p, int state)
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index 7e5ba08..3710eef 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loop" } */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/pure-2.c b/gcc/testsuite/gcc.dg/pure-2.c
index fe6e2bc..6ac372b 100644
--- a/gcc/testsuite/gcc.dg/pure-2.c
+++ b/gcc/testsuite/gcc.dg/pure-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
+/* { dg-options "-O2 -Wsuggest-attribute=pure -fno-finite-loop" } */
 /* { dg-add-options bind_pic_locally } */
 
 extern int extern_const(int a) __attribute__ ((pure));
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
index d289e5d..e4d331e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2" } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -fno-finite-loop" } */
 
 struct rtx_def;
 typedef struct rtx_def *rtx;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..ffca49c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
index a29c9fb..c605005 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-optimized -fno-finite-loop" } */
 /* { dg-require-effective-target int32plus } */
 
 int bar (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
index e9b4f26..7c47906 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w -fno-finite-loop" } */
 
 struct __sFILE
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index d829b04..b7a3d77 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loop" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..179605e 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
 	    mark_stmt_necessary (stmt, true);
 	    return;
 	  }
+        /* IFN_GOACC_LOOP calls are necessary in that they are used to
+           represent parameter (i.e. step, bound) of a lowered OpenACC
+           partitioned loop. But this kind of partitioned loop might not
+           survive from aggressive loop removal for it has loop exit and
+           is assumed to be finite. Therefore, we need to explicitly mark
+           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
+        if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
+          {
+            mark_stmt_necessary (stmt, true);
+            return;
+          }
 	if (!gimple_call_lhs (stmt))
 	  return;
 	break;
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 470b6a2..c25cb1d 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
 		 loop->num);
       return true;
     }
+
+  if (flag_finite_loop)
+    {
+      unsigned i;
+      vec<edge> exits = get_loop_exit_edges (loop);
+      edge ex;
+
+      /* If the loop has any non-EH exit, we can assume it will terminate. */
+      FOR_EACH_VEC_ELT (exits, i, ex)
+	if (!(ex->flags & EDGE_EH))
+	  {
+	    exits.release ();
+	    if (dump_file)
+	      fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
+		       "and -ffinite-loop is on.\n", loop->num);
+	    return true;
+          }
+
+      exits.release ();
+    }
+
   return false;
 }
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..845268b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile }  */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+	;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+	;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH V6] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-06-04 15:16                                   ` [PATCH V5] " Feng Xue OS
@ 2019-06-04 15:24                                     ` Feng Xue OS
  2019-06-05 11:05                                       ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-06-04 15:24 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, Thomas Schwinge, Jeff Law

Some changes on documentation.

Feng

----
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37aab79..4fdc5c8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
+
+	PR tree-optimization/89713
+	* doc/invoke.texi (-ffinite-loop): Document new option.
+	* common.opt (-ffinite-loop): New option.
+	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
+	IFN_GOACC_LOOP calls as necessary.
+	* tree-ssa-loop-niter.c (finite_loop): Assume loop with an exit is
+	finite.
+	* omp-offload.c (oacc_xform_loop): Skip lowering if return value of
+	IFN_GOACC_LOOP call is not used.
+	* opts.c (default_options_table): Enable -ffinite-loop at -O2+.
+
 2019-06-04  Alan Modra  <amodra@gmail.com>
 
 	PR target/90689
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e72fd0..f570815 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loop
+Common Report Var(flag_finite_loop) Optimization
+Assume that loops with an exit will terminate and not loop indefinitely.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 91c9bb8..2cb0b9a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loop @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -8282,6 +8283,7 @@ also turns on the following optimization flags:
 -fdelete-null-pointer-checks @gol
 -fdevirtualize  -fdevirtualize-speculatively @gol
 -fexpensive-optimizations @gol
+-ffinite-loop @gol 
 -fgcse  -fgcse-lm  @gol
 -fhoist-adjacent-loads @gol
 -finline-small-functions @gol
@@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loop
+@opindex ffinite-loop
+@opindex fno-finite-loop
+Assume that a loop with an exit will eventually take the exit and not loop
+indefinitely.  This allows the compiler to remove loops that otherwise have
+no side-effects, not considering eventual endless looping as such.
+
+This option is enabled by default at @option{-O2}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 97ae47b..369122f 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
   tree chunk_size = NULL_TREE;
   unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
   tree lhs = gimple_call_lhs (call);
-  tree type = TREE_TYPE (lhs);
+  tree type = NULL_TREE;
   tree diff_type = TREE_TYPE (range);
   tree r = NULL_TREE;
   gimple_seq seq = NULL;
@@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
   unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
   unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
 
+  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used. */
+  if (!lhs)
+    {
+      gsi_replace_with_seq (&gsi, seq, true);
+      return;
+    }
+
+  type = TREE_TYPE (lhs);
+ 
 #ifdef ACCEL_COMPILER
   chunk_size = gimple_call_arg (call, 4);
   if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 64f94ac..0db9dda 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ffinite_loop, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..e374155
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }  
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+
diff --git a/gcc/testsuite/gcc.dg/const-1.c b/gcc/testsuite/gcc.dg/const-1.c
index a5b2b16..13e3451 100644
--- a/gcc/testsuite/gcc.dg/const-1.c
+++ b/gcc/testsuite/gcc.dg/const-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target nonpic } } */
-/* { dg-options "-O2 -Wsuggest-attribute=const" } */
+/* { dg-options "-O2 -Wsuggest-attribute=const -fno-finite-loop" } */
 
 extern int extern_const(int a) __attribute__ ((const));
 
diff --git a/gcc/testsuite/gcc.dg/graphite/graphite.exp b/gcc/testsuite/gcc.dg/graphite/graphite.exp
index ea61446..b294b9c 100644
--- a/gcc/testsuite/gcc.dg/graphite/graphite.exp
+++ b/gcc/testsuite/gcc.dg/graphite/graphite.exp
@@ -56,7 +56,7 @@ set vect_files        [lsort [glob -nocomplain $srcdir/$subdir/vect-*.c ] ]
 
 # Tests to be compiled.
 set dg-do-what-default compile
-dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all"
+dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all -fno-finite-loop"
 dg-runtest $id_files          "" "-O2 -fgraphite-identity -ffast-math -fdump-tree-graphite-details"
 
 # Tests to be run.
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
index f6fc41d..735eeef 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
@@ -1,6 +1,6 @@
 /* For PR rtl-optimization/27735  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fno-finite-loop" } */
 
 void set_color(void);
 void xml_colorize_line(unsigned int *p, int state)
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index 7e5ba08..3710eef 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loop" } */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/pure-2.c b/gcc/testsuite/gcc.dg/pure-2.c
index fe6e2bc..6ac372b 100644
--- a/gcc/testsuite/gcc.dg/pure-2.c
+++ b/gcc/testsuite/gcc.dg/pure-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
+/* { dg-options "-O2 -Wsuggest-attribute=pure -fno-finite-loop" } */
 /* { dg-add-options bind_pic_locally } */
 
 extern int extern_const(int a) __attribute__ ((pure));
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
index d289e5d..e4d331e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2" } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -fno-finite-loop" } */
 
 struct rtx_def;
 typedef struct rtx_def *rtx;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..ffca49c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
index a29c9fb..c605005 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-optimized -fno-finite-loop" } */
 /* { dg-require-effective-target int32plus } */
 
 int bar (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
index e9b4f26..7c47906 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w -fno-finite-loop" } */
 
 struct __sFILE
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index d829b04..b7a3d77 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loop" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..179605e 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
 	    mark_stmt_necessary (stmt, true);
 	    return;
 	  }
+        /* IFN_GOACC_LOOP calls are necessary in that they are used to
+           represent parameter (i.e. step, bound) of a lowered OpenACC
+           partitioned loop. But this kind of partitioned loop might not
+           survive from aggressive loop removal for it has loop exit and
+           is assumed to be finite. Therefore, we need to explicitly mark
+           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
+        if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
+          {
+            mark_stmt_necessary (stmt, true);
+            return;
+          }
 	if (!gimple_call_lhs (stmt))
 	  return;
 	break;
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 470b6a2..c25cb1d 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
 		 loop->num);
       return true;
     }
+
+  if (flag_finite_loop)
+    {
+      unsigned i;
+      vec<edge> exits = get_loop_exit_edges (loop);
+      edge ex;
+
+      /* If the loop has any non-EH exit, we can assume it will terminate. */
+      FOR_EACH_VEC_ELT (exits, i, ex)
+	if (!(ex->flags & EDGE_EH))
+	  {
+	    exits.release ();
+	    if (dump_file)
+	      fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
+		       "and -ffinite-loop is on.\n", loop->num);
+	    return true;
+          }
+
+      exits.release ();
+    }
+
   return false;
 }
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..845268b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile }  */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+	;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+	;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V6] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-06-04 15:24                                     ` [PATCH V6] " Feng Xue OS
@ 2019-06-05 11:05                                       ` Richard Biener
  2019-06-06 10:00                                         ` [PATCH V7] " Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-06-05 11:05 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: gcc-patches, Thomas Schwinge, Jeff Law

On Tue, Jun 4, 2019 at 5:24 PM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> Some changes on documentation.

Please name the option -ffinite-loops (plural), the patch is OK with this
change if you also adjust

> +      /* If the loop has any non-EH exit, we can assume it will terminate. */
> +      FOR_EACH_VEC_ELT (exits, i, ex)
> +       if (!(ex->flags & EDGE_EH))
> +         {

to look for "normal" edges only - in addition to EDGE_EH you
want to ignore EDGE_FAKE and EDGE_ABNORMAL.  EDGE_FAKE
are inserted by connect_infiinite_loops_to_exit and EDGE_ABNORMAL
can appear with setjmp/longjmp and friends.

I think this change also warrants mentioning in gcc-10/changes.html.

Thanks,
Richard.

> Feng
>
> ----
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 37aab79..4fdc5c8 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,16 @@
> +2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
> +
> +       PR tree-optimization/89713
> +       * doc/invoke.texi (-ffinite-loop): Document new option.
> +       * common.opt (-ffinite-loop): New option.
> +       * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
> +       IFN_GOACC_LOOP calls as necessary.
> +       * tree-ssa-loop-niter.c (finite_loop): Assume loop with an exit is
> +       finite.
> +       * omp-offload.c (oacc_xform_loop): Skip lowering if return value of
> +       IFN_GOACC_LOOP call is not used.
> +       * opts.c (default_options_table): Enable -ffinite-loop at -O2+.
> +
>  2019-06-04  Alan Modra  <amodra@gmail.com>
>
>         PR target/90689
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 0e72fd0..f570815 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1437,6 +1437,10 @@ ffinite-math-only
>  Common Report Var(flag_finite_math_only) Optimization SetByCombined
>  Assume no NaNs or infinities are generated.
>
> +ffinite-loop
> +Common Report Var(flag_finite_loop) Optimization
> +Assume that loops with an exit will terminate and not loop indefinitely.
> +
>  ffixed-
>  Common Joined RejectNegative Var(common_deferred_options) Defer
>  -ffixed-<register>     Mark <register> as being unavailable to the compiler.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 91c9bb8..2cb0b9a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fdevirtualize-at-ltrans  -fdse @gol
>  -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
>  -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
> +-ffinite-loop @gol
>  -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
>  -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
>  -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
> @@ -8282,6 +8283,7 @@ also turns on the following optimization flags:
>  -fdelete-null-pointer-checks @gol
>  -fdevirtualize  -fdevirtualize-speculatively @gol
>  -fexpensive-optimizations @gol
> +-ffinite-loop @gol
>  -fgcse  -fgcse-lm  @gol
>  -fhoist-adjacent-loads @gol
>  -finline-small-functions @gol
> @@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
>  enabled by default at @option{-O2} and higher if @option{-Os} is not also
>  specified.
>
> +@item -ffinite-loop
> +@opindex ffinite-loop
> +@opindex fno-finite-loop
> +Assume that a loop with an exit will eventually take the exit and not loop
> +indefinitely.  This allows the compiler to remove loops that otherwise have
> +no side-effects, not considering eventual endless looping as such.
> +
> +This option is enabled by default at @option{-O2}.
> +
>  @item -ftree-dominator-opts
>  @opindex ftree-dominator-opts
>  Perform a variety of simple scalar cleanups (constant/copy
> diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
> index 97ae47b..369122f 100644
> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
>    tree chunk_size = NULL_TREE;
>    unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
>    tree lhs = gimple_call_lhs (call);
> -  tree type = TREE_TYPE (lhs);
> +  tree type = NULL_TREE;
>    tree diff_type = TREE_TYPE (range);
>    tree r = NULL_TREE;
>    gimple_seq seq = NULL;
> @@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
>    unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
>    unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
>
> +  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used. */
> +  if (!lhs)
> +    {
> +      gsi_replace_with_seq (&gsi, seq, true);
> +      return;
> +    }
> +
> +  type = TREE_TYPE (lhs);
> +
>  #ifdef ACCEL_COMPILER
>    chunk_size = gimple_call_arg (call, 4);
>    if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 64f94ac..0db9dda 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS, OPT_ffinite_loop, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> new file mode 100644
> index 0000000..e374155
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
> +
> +#include <string>
> +#include <vector>
> +#include <list>
> +#include <set>
> +#include <map>
> +
> +using namespace std;
> +
> +int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
> +{
> +  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
> +    it->length();
> +
> +  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
> +    it->length();
> +
> +  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
> +    it->first + it->second.length();
> +
> +  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
> +    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
> +      {
> +        it0->length();
> +        it1->length();
> +      }
> +
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
> +
> diff --git a/gcc/testsuite/gcc.dg/const-1.c b/gcc/testsuite/gcc.dg/const-1.c
> index a5b2b16..13e3451 100644
> --- a/gcc/testsuite/gcc.dg/const-1.c
> +++ b/gcc/testsuite/gcc.dg/const-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target nonpic } } */
> -/* { dg-options "-O2 -Wsuggest-attribute=const" } */
> +/* { dg-options "-O2 -Wsuggest-attribute=const -fno-finite-loop" } */
>
>  extern int extern_const(int a) __attribute__ ((const));
>
> diff --git a/gcc/testsuite/gcc.dg/graphite/graphite.exp b/gcc/testsuite/gcc.dg/graphite/graphite.exp
> index ea61446..b294b9c 100644
> --- a/gcc/testsuite/gcc.dg/graphite/graphite.exp
> +++ b/gcc/testsuite/gcc.dg/graphite/graphite.exp
> @@ -56,7 +56,7 @@ set vect_files        [lsort [glob -nocomplain $srcdir/$subdir/vect-*.c ] ]
>
>  # Tests to be compiled.
>  set dg-do-what-default compile
> -dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all"
> +dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all -fno-finite-loop"
>  dg-runtest $id_files          "" "-O2 -fgraphite-identity -ffast-math -fdump-tree-graphite-details"
>
>  # Tests to be run.
> diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
> index f6fc41d..735eeef 100644
> --- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c
> +++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
> @@ -1,6 +1,6 @@
>  /* For PR rtl-optimization/27735  */
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
> +/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fno-finite-loop" } */
>
>  void set_color(void);
>  void xml_colorize_line(unsigned int *p, int state)
> diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
> index 7e5ba08..3710eef 100644
> --- a/gcc/testsuite/gcc.dg/predict-9.c
> +++ b/gcc/testsuite/gcc.dg/predict-9.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate" } */
> +/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loop" } */
>
>  extern int global;
>  extern int global2;
> diff --git a/gcc/testsuite/gcc.dg/pure-2.c b/gcc/testsuite/gcc.dg/pure-2.c
> index fe6e2bc..6ac372b 100644
> --- a/gcc/testsuite/gcc.dg/pure-2.c
> +++ b/gcc/testsuite/gcc.dg/pure-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
> +/* { dg-options "-O2 -Wsuggest-attribute=pure -fno-finite-loop" } */
>  /* { dg-add-options bind_pic_locally } */
>
>  extern int extern_const(int a) __attribute__ ((pure));
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
> index d289e5d..e4d331e 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-cddce2" } */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -fno-finite-loop" } */
>
>  struct rtx_def;
>  typedef struct rtx_def *rtx;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> new file mode 100644
> index 0000000..ffca49c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
> +
> +typedef struct list {
> +    char pad[15];
> +    struct list *next;
> +} list;
> +
> +int data;
> +
> +list *head, *tail;
> +
> +int __attribute__((pure)) pfn (int);
> +
> +int foo (unsigned u, int s)
> +{
> +  unsigned i;
> +  list *p;
> +  int j;
> +
> +  for (i = 0; i < u; i += 2)
> +    ;
> +
> +  for (p = head; p; p = p->next)
> +    ;
> +
> +  for (j = data; j & s; j = pfn (j + 3))
> +    ;
> +
> +  for (p = head; p != tail; p = p->next)
> +    for (j = data + 1; j > s; j = pfn (j + 2))
> +      ;
> +
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
> index a29c9fb..c605005 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-options "-O2 -fdump-tree-optimized -fno-finite-loop" } */
>  /* { dg-require-effective-target int32plus } */
>
>  int bar (void);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
> index e9b4f26..7c47906 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
> +/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w -fno-finite-loop" } */
>
>  struct __sFILE
>  {
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
> index d829b04..b7a3d77 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */
> +/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loop" } */
>  /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
>  /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
>  /* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
> diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
> index 2478219..179605e 100644
> --- a/gcc/tree-ssa-dce.c
> +++ b/gcc/tree-ssa-dce.c
> @@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
>             mark_stmt_necessary (stmt, true);
>             return;
>           }
> +        /* IFN_GOACC_LOOP calls are necessary in that they are used to
> +           represent parameter (i.e. step, bound) of a lowered OpenACC
> +           partitioned loop. But this kind of partitioned loop might not
> +           survive from aggressive loop removal for it has loop exit and
> +           is assumed to be finite. Therefore, we need to explicitly mark
> +           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
> +        if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
> +          {
> +            mark_stmt_necessary (stmt, true);
> +            return;
> +          }
>         if (!gimple_call_lhs (stmt))
>           return;
>         break;
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 470b6a2..c25cb1d 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
>                  loop->num);
>        return true;
>      }
> +
> +  if (flag_finite_loop)
> +    {
> +      unsigned i;
> +      vec<edge> exits = get_loop_exit_edges (loop);
> +      edge ex;
> +
> +      /* If the loop has any non-EH exit, we can assume it will terminate. */
> +      FOR_EACH_VEC_ELT (exits, i, ex)
> +       if (!(ex->flags & EDGE_EH))
> +         {
> +           exits.release ();
> +           if (dump_file)
> +             fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
> +                      "and -ffinite-loop is on.\n", loop->num);
> +           return true;
> +          }
> +
> +      exits.release ();
> +    }
> +
>    return false;
>  }
>
> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
> new file mode 100644
> index 0000000..845268b
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
> @@ -0,0 +1,31 @@
> +/* { dg-do compile }  */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
> +
> +int
> +f1 (void)
> +{
> +  int i, j;
> +
> +#pragma acc parallel loop tile(2,3)
> +  for (i = 1; i < 10; i++)
> +    for (j = 1; j < 10; j++)
> +      for (;;)
> +       ;
> +
> +  return i + j;
> +}
> +
> +int
> +f2 (void)
> +{
> +  int i, j, k;
> +
> +#pragma acc parallel loop tile(2,3)
> +  for (i = 1; i < 10; i++)
> +    for (j = 1; j < 10; j++)
> +      for (k = 1; k < 10; k++)
> +       ;
> +
> +  return i + j;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH V7] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-06-05 11:05                                       ` Richard Biener
@ 2019-06-06 10:00                                         ` Feng Xue OS
  2019-06-11  2:40                                           ` [PATCH V8] " Feng Xue OS
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-06-06 10:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Thomas Schwinge, Jeff Law

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37aab79..87cc125 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
+
+	PR tree-optimization/89713
+	* doc/invoke.texi (-ffinite-loops): Document new option.
+	* common.opt (-ffinite-loops): New option.
+	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
+	IFN_GOACC_LOOP calls as necessary.
+	* tree-ssa-loop-niter.c (finite_loop_p): Assume loop with an exit
+	is finite.
+	* omp-offload.c (oacc_xform_loop): Skip lowering if return value of
+	IFN_GOACC_LOOP call is not used.
+	* opts.c (default_options_table): Enable -ffinite-loops at -O2+.
+
 2019-06-04  Alan Modra  <amodra@gmail.com>
 
 	PR target/90689
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e72fd0..8b0e6ad 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loops
+Common Report Var(flag_finite_loops) Optimization
+Assume that loops with an exit will terminate and not loop indefinitely.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 91c9bb8..0fe4c52 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loops @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -8282,6 +8283,7 @@ also turns on the following optimization flags:
 -fdelete-null-pointer-checks @gol
 -fdevirtualize  -fdevirtualize-speculatively @gol
 -fexpensive-optimizations @gol
+-ffinite-loops @gol 
 -fgcse  -fgcse-lm  @gol
 -fhoist-adjacent-loads @gol
 -finline-small-functions @gol
@@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loops
+@opindex ffinite-loops
+@opindex fno-finite-loops
+Assume that a loop with an exit will eventually take the exit and not loop
+indefinitely.  This allows the compiler to remove loops that otherwise have
+no side-effects, not considering eventual endless looping as such.
+
+This option is enabled by default at @option{-O2}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 97ae47b..369122f 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
   tree chunk_size = NULL_TREE;
   unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
   tree lhs = gimple_call_lhs (call);
-  tree type = TREE_TYPE (lhs);
+  tree type = NULL_TREE;
   tree diff_type = TREE_TYPE (range);
   tree r = NULL_TREE;
   gimple_seq seq = NULL;
@@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
   unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
   unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
 
+  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used. */
+  if (!lhs)
+    {
+      gsi_replace_with_seq (&gsi, seq, true);
+      return;
+    }
+
+  type = TREE_TYPE (lhs);
+ 
 #ifdef ACCEL_COMPILER
   chunk_size = gimple_call_arg (call, 4);
   if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 64f94ac..b38bfb1 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ffinite_loops, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..6b1e879
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }  
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+
diff --git a/gcc/testsuite/gcc.dg/const-1.c b/gcc/testsuite/gcc.dg/const-1.c
index a5b2b16..2e95bd8 100644
--- a/gcc/testsuite/gcc.dg/const-1.c
+++ b/gcc/testsuite/gcc.dg/const-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target nonpic } } */
-/* { dg-options "-O2 -Wsuggest-attribute=const" } */
+/* { dg-options "-O2 -Wsuggest-attribute=const -fno-finite-loops" } */
 
 extern int extern_const(int a) __attribute__ ((const));
 
diff --git a/gcc/testsuite/gcc.dg/graphite/graphite.exp b/gcc/testsuite/gcc.dg/graphite/graphite.exp
index ea61446..523a955 100644
--- a/gcc/testsuite/gcc.dg/graphite/graphite.exp
+++ b/gcc/testsuite/gcc.dg/graphite/graphite.exp
@@ -56,7 +56,7 @@ set vect_files        [lsort [glob -nocomplain $srcdir/$subdir/vect-*.c ] ]
 
 # Tests to be compiled.
 set dg-do-what-default compile
-dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all"
+dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all -fno-finite-loops"
 dg-runtest $id_files          "" "-O2 -fgraphite-identity -ffast-math -fdump-tree-graphite-details"
 
 # Tests to be run.
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
index f6fc41d..de2fb2c 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
@@ -1,6 +1,6 @@
 /* For PR rtl-optimization/27735  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fno-finite-loops" } */
 
 void set_color(void);
 void xml_colorize_line(unsigned int *p, int state)
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index 7e5ba08..f491c51 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops" } */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/pure-2.c b/gcc/testsuite/gcc.dg/pure-2.c
index fe6e2bc..318cfd1 100644
--- a/gcc/testsuite/gcc.dg/pure-2.c
+++ b/gcc/testsuite/gcc.dg/pure-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
+/* { dg-options "-O2 -Wsuggest-attribute=pure -fno-finite-loops" } */
 /* { dg-add-options bind_pic_locally } */
 
 extern int extern_const(int a) __attribute__ ((pure));
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
index d289e5d..a9bdf26 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2" } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -fno-finite-loops" } */
 
 struct rtx_def;
 typedef struct rtx_def *rtx;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..18c1ddb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loops" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
index a29c9fb..3d05ad2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-optimized -fno-finite-loops" } */
 /* { dg-require-effective-target int32plus } */
 
 int bar (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
index e9b4f26..187c084 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w -fno-finite-loops" } */
 
 struct __sFILE
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index d829b04..6752676 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loops" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..179605e 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
 	    mark_stmt_necessary (stmt, true);
 	    return;
 	  }
+        /* IFN_GOACC_LOOP calls are necessary in that they are used to
+           represent parameter (i.e. step, bound) of a lowered OpenACC
+           partitioned loop. But this kind of partitioned loop might not
+           survive from aggressive loop removal for it has loop exit and
+           is assumed to be finite. Therefore, we need to explicitly mark
+           these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
+        if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
+          {
+            mark_stmt_necessary (stmt, true);
+            return;
+          }
 	if (!gimple_call_lhs (stmt))
 	  return;
 	break;
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 470b6a2..25c2a59 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
 		 loop->num);
       return true;
     }
+
+  if (flag_finite_loops)
+    {
+      unsigned i;
+      vec<edge> exits = get_loop_exit_edges (loop);
+      edge ex;
+
+      /* If the loop has a normal exit, we can assume it will terminate. */
+      FOR_EACH_VEC_ELT (exits, i, ex)
+	if (!(ex->flags & (EDGE_EH | EDGE_ABNORMAL | EDGE_FAKE)))
+	  {
+	    exits.release ();
+	    if (dump_file)
+	      fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
+		       "and -ffinite-loop is on.\n", loop->num);
+	    return true;
+          }
+
+      exits.release ();
+    }
+
   return false;
 }
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..44767cd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile }  */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+	;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+	;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH V8] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-06-06 10:00                                         ` [PATCH V7] " Feng Xue OS
@ 2019-06-11  2:40                                           ` Feng Xue OS
  2019-06-12  9:43                                             ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Feng Xue OS @ 2019-06-11  2:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Thomas Schwinge, Jeff Law

Reformat to comply with gcc coding style.

Feng

---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37aab79..87cc125 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
+
+	PR tree-optimization/89713
+	* doc/invoke.texi (-ffinite-loops): Document new option.
+	* common.opt (-ffinite-loops): New option.
+	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
+	IFN_GOACC_LOOP calls as necessary.
+	* tree-ssa-loop-niter.c (finite_loop_p): Assume loop with an exit
+	is finite.
+	* omp-offload.c (oacc_xform_loop): Skip lowering if return value of
+	IFN_GOACC_LOOP call is not used.
+	* opts.c (default_options_table): Enable -ffinite-loops at -O2+.
+
 2019-06-04  Alan Modra  <amodra@gmail.com>
 
 	PR target/90689
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e72fd0..8b0e6ad 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loops
+Common Report Var(flag_finite_loops) Optimization
+Assume that loops with an exit will terminate and not loop indefinitely.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed-<register>	Mark <register> as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 91c9bb8..1e12595 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
+-ffinite-loops @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -8282,6 +8283,7 @@ also turns on the following optimization flags:
 -fdelete-null-pointer-checks @gol
 -fdevirtualize  -fdevirtualize-speculatively @gol
 -fexpensive-optimizations @gol
+-ffinite-loops @gol
 -fgcse  -fgcse-lm  @gol
 -fhoist-adjacent-loads @gol
 -finline-small-functions @gol
@@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loops
+@opindex ffinite-loops
+@opindex fno-finite-loops
+Assume that a loop with an exit will eventually take the exit and not loop
+indefinitely.  This allows the compiler to remove loops that otherwise have
+no side-effects, not considering eventual endless looping as such.
+
+This option is enabled by default at @option{-O2}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 97ae47b..c8a281c 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
   tree chunk_size = NULL_TREE;
   unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
   tree lhs = gimple_call_lhs (call);
-  tree type = TREE_TYPE (lhs);
+  tree type = NULL_TREE;
   tree diff_type = TREE_TYPE (range);
   tree r = NULL_TREE;
   gimple_seq seq = NULL;
@@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
   unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
   unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
 
+  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used.  */
+  if (!lhs)
+    {
+      gsi_replace_with_seq (&gsi, seq, true);
+      return;
+    }
+
+  type = TREE_TYPE (lhs);
+
 #ifdef ACCEL_COMPILER
   chunk_size = gimple_call_arg (call, 4);
   if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 64f94ac..b38bfb1 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ffinite_loops, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 0000000..6b1e879
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
+
+#include <string>
+#include <vector>
+#include <list>
+#include <set>
+#include <map>
+
+using namespace std;
+
+int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
+{
+  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
+    it->length();
+
+  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
+    it->length();
+
+  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
+    it->first + it->second.length();
+
+  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
+    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
+      {
+        it0->length();
+        it1->length();
+      }  
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+
diff --git a/gcc/testsuite/gcc.dg/const-1.c b/gcc/testsuite/gcc.dg/const-1.c
index a5b2b16..2e95bd8 100644
--- a/gcc/testsuite/gcc.dg/const-1.c
+++ b/gcc/testsuite/gcc.dg/const-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target nonpic } } */
-/* { dg-options "-O2 -Wsuggest-attribute=const" } */
+/* { dg-options "-O2 -Wsuggest-attribute=const -fno-finite-loops" } */
 
 extern int extern_const(int a) __attribute__ ((const));
 
diff --git a/gcc/testsuite/gcc.dg/graphite/graphite.exp b/gcc/testsuite/gcc.dg/graphite/graphite.exp
index ea61446..523a955 100644
--- a/gcc/testsuite/gcc.dg/graphite/graphite.exp
+++ b/gcc/testsuite/gcc.dg/graphite/graphite.exp
@@ -56,7 +56,7 @@ set vect_files        [lsort [glob -nocomplain $srcdir/$subdir/vect-*.c ] ]
 
 # Tests to be compiled.
 set dg-do-what-default compile
-dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all"
+dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all -fno-finite-loops"
 dg-runtest $id_files          "" "-O2 -fgraphite-identity -ffast-math -fdump-tree-graphite-details"
 
 # Tests to be run.
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
index f6fc41d..de2fb2c 100644
--- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
@@ -1,6 +1,6 @@
 /* For PR rtl-optimization/27735  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fno-finite-loops" } */
 
 void set_color(void);
 void xml_colorize_line(unsigned int *p, int state)
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index 7e5ba08..f491c51 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops" } */
 
 extern int global;
 extern int global2;
diff --git a/gcc/testsuite/gcc.dg/pure-2.c b/gcc/testsuite/gcc.dg/pure-2.c
index fe6e2bc..318cfd1 100644
--- a/gcc/testsuite/gcc.dg/pure-2.c
+++ b/gcc/testsuite/gcc.dg/pure-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
+/* { dg-options "-O2 -Wsuggest-attribute=pure -fno-finite-loops" } */
 /* { dg-add-options bind_pic_locally } */
 
 extern int extern_const(int a) __attribute__ ((pure));
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
index d289e5d..a9bdf26 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2" } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -fno-finite-loops" } */
 
 struct rtx_def;
 typedef struct rtx_def *rtx;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
new file mode 100644
index 0000000..18c1ddb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loops" } */
+
+typedef struct list {
+    char pad[15];
+    struct list *next;
+} list;
+
+int data;
+
+list *head, *tail;
+
+int __attribute__((pure)) pfn (int);
+
+int foo (unsigned u, int s)
+{
+  unsigned i;
+  list *p;
+  int j;
+
+  for (i = 0; i < u; i += 2)
+    ;
+
+  for (p = head; p; p = p->next)
+    ;
+
+  for (j = data; j & s; j = pfn (j + 3))
+    ;
+
+  for (p = head; p != tail; p = p->next)
+    for (j = data + 1; j > s; j = pfn (j + 2))
+      ;
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
index a29c9fb..3d05ad2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fdump-tree-optimized -fno-finite-loops" } */
 /* { dg-require-effective-target int32plus } */
 
 int bar (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
index e9b4f26..187c084 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w -fno-finite-loops" } */
 
 struct __sFILE
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
index d829b04..6752676 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */
+/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loops" } */
 /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
 /* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 2478219..a38899e 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
 	    mark_stmt_necessary (stmt, true);
 	    return;
 	  }
+	/* IFN_GOACC_LOOP calls are necessary in that they are used to
+	   represent parameter (i.e. step, bound) of a lowered OpenACC
+	   partitioned loop.  But this kind of partitioned loop might not
+	   survive from aggressive loop removal for it has loop exit and
+	   is assumed to be finite.  Therefore, we need to explicitly mark
+	   these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
+	if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
+	  {
+	    mark_stmt_necessary (stmt, true);
+	    return;
+	  }
 	if (!gimple_call_lhs (stmt))
 	  return;
 	break;
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 470b6a2..9b9cb41 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
 		 loop->num);
       return true;
     }
+
+  if (flag_finite_loops)
+    {
+      unsigned i;
+      vec<edge> exits = get_loop_exit_edges (loop);
+      edge ex;
+
+      /* If the loop has a normal exit, we can assume it will terminate.  */
+      FOR_EACH_VEC_ELT (exits, i, ex)
+	if (!(ex->flags & (EDGE_EH | EDGE_ABNORMAL | EDGE_FAKE)))
+	  {
+	    exits.release ();
+	    if (dump_file)
+	      fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
+		       "and -ffinite-loops is on.\n", loop->num);
+	    return true;
+	  }
+
+      exits.release ();
+    }
+
   return false;
 }
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
new file mode 100644
index 0000000..44767cd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile }  */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
+
+int
+f1 (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+	;
+
+  return i + j;
+}
+
+int
+f2 (void)
+{
+  int i, j, k;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (k = 1; k < 10; k++)
+	;
+
+  return i + j;
+}
+/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V8] Remove empty loop with assumed finiteness (PR tree-optimization/89713)
  2019-06-11  2:40                                           ` [PATCH V8] " Feng Xue OS
@ 2019-06-12  9:43                                             ` Richard Biener
  2019-06-15 12:05                                               ` [committed][nvptx, libgomp] Update pr85381-{2,4}.c test-cases Tom de Vries
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2019-06-12  9:43 UTC (permalink / raw)
  To: Feng Xue OS; +Cc: gcc-patches, Thomas Schwinge, Jeff Law

On Tue, Jun 11, 2019 at 4:40 AM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> Reformat to comply with gcc coding style.

OK for trunk.

Thanks,
Richard.

> Feng
>
> ---
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 37aab79..87cc125 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,16 @@
> +2019-06-04  Feng Xue  <fxue@os.amperecomputing.com>
> +
> +       PR tree-optimization/89713
> +       * doc/invoke.texi (-ffinite-loops): Document new option.
> +       * common.opt (-ffinite-loops): New option.
> +       * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
> +       IFN_GOACC_LOOP calls as necessary.
> +       * tree-ssa-loop-niter.c (finite_loop_p): Assume loop with an exit
> +       is finite.
> +       * omp-offload.c (oacc_xform_loop): Skip lowering if return value of
> +       IFN_GOACC_LOOP call is not used.
> +       * opts.c (default_options_table): Enable -ffinite-loops at -O2+.
> +
>  2019-06-04  Alan Modra  <amodra@gmail.com>
>
>         PR target/90689
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 0e72fd0..8b0e6ad 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1437,6 +1437,10 @@ ffinite-math-only
>  Common Report Var(flag_finite_math_only) Optimization SetByCombined
>  Assume no NaNs or infinities are generated.
>
> +ffinite-loops
> +Common Report Var(flag_finite_loops) Optimization
> +Assume that loops with an exit will terminate and not loop indefinitely.
> +
>  ffixed-
>  Common Joined RejectNegative Var(common_deferred_options) Defer
>  -ffixed-<register>     Mark <register> as being unavailable to the compiler.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 91c9bb8..1e12595 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fdevirtualize-at-ltrans  -fdse @gol
>  -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
>  -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} @gol
> +-ffinite-loops @gol
>  -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
>  -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
>  -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
> @@ -8282,6 +8283,7 @@ also turns on the following optimization flags:
>  -fdelete-null-pointer-checks @gol
>  -fdevirtualize  -fdevirtualize-speculatively @gol
>  -fexpensive-optimizations @gol
> +-ffinite-loops @gol
>  -fgcse  -fgcse-lm  @gol
>  -fhoist-adjacent-loads @gol
>  -finline-small-functions @gol
> @@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side effects.  This flag is
>  enabled by default at @option{-O2} and higher if @option{-Os} is not also
>  specified.
>
> +@item -ffinite-loops
> +@opindex ffinite-loops
> +@opindex fno-finite-loops
> +Assume that a loop with an exit will eventually take the exit and not loop
> +indefinitely.  This allows the compiler to remove loops that otherwise have
> +no side-effects, not considering eventual endless looping as such.
> +
> +This option is enabled by default at @option{-O2}.
> +
>  @item -ftree-dominator-opts
>  @opindex ftree-dominator-opts
>  Perform a variety of simple scalar cleanups (constant/copy
> diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
> index 97ae47b..c8a281c 100644
> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
>    tree chunk_size = NULL_TREE;
>    unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
>    tree lhs = gimple_call_lhs (call);
> -  tree type = TREE_TYPE (lhs);
> +  tree type = NULL_TREE;
>    tree diff_type = TREE_TYPE (range);
>    tree r = NULL_TREE;
>    gimple_seq seq = NULL;
> @@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
>    unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
>    unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
>
> +  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used.  */
> +  if (!lhs)
> +    {
> +      gsi_replace_with_seq (&gsi, seq, true);
> +      return;
> +    }
> +
> +  type = TREE_TYPE (lhs);
> +
>  #ifdef ACCEL_COMPILER
>    chunk_size = gimple_call_arg (call, 4);
>    if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 64f94ac..b38bfb1 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -494,6 +494,7 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS, OPT_ffinite_loops, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> new file mode 100644
> index 0000000..6b1e879
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
> +
> +#include <string>
> +#include <vector>
> +#include <list>
> +#include <set>
> +#include <map>
> +
> +using namespace std;
> +
> +int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
> +{
> +  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
> +    it->length();
> +
> +  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
> +    it->length();
> +
> +  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
> +    it->first + it->second.length();
> +
> +  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
> +    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
> +      {
> +        it0->length();
> +        it1->length();
> +      }
> +
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
> +
> diff --git a/gcc/testsuite/gcc.dg/const-1.c b/gcc/testsuite/gcc.dg/const-1.c
> index a5b2b16..2e95bd8 100644
> --- a/gcc/testsuite/gcc.dg/const-1.c
> +++ b/gcc/testsuite/gcc.dg/const-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target nonpic } } */
> -/* { dg-options "-O2 -Wsuggest-attribute=const" } */
> +/* { dg-options "-O2 -Wsuggest-attribute=const -fno-finite-loops" } */
>
>  extern int extern_const(int a) __attribute__ ((const));
>
> diff --git a/gcc/testsuite/gcc.dg/graphite/graphite.exp b/gcc/testsuite/gcc.dg/graphite/graphite.exp
> index ea61446..523a955 100644
> --- a/gcc/testsuite/gcc.dg/graphite/graphite.exp
> +++ b/gcc/testsuite/gcc.dg/graphite/graphite.exp
> @@ -56,7 +56,7 @@ set vect_files        [lsort [glob -nocomplain $srcdir/$subdir/vect-*.c ] ]
>
>  # Tests to be compiled.
>  set dg-do-what-default compile
> -dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all"
> +dg-runtest $scop_files        "" "-O2 -fgraphite -fdump-tree-graphite-all -fno-finite-loops"
>  dg-runtest $id_files          "" "-O2 -fgraphite-identity -ffast-math -fdump-tree-graphite-details"
>
>  # Tests to be run.
> diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
> index f6fc41d..de2fb2c 100644
> --- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c
> +++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c
> @@ -1,6 +1,6 @@
>  /* For PR rtl-optimization/27735  */
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
> +/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details -fno-finite-loops" } */
>
>  void set_color(void);
>  void xml_colorize_line(unsigned int *p, int state)
> diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
> index 7e5ba08..f491c51 100644
> --- a/gcc/testsuite/gcc.dg/predict-9.c
> +++ b/gcc/testsuite/gcc.dg/predict-9.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate" } */
> +/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-profile_estimate -fno-finite-loops" } */
>
>  extern int global;
>  extern int global2;
> diff --git a/gcc/testsuite/gcc.dg/pure-2.c b/gcc/testsuite/gcc.dg/pure-2.c
> index fe6e2bc..318cfd1 100644
> --- a/gcc/testsuite/gcc.dg/pure-2.c
> +++ b/gcc/testsuite/gcc.dg/pure-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
> +/* { dg-options "-O2 -Wsuggest-attribute=pure -fno-finite-loops" } */
>  /* { dg-add-options bind_pic_locally } */
>
>  extern int extern_const(int a) __attribute__ ((pure));
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
> index d289e5d..a9bdf26 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20040211-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-cddce2" } */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -fno-finite-loops" } */
>
>  struct rtx_def;
>  typedef struct rtx_def *rtx;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> new file mode 100644
> index 0000000..18c1ddb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loops" } */
> +
> +typedef struct list {
> +    char pad[15];
> +    struct list *next;
> +} list;
> +
> +int data;
> +
> +list *head, *tail;
> +
> +int __attribute__((pure)) pfn (int);
> +
> +int foo (unsigned u, int s)
> +{
> +  unsigned i;
> +  list *p;
> +  int j;
> +
> +  for (i = 0; i < u; i += 2)
> +    ;
> +
> +  for (p = head; p; p = p->next)
> +    ;
> +
> +  for (j = data; j & s; j = pfn (j + 3))
> +    ;
> +
> +  for (p = head; p != tail; p = p->next)
> +    for (j = data + 1; j > s; j = pfn (j + 2))
> +      ;
> +
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
> index a29c9fb..3d05ad2 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-10.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-options "-O2 -fdump-tree-optimized -fno-finite-loops" } */
>  /* { dg-require-effective-target int32plus } */
>
>  int bar (void);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
> index e9b4f26..187c084 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
> +/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w -fno-finite-loops" } */
>
>  struct __sFILE
>  {
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
> index d829b04..6752676 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details" } */
> +/* { dg-options "-O2 -fdump-tree-thread2-details -fdump-tree-thread3-details -fdump-tree-thread4-details -fno-finite-loops" } */
>  /* { dg-final { scan-tree-dump "FSM" "thread2" } } */
>  /* { dg-final { scan-tree-dump "FSM" "thread3" } } */
>  /* { dg-final { scan-tree-dump "FSM" "thread4" { xfail *-*-* } } } */
> diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
> index 2478219..a38899e 100644
> --- a/gcc/tree-ssa-dce.c
> +++ b/gcc/tree-ssa-dce.c
> @@ -245,6 +245,17 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
>             mark_stmt_necessary (stmt, true);
>             return;
>           }
> +       /* IFN_GOACC_LOOP calls are necessary in that they are used to
> +          represent parameter (i.e. step, bound) of a lowered OpenACC
> +          partitioned loop.  But this kind of partitioned loop might not
> +          survive from aggressive loop removal for it has loop exit and
> +          is assumed to be finite.  Therefore, we need to explicitly mark
> +          these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) */
> +       if (gimple_call_internal_p (stmt, IFN_GOACC_LOOP))
> +         {
> +           mark_stmt_necessary (stmt, true);
> +           return;
> +         }
>         if (!gimple_call_lhs (stmt))
>           return;
>         break;
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 470b6a2..9b9cb41 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -2798,6 +2798,27 @@ finite_loop_p (struct loop *loop)
>                  loop->num);
>        return true;
>      }
> +
> +  if (flag_finite_loops)
> +    {
> +      unsigned i;
> +      vec<edge> exits = get_loop_exit_edges (loop);
> +      edge ex;
> +
> +      /* If the loop has a normal exit, we can assume it will terminate.  */
> +      FOR_EACH_VEC_ELT (exits, i, ex)
> +       if (!(ex->flags & (EDGE_EH | EDGE_ABNORMAL | EDGE_FAKE)))
> +         {
> +           exits.release ();
> +           if (dump_file)
> +             fprintf (dump_file, "Assume loop %i to be finite: it has an exit "
> +                      "and -ffinite-loops is on.\n", loop->num);
> +           return true;
> +         }
> +
> +      exits.release ();
> +    }
> +
>    return false;
>  }
>
> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
> new file mode 100644
> index 0000000..44767cd
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
> @@ -0,0 +1,31 @@
> +/* { dg-do compile }  */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
> +
> +int
> +f1 (void)
> +{
> +  int i, j;
> +
> +#pragma acc parallel loop tile(2,3)
> +  for (i = 1; i < 10; i++)
> +    for (j = 1; j < 10; j++)
> +      for (;;)
> +       ;
> +
> +  return i + j;
> +}
> +
> +int
> +f2 (void)
> +{
> +  int i, j, k;
> +
> +#pragma acc parallel loop tile(2,3)
> +  for (i = 1; i < 10; i++)
> +    for (j = 1; j < 10; j++)
> +      for (k = 1; k < 10; k++)
> +       ;
> +
> +  return i + j;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [committed][nvptx, libgomp] Update pr85381-{2,4}.c test-cases
  2019-06-12  9:43                                             ` Richard Biener
@ 2019-06-15 12:05                                               ` Tom de Vries
  0 siblings, 0 replies; 45+ messages in thread
From: Tom de Vries @ 2019-06-15 12:05 UTC (permalink / raw)
  To: Richard Biener, Feng Xue OS; +Cc: gcc-patches, Thomas Schwinge, Jeff Law

[-- Attachment #1: Type: text/plain, Size: 350 bytes --]

[ was: Re: [PATCH V8] Remove empty loop with assumed finiteness (PR
tree-optimization/89713) ]

On 12-06-19 11:42, Richard Biener wrote:
> On Tue, Jun 11, 2019 at 4:40 AM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>>
>> Reformat to comply with gcc coding style.
> 
> OK for trunk.
> 

Committed patch to update nvptx test-cases.

Thanks,
- Tom

[-- Attachment #2: 0001-nvptx-libgomp-Update-pr85381-2-4-.c-test-cases.patch --]
[-- Type: text/x-patch, Size: 2471 bytes --]

[nvptx, libgomp] Update pr85381-{2,4}.c test-cases

After the fix for "PR tree-optimization/89713 - Assume loop with an exit is
finite" ( r272234 ) empty oacc loops are removed before expand.

Update pr85381-{2,4}.c accordingly.

2019-06-15  Tom de Vries  <tdevries@suse.de>

	PR tree-optimization/89713
	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Expect no bar.sync.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Same.

---
 .../testsuite/libgomp.oacc-c-c++-common/pr85381-2.c  | 20 +-------------------
 .../testsuite/libgomp.oacc-c-c++-common/pr85381-4.c  |  5 +----
 2 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
index 6570c64afff..2cb5b95949d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
@@ -15,22 +15,4 @@ main (void)
   return 0;
 }
 
-/* Todo: Boths bar.syncs can be removed.
-   Atm we generate this dead code inbetween forked and joining:
-
-                     mov.u32 %r28, %ntid.y;
-                     mov.u32 %r29, %tid.y;
-                     add.u32 %r30, %r29, %r29;
-                     setp.gt.s32     %r31, %r30, 19;
-             @%r31   bra     $L2;
-                     add.u32 %r25, %r28, %r28;
-                     mov.u32 %r24, %r30;
-     $L3:
-                     add.u32 %r24, %r24, %r25;
-                     setp.le.s32     %r33, %r24, 19;
-             @%r33   bra     $L3;
-     $L2:
-
-   so the loop is not recognized as empty loop (which we detect by seeing if
-   joining immediately follows forked).  */
-/* { dg-final { scan-assembler-times "bar.sync" 2 } } */
+/* { dg-final { scan-assembler-times "bar.sync" 0 } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
index d955d79718d..e8a433ffc0a 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
@@ -21,7 +21,4 @@ main (void)
   return 0;
 }
 
-/* Atm, %ntid.y is broadcast from one loop to the next, so there are 2 bar.syncs
-   for that (the other two are there for the same reason as in pr85381-2.c).
-   Todo: Recompute %ntid.y instead of broadcasting it. */
-/* { dg-final { scan-assembler-times "bar.sync" 4 } } */
+/* { dg-final { scan-assembler-times "bar.sync" 0 } } */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
@ 2020-04-01 13:36 ` Richard Biener
  2020-04-01 13:47   ` Jakub Jelinek
  2020-04-01 19:15   ` Jason Merrill
  0 siblings, 2 replies; 45+ messages in thread
From: Richard Biener @ 2020-04-01 13:36 UTC (permalink / raw)
  To: gcc-patches; +Cc: jason, Joseph S. Myers

This does away with enabling -ffinite-loops at -O2+ for all languages
and instead enables it selectively for C++ only.

Jason, I didn't find a reference as to when the forward progress
guarantee was introduced to C++ so I randomly chose C++11, is that
correct?

Joseph, this simply never enables -ffinite-loops for C by default
also based on the fact this was added in the first place for C++
standard library abstraction removal thus without any real-world
C testcases.

Bootstrap / regtest in progress - I expect I need to add explicit
-ffinite-loops to a few testcases (and fixup the changelog).

I did not look whether other languages guarantee forward progress
but the feature is new in GCC 10 and restricting it to C++ is
sensible I think.

Any further comments?

Writing a runtime testcase is hard since by definition it would
not finish.  I'll construct a scan-tree-dump one.

Thanks,
Richard.

2020-04-01  Richard Biener  <rguenther@suse.de>

	PR c/94392
	* c-opts.c (): Enable -ffinite-loops for -O2 and C++11.

	* common.opt (ffinite-loops): Initialize to zero.
	* opts.c (default_options_table): Remove OPT_ffinite_loops
	entry.
	* ipa-inline.c (): Match up -ffinite-loops.
	* doc/invoke.texi (ffinite-loops): Adjust documentation of
	default setting.
---
 gcc/c-family/c-opts.c | 4 ++++
 gcc/common.opt        | 2 +-
 gcc/doc/invoke.texi   | 3 ++-
 gcc/ipa-inline.c      | 5 ++++-
 gcc/opts.c            | 1 -
 5 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 6b6c754ad86..58ba0948e79 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -989,6 +989,10 @@ c_common_post_options (const char **pfilename)
   SET_OPTION_IF_UNSET (&global_options, &global_options_set, flag_new_ttp,
 		       cxx_dialect >= cxx17);
 
+  /* C++11 guarantees forward progress.  */
+  SET_OPTION_IF_UNSET (&global_options, &global_options_set, flag_finite_loops,
+		       optimize >= 2 && cxx_dialect >= cxx11);
+
   if (cxx_dialect >= cxx11)
     {
       /* If we're allowing C++0x constructs, don't warn about C++98
diff --git a/gcc/common.opt b/gcc/common.opt
index 4368910cb54..bb2ea4c905d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1490,7 +1490,7 @@ Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
 ffinite-loops
-Common Report Var(flag_finite_loops) Optimization
+Common Report Var(flag_finite_loops) Optimization Init(0)
 Assume that loops with an exit will terminate and not loop indefinitely.
 
 ffixed-
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 412750c1fc9..142bfeacead 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10432,7 +10432,8 @@ Assume that a loop with an exit will eventually take the exit and not loop
 indefinitely.  This allows the compiler to remove loops that otherwise have
 no side-effects, not considering eventual endless looping as such.
 
-This option is enabled by default at @option{-O2}.
+This option is enabled by default at @option{-O2} for C++ with -std=c++11
+or higher.
 
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 302ce16a646..d0e59d431ca 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -512,7 +512,10 @@ can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report,
 	      /* When devirtualization is disabled for callee, it is not safe
 		 to inline it as we possibly mangled the type info.
 		 Allow early inlining of always inlines.  */
-	      || (!early && check_maybe_down (flag_devirtualize)))
+	      || (!early && check_maybe_down (flag_devirtualize))
+	      /* It's not safe to inline a function where loops maybe
+		 infinite into a function where we assume the reverse.  */
+	      || check_maybe_down (flag_finite_loops))
 	{
 	  e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
 	  inlinable = false;
diff --git a/gcc/opts.c b/gcc/opts.c
index 5dc7d65dedd..d4df8627bf7 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -478,7 +478,6 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
-    { OPT_LEVELS_2_PLUS, OPT_ffinite_loops, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
-- 
2.25.1

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-01 13:36 ` [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++ Richard Biener
@ 2020-04-01 13:47   ` Jakub Jelinek
  2020-04-01 13:52     ` Richard Biener
  2020-04-01 19:15   ` Jason Merrill
  1 sibling, 1 reply; 45+ messages in thread
From: Jakub Jelinek @ 2020-04-01 13:47 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Joseph S. Myers

On Wed, Apr 01, 2020 at 03:36:30PM +0200, Richard Biener wrote:
> @@ -512,7 +512,10 @@ can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report,
>  	      /* When devirtualization is disabled for callee, it is not safe
>  		 to inline it as we possibly mangled the type info.
>  		 Allow early inlining of always inlines.  */
> -	      || (!early && check_maybe_down (flag_devirtualize)))
> +	      || (!early && check_maybe_down (flag_devirtualize))
> +	      /* It's not safe to inline a function where loops maybe
> +		 infinite into a function where we assume the reverse.  */
> +	      || check_maybe_down (flag_finite_loops))
>  	{
>  	  e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
>  	  inlinable = false;

Couldn't the above care only if the function has any loops?
Otherwise, won't it prevent cross-language LTO inlining too much?
Or instead of disabling inlining arrange for a safe flag_finite_loops value
for the resulting function (set some flag in cfun of the function that would
be considered together with flag_finite_loops (so that we don't have to
create further OPTIMIZATION_NODEs) and disable finite loops opts if we've
inlined !flag_finite_loops function into flag_finite_loops one)?

	Jakub


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-01 13:47   ` Jakub Jelinek
@ 2020-04-01 13:52     ` Richard Biener
  2020-04-01 15:56       ` Jan Hubicka
  0 siblings, 1 reply; 45+ messages in thread
From: Richard Biener @ 2020-04-01 13:52 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Joseph S. Myers

On Wed, 1 Apr 2020, Jakub Jelinek wrote:

> On Wed, Apr 01, 2020 at 03:36:30PM +0200, Richard Biener wrote:
> > @@ -512,7 +512,10 @@ can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report,
> >  	      /* When devirtualization is disabled for callee, it is not safe
> >  		 to inline it as we possibly mangled the type info.
> >  		 Allow early inlining of always inlines.  */
> > -	      || (!early && check_maybe_down (flag_devirtualize)))
> > +	      || (!early && check_maybe_down (flag_devirtualize))
> > +	      /* It's not safe to inline a function where loops maybe
> > +		 infinite into a function where we assume the reverse.  */
> > +	      || check_maybe_down (flag_finite_loops))
> >  	{
> >  	  e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> >  	  inlinable = false;
> 
> Couldn't the above care only if the function has any loops?
> Otherwise, won't it prevent cross-language LTO inlining too much?
> Or instead of disabling inlining arrange for a safe flag_finite_loops value
> for the resulting function (set some flag in cfun of the function that would
> be considered together with flag_finite_loops (so that we don't have to
> create further OPTIMIZATION_NODEs) and disable finite loops opts if we've
> inlined !flag_finite_loops function into flag_finite_loops one)?

I guess I can split out this hunk from the patch - it's a separate
issue affecting also C++ with mixed option CUs.  Yes, ideally we'd
simply initialize loop->finite_p from flag_finite_loops at CFG
construction time and then only ever check the flag on the loop
structure.  That would leave us with infinite loops for loops
we only discover later though.  It also opens up the possibility
of a per-loop #pragma.

Richard.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-01 13:52     ` Richard Biener
@ 2020-04-01 15:56       ` Jan Hubicka
  2020-04-01 16:59         ` Richard Biener
  0 siblings, 1 reply; 45+ messages in thread
From: Jan Hubicka @ 2020-04-01 15:56 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches, Joseph S. Myers

> On Wed, 1 Apr 2020, Jakub Jelinek wrote:
> 
> > On Wed, Apr 01, 2020 at 03:36:30PM +0200, Richard Biener wrote:
> > > @@ -512,7 +512,10 @@ can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report,
> > >  	      /* When devirtualization is disabled for callee, it is not safe
> > >  		 to inline it as we possibly mangled the type info.
> > >  		 Allow early inlining of always inlines.  */
> > > -	      || (!early && check_maybe_down (flag_devirtualize)))
> > > +	      || (!early && check_maybe_down (flag_devirtualize))
> > > +	      /* It's not safe to inline a function where loops maybe
> > > +		 infinite into a function where we assume the reverse.  */
> > > +	      || check_maybe_down (flag_finite_loops))
> > >  	{
> > >  	  e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> > >  	  inlinable = false;
> > 
> > Couldn't the above care only if the function has any loops?
> > Otherwise, won't it prevent cross-language LTO inlining too much?
> > Or instead of disabling inlining arrange for a safe flag_finite_loops value
> > for the resulting function (set some flag in cfun of the function that would
> > be considered together with flag_finite_loops (so that we don't have to
> > create further OPTIMIZATION_NODEs) and disable finite loops opts if we've
> > inlined !flag_finite_loops function into flag_finite_loops one)?
> 
> I guess I can split out this hunk from the patch - it's a separate
> issue affecting also C++ with mixed option CUs.  Yes, ideally we'd
> simply initialize loop->finite_p from flag_finite_loops at CFG
> construction time and then only ever check the flag on the loop
> structure.  That would leave us with infinite loops for loops
> we only discover later though.  It also opens up the possibility
> of a per-loop #pragma.

We do want to preserve cross-module inlining between C and C++, so we
really should go with marking the loops pre-inline about finiteness and
try to preserve the info, otherwise we are going to lose quite some
optimizations.

Honza
> 
> Richard.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-01 15:56       ` Jan Hubicka
@ 2020-04-01 16:59         ` Richard Biener
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Biener @ 2020-04-01 16:59 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jakub Jelinek, gcc-patches, Joseph S. Myers

On Wed, 1 Apr 2020, Jan Hubicka wrote:

> > On Wed, 1 Apr 2020, Jakub Jelinek wrote:
> > 
> > > On Wed, Apr 01, 2020 at 03:36:30PM +0200, Richard Biener wrote:
> > > > @@ -512,7 +512,10 @@ can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report,
> > > >  	      /* When devirtualization is disabled for callee, it is not safe
> > > >  		 to inline it as we possibly mangled the type info.
> > > >  		 Allow early inlining of always inlines.  */
> > > > -	      || (!early && check_maybe_down (flag_devirtualize)))
> > > > +	      || (!early && check_maybe_down (flag_devirtualize))
> > > > +	      /* It's not safe to inline a function where loops maybe
> > > > +		 infinite into a function where we assume the reverse.  */
> > > > +	      || check_maybe_down (flag_finite_loops))
> > > >  	{
> > > >  	  e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> > > >  	  inlinable = false;
> > > 
> > > Couldn't the above care only if the function has any loops?
> > > Otherwise, won't it prevent cross-language LTO inlining too much?
> > > Or instead of disabling inlining arrange for a safe flag_finite_loops value
> > > for the resulting function (set some flag in cfun of the function that would
> > > be considered together with flag_finite_loops (so that we don't have to
> > > create further OPTIMIZATION_NODEs) and disable finite loops opts if we've
> > > inlined !flag_finite_loops function into flag_finite_loops one)?
> > 
> > I guess I can split out this hunk from the patch - it's a separate
> > issue affecting also C++ with mixed option CUs.  Yes, ideally we'd
> > simply initialize loop->finite_p from flag_finite_loops at CFG
> > construction time and then only ever check the flag on the loop
> > structure.  That would leave us with infinite loops for loops
> > we only discover later though.  It also opens up the possibility
> > of a per-loop #pragma.
> 
> We do want to preserve cross-module inlining between C and C++, so we
> really should go with marking the loops pre-inline about finiteness and
> try to preserve the info, otherwise we are going to lose quite some
> optimizations.

OK, I'll update the patch accordingly.

Richard.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-01 13:36 ` [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++ Richard Biener
  2020-04-01 13:47   ` Jakub Jelinek
@ 2020-04-01 19:15   ` Jason Merrill
  2020-04-02  9:12     ` Richard Biener
  1 sibling, 1 reply; 45+ messages in thread
From: Jason Merrill @ 2020-04-01 19:15 UTC (permalink / raw)
  To: Richard Biener, gcc-patches; +Cc: Joseph S. Myers

On 4/1/20 9:36 AM, Richard Biener wrote:
> This does away with enabling -ffinite-loops at -O2+ for all languages
> and instead enables it selectively for C++ only.
> 
> Jason, I didn't find a reference as to when the forward progress
> guarantee was introduced to C++ so I randomly chose C++11, is that
> correct?

C++11 says "Implementations should ensure that all unblocked threads 
eventually make progress."

C++17 adds the "Forward progress" section that says

"The implementation may assume that any thread will eventually do one of 
the following:
  — terminate,
  — make a call to a library I/O function,
  — perform an access through a volatile glvalue, or
  — perform a synchronization operation or an atomic operation.
[Note: This is intended to allow compiler transformations such as 
removal of empty loops, even when termination cannot be proven. — end note]"

Jason


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-01 19:15   ` Jason Merrill
@ 2020-04-02  9:12     ` Richard Biener
  2020-04-02  9:17       ` Jakub Jelinek
  2020-04-03  8:29       ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++) Thomas Schwinge
  0 siblings, 2 replies; 45+ messages in thread
From: Richard Biener @ 2020-04-02  9:12 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches, Joseph S. Myers

On Wed, 1 Apr 2020, Jason Merrill wrote:

> On 4/1/20 9:36 AM, Richard Biener wrote:
> > This does away with enabling -ffinite-loops at -O2+ for all languages
> > and instead enables it selectively for C++ only.
> > 
> > Jason, I didn't find a reference as to when the forward progress
> > guarantee was introduced to C++ so I randomly chose C++11, is that
> > correct?
> 
> C++11 says "Implementations should ensure that all unblocked threads
> eventually make progress."
> 
> C++17 adds the "Forward progress" section that says
> 
> "The implementation may assume that any thread will eventually do one of the
> following:
>  — terminate,
>  — make a call to a library I/O function,
>  — perform an access through a volatile glvalue, or
>  — perform a synchronization operation or an atomic operation.
> [Note: This is intended to allow compiler transformations such as removal of
> empty loops, even when termination cannot be proven. — end note]"

OK, so I assume using C++11 as in the patch is fine.

I'm retesting the following currently which factors in Honzas comments
turning -ffinite-loops into a per-loop setting at CFG construction time
(the same place where we'd handle #pragma induced ANNOTATE_EXPRs for
such feature - for GCC11 we might implement the C side this way which has
more restrictive wording where such forward progress can be assumed)

Thanks,
Richard.

[PATCH] c/94392 - only enable -ffinite-loops for C++

This does away with enabling -ffinite-loops at -O2+ for all languages
and instead enables it selectively for C++ only.

It also makes -ffinite-loops loop-private at CFG construction time
fixing correctness issues with inlining.

2020-04-02  Richard Biener  <rguenther@suse.de>

	PR c/94392
	* c-opts.c (c_common_post_options): Enable -ffinite-loops
	for -O2 and C++11 or newer.

	* common.opt (ffinite-loops): Initialize to zero.
	* opts.c (default_options_table): Remove OPT_ffinite_loops
	entry.
	* cfgloop.h (loop::finite_p): New member.
	* cfgloopmanip.c (copy_loop_info): Copy finite_p.
	* ipa-icf-gimple.c (func_checker::compare_loops): Compare
	finite_p.
	* lto-streamer-in.c (input_cfg): Stream finite_p.
	* lto-streamer-out.c (output_cfg): Likewise.
	* tree-cfg.c (replace_loop_annotate): Initialize finite_p
	from flag_finite_loops at CFG build time.
	* tree-ssa-loop-niter.c (finite_loop_p): Check the loops
	finite_p flag instead of flag_finite_loops.
	* doc/invoke.texi (ffinite-loops): Adjust documentation of
	default setting.

	* gcc.dg/torture/pr94392.c: New testcase.
---
 gcc/c-family/c-opts.c                  |  4 ++++
 gcc/cfgloop.h                          |  4 ++++
 gcc/cfgloopmanip.c                     |  1 +
 gcc/common.opt                         |  2 +-
 gcc/doc/invoke.texi                    |  3 ++-
 gcc/ipa-icf-gimple.c                   |  2 ++
 gcc/lto-streamer-in.c                  |  1 +
 gcc/lto-streamer-out.c                 |  1 +
 gcc/opts.c                             |  1 -
 gcc/testsuite/gcc.dg/torture/pr94392.c | 22 ++++++++++++++++++++++
 gcc/tree-cfg.c                         |  3 +++
 gcc/tree-ssa-loop-niter.c              |  2 +-
 12 files changed, 42 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr94392.c

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 6b6c754ad86..58ba0948e79 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -989,6 +989,10 @@ c_common_post_options (const char **pfilename)
   SET_OPTION_IF_UNSET (&global_options, &global_options_set, flag_new_ttp,
 		       cxx_dialect >= cxx17);
 
+  /* C++11 guarantees forward progress.  */
+  SET_OPTION_IF_UNSET (&global_options, &global_options_set, flag_finite_loops,
+		       optimize >= 2 && cxx_dialect >= cxx11);
+
   if (cxx_dialect >= cxx11)
     {
       /* If we're allowing C++0x constructs, don't warn about C++98
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 1c49a8b8c2d..18b404e292f 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -226,6 +226,10 @@ public:
   /* True if the loop is part of an oacc kernels region.  */
   unsigned in_oacc_kernels_region : 1;
 
+  /* True if the loop is known to be finite.  This is a localized
+     flag_finite_loops or similar pragmas state.  */
+  unsigned finite_p : 1;
+
   /* The number of times to unroll the loop.  0 means no information given,
      just do what we always do.  A value of 1 means do not unroll the loop.
      A value of USHRT_MAX means unroll with no specific unrolling factor.
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index c9375565f62..50c7267ec49 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1023,6 +1023,7 @@ copy_loop_info (class loop *loop, class loop *target)
   target->dont_vectorize = loop->dont_vectorize;
   target->force_vectorize = loop->force_vectorize;
   target->in_oacc_kernels_region = loop->in_oacc_kernels_region;
+  target->finite_p = loop->finite_p;
   target->unroll = loop->unroll;
   target->owned_clique = loop->owned_clique;
 }
diff --git a/gcc/common.opt b/gcc/common.opt
index 4368910cb54..bb2ea4c905d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1490,7 +1490,7 @@ Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
 ffinite-loops
-Common Report Var(flag_finite_loops) Optimization
+Common Report Var(flag_finite_loops) Optimization Init(0)
 Assume that loops with an exit will terminate and not loop indefinitely.
 
 ffixed-
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 412750c1fc9..142bfeacead 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10432,7 +10432,8 @@ Assume that a loop with an exit will eventually take the exit and not loop
 indefinitely.  This allows the compiler to remove loops that otherwise have
 no side-effects, not considering eventual endless looping as such.
 
-This option is enabled by default at @option{-O2}.
+This option is enabled by default at @option{-O2} for C++ with -std=c++11
+or higher.
 
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 3e5b2d4bd6d..d306fec56ce 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -395,6 +395,8 @@ func_checker::compare_loops (basic_block bb1, basic_block bb2)
     return return_false_with_msg ("dont_vectorize");
   if (l1->force_vectorize != l2->force_vectorize)
     return return_false_with_msg ("force_vectorize");
+  if (l1->finite_p != l2->finite_p)
+    return return_false_with_msg ("finite_p");
   if (l1->unroll != l2->unroll)
     return return_false_with_msg ("unroll");
   if (!compare_variable_decl (l1->simduid, l2->simduid))
diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index 9566e5ee102..244f5b8aa5c 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -821,6 +821,7 @@ input_cfg (class lto_input_block *ib, class data_in *data_in,
       loop->owned_clique = streamer_read_hwi (ib);
       loop->dont_vectorize = streamer_read_hwi (ib);
       loop->force_vectorize = streamer_read_hwi (ib);
+      loop->finite_p = streamer_read_hwi (ib);
       loop->simduid = stream_read_tree (ib, data_in);
 
       place_new_loop (fn, loop);
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index a219c1d0dd1..52ef94718db 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1950,6 +1950,7 @@ output_cfg (struct output_block *ob, struct function *fn)
       streamer_write_hwi (ob, loop->owned_clique);
       streamer_write_hwi (ob, loop->dont_vectorize);
       streamer_write_hwi (ob, loop->force_vectorize);
+      streamer_write_hwi (ob, loop->finite_p);
       stream_write_tree (ob, loop->simduid, true);
     }
 
diff --git a/gcc/opts.c b/gcc/opts.c
index 5dc7d65dedd..d4df8627bf7 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -478,7 +478,6 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
-    { OPT_LEVELS_2_PLUS, OPT_ffinite_loops, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
diff --git a/gcc/testsuite/gcc.dg/torture/pr94392.c b/gcc/testsuite/gcc.dg/torture/pr94392.c
new file mode 100644
index 00000000000..373f18ce983
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr94392.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-skip-if "finite loops" { *-*-* } { "-ffinite-loops" } } */
+/* { dg-skip-if "LTO optimizes the test" { *-*-* } { "-flto" } } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+int a, b;
+
+int
+main()
+{
+  while (1)
+    {
+      /* Try really hard.  */
+      if (a != b)
+	return 1;
+    }
+  return 0;
+}
+
+/* ISO C does not guarantee forward progress like C++ does so we
+   cannot assume the loop is finite and optimize it to return 1.  */
+/* { dg-final { scan-tree-dump "if" "optimized" } } */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index f7b817d94e6..e99fb9ff5d1 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -324,6 +324,9 @@ replace_loop_annotate (void)
       /* Then look into the latch, if any.  */
       if (loop->latch)
 	replace_loop_annotate_in_block (loop->latch, loop);
+
+      /* Push the global flag_finite_loops state down to individual loops.  */
+      loop->finite_p = flag_finite_loops;
     }
 
   /* Remove IFN_ANNOTATE.  Safeguard for the case loop->latch == NULL.  */
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 6e6df0bfdb8..7d61ef080eb 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2834,7 +2834,7 @@ finite_loop_p (class loop *loop)
       return true;
     }
 
-  if (flag_finite_loops)
+  if (loop->finite_p)
     {
       unsigned i;
       vec<edge> exits = get_loop_exit_edges (loop);
-- 
2.13.7

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-02  9:12     ` Richard Biener
@ 2020-04-02  9:17       ` Jakub Jelinek
  2020-04-02  9:41         ` Richard Biener
  2020-04-03  8:29       ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++) Thomas Schwinge
  1 sibling, 1 reply; 45+ messages in thread
From: Jakub Jelinek @ 2020-04-02  9:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jason Merrill, gcc-patches, Joseph S. Myers

On Thu, Apr 02, 2020 at 11:12:48AM +0200, Richard Biener wrote:
> > "The implementation may assume that any thread will eventually do one of the
> > following:
> >  — terminate,
> >  — make a call to a library I/O function,
> >  — perform an access through a volatile glvalue, or
> >  — perform a synchronization operation or an atomic operation.
> > [Note: This is intended to allow compiler transformations such as removal of
> > empty loops, even when termination cannot be proven. — end note]"

With -ffinite-loops, do we actually not optimize if the loop has volatile accesses
or atomics or library I/O calls?

	Jakub


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++
  2020-04-02  9:17       ` Jakub Jelinek
@ 2020-04-02  9:41         ` Richard Biener
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Biener @ 2020-04-02  9:41 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jason Merrill, gcc-patches, Joseph S. Myers

On Thu, 2 Apr 2020, Jakub Jelinek wrote:

> On Thu, Apr 02, 2020 at 11:12:48AM +0200, Richard Biener wrote:
> > > "The implementation may assume that any thread will eventually do one of the
> > > following:
> > >  — terminate,
> > >  — make a call to a library I/O function,
> > >  — perform an access through a volatile glvalue, or
> > >  — perform a synchronization operation or an atomic operation.
> > > [Note: This is intended to allow compiler transformations such as removal of
> > > empty loops, even when termination cannot be proven. — end note]"
> 
> With -ffinite-loops, do we actually not optimize if the loop has volatile accesses
> or atomics or library I/O calls?

We don't remove the loop then, yes.  All of those are considered 
side-effects by GCC and thus they are considered needed and keep
the loop live.  From a technical point finite_loop_p will still
return true for those which is not correct but harmless at the
moment (I hope).  I read the above as

volatile int i;

int __attribute__((const,noinline)) baz(int i) { return i; }
int foo(int a)
{
  do
    {
      i;
      if (a != baz(a)) return 1;      
    }
  while (1);
}

being a valid endles loop which we indeed preserve.  But we notice
the opportunity to unswitch on the condition and elide _that_ loop:

  <bb 2> [local count: 118111600]:
  _1 = baz (a_4(D));
  if (_1 != a_4(D))
    goto <bb 4>; [11.00%]
  else
    goto <bb 3>; [89.00%]

  <bb 3> [local count: 955630224]:
  vol.0_5 ={v} i;
  goto <bb 3>; [100.00%]

  <bb 4> [local count: 118111600]:
  vol.0_3 ={v} i;
  return 1;

thus turn it into

  if (a != baz(a)) { i; return 1; }
  do { i; } while (1);

Richard.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++)
  2020-04-02  9:12     ` Richard Biener
  2020-04-02  9:17       ` Jakub Jelinek
@ 2020-04-03  8:29       ` Thomas Schwinge
  2020-04-03  9:36         ` Revert "[nvptx, libgomp] Update pr85381-{2,4}.c " Richard Biener
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Schwinge @ 2020-04-03  8:29 UTC (permalink / raw)
  To: Richard Biener, Tom de Vries, gcc-patches
  Cc: Joseph S. Myers, Jason Merrill, Feng Xue OS

[-- Attachment #1: Type: text/plain, Size: 1558 bytes --]

Hi!

On 2020-04-02T11:12:48+0200, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 1 Apr 2020, Jason Merrill wrote:
>
>> On 4/1/20 9:36 AM, Richard Biener wrote:
>> > This does away with enabling -ffinite-loops at -O2+ for all languages
>> > and instead enables it selectively for C++ only.

> I'm retesting the following [...]

..., which got pushed in commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae
"c/94392 - only enable -ffinite-loops for C++".

I pushed the attached in commit 4f6a0888de52a2e523a6fd4235fe7f8193819c3b
'Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713,
PR94392]'.  As can be observed in two nvptx offloading test cases
regressing, 'apparently now again "empty oacc loops are" no longer
"removed before expand"' (quoting myself from the commit log, adapting
Tom's commit log snippet from the reverted commit).

It's not obvious to me how the "finite loop" property discussed/changed
in Richard's commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae "c/94392 -
only enable -ffinite-loops for C++" relates to the previously observed
optimization of removing "empty oacc loops [...] before expand" (after
PR89713 commit c29c92c789d93848cc1c929838771bfc68cb272c "PR
tree-optimization/89713 - Assume loop with an exit is finite"), but
examining that in detail is for another day.


Grüße
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Revert-nvptx-libgomp-Update-pr85381-2-4-.c-test-case.patch --]
[-- Type: text/x-diff, Size: 3340 bytes --]

From 4f6a0888de52a2e523a6fd4235fe7f8193819c3b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 3 Apr 2020 10:07:16 +0200
Subject: [PATCH] Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases"
 [PR89713, PR94392]

In response to PR94392 commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae
"c/94392 - only enable -ffinite-loops for C++", this reverts PR89713
commit 00908992f2a78f213d227aea8dbab014a1361df0, as apparently now again
"empty oacc loops are" no longer "removed before expand".

	libgomp/
	PR tree-optimization/89713
	PR c/94392
	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Again expect
	'bar.sync'.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
---
 libgomp/ChangeLog                             |  8 ++++++++
 .../libgomp.oacc-c-c++-common/pr85381-2.c     | 20 ++++++++++++++++++-
 .../libgomp.oacc-c-c++-common/pr85381-4.c     |  5 ++++-
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 6c437930b02f..3716f559aa1c 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,11 @@
+2020-04-03  Thomas Schwinge  <thomas@codesourcery.com>
+
+	PR tree-optimization/89713
+	PR c/94392
+	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Again expect
+	'bar.sync'.
+	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
+
 2020-03-31  Tobias Burnus  <tobias@codesourcery.com>
 
 	* target.c (GOMP_target_enter_exit_data): Handle PSET/MAP_POINTER.
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
index 2cb5b95949de..6570c64afff5 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
@@ -15,4 +15,22 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "bar.sync" 0 } } */
+/* Todo: Boths bar.syncs can be removed.
+   Atm we generate this dead code inbetween forked and joining:
+
+                     mov.u32 %r28, %ntid.y;
+                     mov.u32 %r29, %tid.y;
+                     add.u32 %r30, %r29, %r29;
+                     setp.gt.s32     %r31, %r30, 19;
+             @%r31   bra     $L2;
+                     add.u32 %r25, %r28, %r28;
+                     mov.u32 %r24, %r30;
+     $L3:
+                     add.u32 %r24, %r24, %r25;
+                     setp.le.s32     %r33, %r24, 19;
+             @%r33   bra     $L3;
+     $L2:
+
+   so the loop is not recognized as empty loop (which we detect by seeing if
+   joining immediately follows forked).  */
+/* { dg-final { scan-assembler-times "bar.sync" 2 } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
index e8a433ffc0a5..d955d79718df 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
@@ -21,4 +21,7 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "bar.sync" 0 } } */
+/* Atm, %ntid.y is broadcast from one loop to the next, so there are 2 bar.syncs
+   for that (the other two are there for the same reason as in pr85381-2.c).
+   Todo: Recompute %ntid.y instead of broadcasting it. */
+/* { dg-final { scan-assembler-times "bar.sync" 4 } } */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++)
  2020-04-03  8:29       ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++) Thomas Schwinge
@ 2020-04-03  9:36         ` Richard Biener
  2020-04-03 10:34           ` Jakub Jelinek
  2020-10-30 14:09           ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c " Thomas Schwinge
  0 siblings, 2 replies; 45+ messages in thread
From: Richard Biener @ 2020-04-03  9:36 UTC (permalink / raw)
  To: Thomas Schwinge
  Cc: Tom de Vries, gcc-patches, Joseph S. Myers, Jason Merrill, Feng Xue OS

On Fri, 3 Apr 2020, Thomas Schwinge wrote:

> Hi!
> 
> On 2020-04-02T11:12:48+0200, Richard Biener <rguenther@suse.de> wrote:
> > On Wed, 1 Apr 2020, Jason Merrill wrote:
> >
> >> On 4/1/20 9:36 AM, Richard Biener wrote:
> >> > This does away with enabling -ffinite-loops at -O2+ for all languages
> >> > and instead enables it selectively for C++ only.
> 
> > I'm retesting the following [...]
> 
> ..., which got pushed in commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae
> "c/94392 - only enable -ffinite-loops for C++".
> 
> I pushed the attached in commit 4f6a0888de52a2e523a6fd4235fe7f8193819c3b
> 'Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713,
> PR94392]'.  As can be observed in two nvptx offloading test cases
> regressing, 'apparently now again "empty oacc loops are" no longer
> "removed before expand"' (quoting myself from the commit log, adapting
> Tom's commit log snippet from the reverted commit).
> 
> It's not obvious to me how the "finite loop" property discussed/changed
> in Richard's commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae "c/94392 -
> only enable -ffinite-loops for C++" relates to the previously observed
> optimization of removing "empty oacc loops [...] before expand" (after
> PR89713 commit c29c92c789d93848cc1c929838771bfc68cb272c "PR
> tree-optimization/89713 - Assume loop with an exit is finite"), but
> examining that in detail is for another day.

For C we no longer have -ffinite-loops in effect but for C++ we still
do.  But since the testcase is c/c++ common I'd have expected it
now fails "split" ... so an explicit -fno-finite-loops or
-ffinite-loops with an explanation would be easier.

Note there's now also the opportunity to set the loop flag for
OpenACC/OpenMP annotated loops if any of that guarantees finiteness.
(for GCC11 only please)

Richard.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++)
  2020-04-03  9:36         ` Revert "[nvptx, libgomp] Update pr85381-{2,4}.c " Richard Biener
@ 2020-04-03 10:34           ` Jakub Jelinek
  2020-10-30 14:09           ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c " Thomas Schwinge
  1 sibling, 0 replies; 45+ messages in thread
From: Jakub Jelinek @ 2020-04-03 10:34 UTC (permalink / raw)
  To: Richard Biener; +Cc: Thomas Schwinge, gcc-patches, Joseph S. Myers

On Fri, Apr 03, 2020 at 11:36:29AM +0200, Richard Biener wrote:
> Note there's now also the opportunity to set the loop flag for
> OpenACC/OpenMP annotated loops if any of that guarantees finiteness.
> (for GCC11 only please)

Dunno about OpenACC, but OpenMP loops guarantee finiteness, as the number of
iterations must be computable before the loop and must fit into the type in
which that count is computed without overflows.

	Jakub


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++)
  2020-04-03  9:36         ` Revert "[nvptx, libgomp] Update pr85381-{2,4}.c " Richard Biener
  2020-04-03 10:34           ` Jakub Jelinek
@ 2020-10-30 14:09           ` Thomas Schwinge
  2020-10-30 14:16             ` Revert "[nvptx, libgomp] Update pr85381-{2,4}.c " Jakub Jelinek
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Schwinge @ 2020-10-30 14:09 UTC (permalink / raw)
  To: Richard Biener, Frederik Harwath, Jakub Jelinek
  Cc: Tom de Vries, gcc-patches, Joseph S. Myers, Jason Merrill, Feng Xue OS

Hi!

Frederik stumbled over a related thing.

On 2020-04-03T12:36:29+0200, Richard Biener <rguenther@suse.de> wrote:
> On Fri, 3 Apr 2020, Thomas Schwinge wrote:
>> On 2020-04-02T11:12:48+0200, Richard Biener <rguenther@suse.de> wrote:
>> > On Wed, 1 Apr 2020, Jason Merrill wrote:
>> >
>> >> On 4/1/20 9:36 AM, Richard Biener wrote:
>> >> > This does away with enabling -ffinite-loops at -O2+ for all languages
>> >> > and instead enables it selectively for C++ only.
>>
>> > I'm retesting the following [...]
>>
>> ..., which got pushed in commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae
>> "c/94392 - only enable -ffinite-loops for C++".
>>
>> I pushed the attached in commit 4f6a0888de52a2e523a6fd4235fe7f8193819c3b
>> 'Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713,
>> PR94392]'.  As can be observed in two nvptx offloading test cases
>> regressing, 'apparently now again "empty oacc loops are" no longer
>> "removed before expand"' (quoting myself from the commit log, adapting
>> Tom's commit log snippet from the reverted commit).
>>
>> It's not obvious to me how the "finite loop" property discussed/changed
>> in Richard's commit 75efe9cb1f8938a713ce540dc3b27bc2afcd3fae "c/94392 -
>> only enable -ffinite-loops for C++" relates to the previously observed
>> optimization of removing "empty oacc loops [...] before expand" (after
>> PR89713 commit c29c92c789d93848cc1c929838771bfc68cb272c "PR
>> tree-optimization/89713 - Assume loop with an exit is finite"), but
>> examining that in detail is for another day.
>
> For C we no longer have -ffinite-loops in effect but for C++ we still
> do.  But since the testcase is c/c++ common I'd have expected it
> now fails "split" ... so an explicit -fno-finite-loops or
> -ffinite-loops with an explanation would be easier.

(Thanks, and ACK; still have to look into that.)

> Note there's now also the opportunity to set the loop flag for
> OpenACC/OpenMP annotated loops if any of that guarantees finiteness.
> (for GCC11 only please)

On 2020-04-03T13:34:18+0200, Jakub Jelinek <jakub@redhat.com> wrote:
> Dunno about OpenACC, but OpenMP loops guarantee finiteness, as the number of
> iterations must be computable before the loop and must fit into the type in
> which that count is computed without overflows.

Specifically, is that computable at run-time or compile-time?

Similar for OpenACC.  For example, OpenACC 3.0, 2.9. "Loop Construct",
"Restrictions": "A loop associated with a 'loop' construct that does not
have a 'seq' clause must be written such that the loop iteration count is
computable when entering the loop construct".

(This can only viewed by members of the OpenACC GitHub organization, but
I wanted to share the pointer anyway, and can relay discussion as
necessary.)  For the next version of OpenACC (soon!), this is being
further clarified as per the current discussion in
<https://github.com/OpenACC/openacc-spec/pull/320> "Proposed changes for
range-based for loops and != operator", which should be relevant here:

|   - A loop associated with a 'loop' construct that does not have a 'seq'
|     clause must be written to meet all of the following conditions:
|
|       - The loop variable must be of integer, C/C++ pointer, or C++
|         random-access iterator type.
|
|       - The loop variable must monotonically increase or decrease in the
|         direction of its termination condition.
|
|       - The loop iteration count must be computable in constant time when
|         entering the 'loop' construct.
|
|     For a C++ range-based 'for' loop, the loop variable identified by the
|     above conditions is the internal iterator, such as a pointer, that
|     the compiler generates to iterate the range.  It is not the variable
|     declared by the 'for' loop.

(Notice: "computable in constant time" (which means: not computable only
by actually iterating the whole loop structure), and "computable [...]
when entering" (which, if I got that right, means: at run-time, not
necessarily already compile-time?).


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++)
  2020-10-30 14:09           ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c " Thomas Schwinge
@ 2020-10-30 14:16             ` Jakub Jelinek
  0 siblings, 0 replies; 45+ messages in thread
From: Jakub Jelinek @ 2020-10-30 14:16 UTC (permalink / raw)
  To: Thomas Schwinge
  Cc: Richard Biener, Frederik Harwath, Tom de Vries, gcc-patches,
	Joseph S. Myers, Jason Merrill, Feng Xue OS

On Fri, Oct 30, 2020 at 03:09:31PM +0100, Thomas Schwinge wrote:
> On 2020-04-03T13:34:18+0200, Jakub Jelinek <jakub@redhat.com> wrote:
> > Dunno about OpenACC, but OpenMP loops guarantee finiteness, as the number of
> > iterations must be computable before the loop and must fit into the type in
> > which that count is computed without overflows.
> 
> Specifically, is that computable at run-time or compile-time?

At run-time.  OpenMP certainly doesn't disallow loops with non-constant
steps or lower/upper bound expressions (ok, there is one exception, for
!= loop condition the step must be a constant expression, the step has to
be just 1 or -1).

	Jakub


^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2020-10-30 14:17 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-17  4:17 [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713) Feng Xue OS
2019-05-17 16:47 ` Jeff Law
2019-05-17 18:50   ` Richard Biener
2019-05-18 14:00     ` Marc Glisse
2019-05-20  7:50       ` Richard Biener
2019-05-20  8:27         ` Feng Xue OS
2019-05-20  9:19           ` Richard Biener
2019-05-20  9:48             ` Feng Xue OS
2019-05-20 11:54               ` Richard Biener
2019-05-20 14:00                 ` Feng Xue OS
2019-05-20 14:04                   ` Richard Biener
2019-05-20 14:51                     ` Feng Xue OS
2019-05-21 10:12                       ` Richard Biener
2019-05-21 14:24                         ` Richard Biener
2019-05-22 13:44                           ` Michael Matz
2019-05-24 16:02                             ` [PATCH V3] " Feng Xue OS
2019-05-24  9:15                           ` [PATCH V2] " Feng Xue OS
2019-05-29 11:16                             ` Richard Biener
2019-06-04  6:49                               ` [PATCH V4] " Feng Xue OS
2019-06-04  8:24                                 ` Marc Glisse
2019-06-04 15:16                                   ` [PATCH V5] " Feng Xue OS
2019-06-04 15:24                                     ` [PATCH V6] " Feng Xue OS
2019-06-05 11:05                                       ` Richard Biener
2019-06-06 10:00                                         ` [PATCH V7] " Feng Xue OS
2019-06-11  2:40                                           ` [PATCH V8] " Feng Xue OS
2019-06-12  9:43                                             ` Richard Biener
2019-06-15 12:05                                               ` [committed][nvptx, libgomp] Update pr85381-{2,4}.c test-cases Tom de Vries
2019-05-20 13:04         ` [PATCH] Remove empty loop with assumed finiteness (PR tree-optimization/89713) Marc Glisse
2019-05-20 13:26           ` Richard Biener
2019-05-20 14:49             ` Michael Matz
2019-05-21  8:06               ` Marc Glisse
2020-04-01 13:36 ` [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++ Richard Biener
2020-04-01 13:47   ` Jakub Jelinek
2020-04-01 13:52     ` Richard Biener
2020-04-01 15:56       ` Jan Hubicka
2020-04-01 16:59         ` Richard Biener
2020-04-01 19:15   ` Jason Merrill
2020-04-02  9:12     ` Richard Biener
2020-04-02  9:17       ` Jakub Jelinek
2020-04-02  9:41         ` Richard Biener
2020-04-03  8:29       ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c test-cases" [PR89713, PR94392] (was: [PATCH][RFC] c/94392 - only enable -ffinite-loops for C++) Thomas Schwinge
2020-04-03  9:36         ` Revert "[nvptx, libgomp] Update pr85381-{2,4}.c " Richard Biener
2020-04-03 10:34           ` Jakub Jelinek
2020-10-30 14:09           ` Revert "[nvptx, libgomp] Update pr85381-{2, 4}.c " Thomas Schwinge
2020-10-30 14:16             ` Revert "[nvptx, libgomp] Update pr85381-{2,4}.c " Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).