public inbox for gcc-patches@gcc.gnu.org
From: Yuri Rumyantsev <ysrumyan@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Igor Zamyatin <izamyatin@gmail.com>
Subject: Re: [PATCH] Unswitching outer loops.
Date: Wed, 07 Oct 2015 15:26:00 -0000
Message-ID: <CAEoMCqT1ZQb3sdts0O-+Yr8O8vh032c27b2KkPGBB+YdH7BvmA@mail.gmail.com>
In-Reply-To: <CAEoMCqSqfYyGS7acvna5z=+Jo5CUskj_qGEt3K6PC9=NuXMhtA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 22457 bytes --]

Richard,

I noticed that the 'gimple' type was changed, so I am sending you an updated patch.

Thanks.
Yuri.

2015-10-07 12:53 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
> Richard,
>
> I've fixed the addition of the virtual phi argument and added a check for irreducible basic blocks.
> New patch is attached.
>
> Bootstrap and regression testing showed no new failures.
>
> ChangeLog:
> 2015-10-07  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h" and
> "cfghooks.h", add prototypes for introduced new functions.
> (tree_ssa_unswitch_loops): Use from innermost loop iterator, move all
> checks on ability of loop unswitching to tree_unswitch_single_loop;
> invoke tree_unswitch_single_loop or tree_unswitch_outer_loop depending
> on innermost loop check.
> (tree_unswitch_single_loop): Add all required checks on ability of
> loop unswitching under zero recursive level guard.
> (tree_unswitch_outer_loop): New function.
> (find_loop_guard): Likewise.
> (empty_bb_without_guard_p): Likewise.
> (used_outside_loop_p): Likewise.
> (get_vop_from_header): Likewise.
> (hoist_guard): Likewise.
> (check_exit_phi): Likewise.
>
>    gcc/testsuite/ChangeLog:
> * gcc.dg/loop-unswitch-2.c: New test.
> * gcc.dg/loop-unswitch-3.c: Likewise.
> * gcc.dg/loop-unswitch-4.c: Likewise.
>
>
> 2015-10-06 15:21 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Oct 6, 2015 at 1:41 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> Here is an updated patch that addresses almost all of your remarks:
>>> 1. Use ordinary get_loop_body.
>>> 2. Delete useless asserts.
>>> 3. Use check on iterated loop instead of finite_loop_p.
>>> 4. Do not update CFG by adjusting the CONDs condition to always true/false.
>>> 5. Add a couple of tests.
>>
>> +  /* Add NEW_EDGE argument for all phi in post-header block.  */
>> +  bb = exit->dest;
>> +  for (gphi_iterator gsi = gsi_start_phis (bb);
>> +       !gsi_end_p (gsi); gsi_next (&gsi))
>> +    {
>> +      gphi *phi = gsi.phi ();
>> +      /* edge_iterator ei; */
>> +      tree arg;
>> +      if (virtual_operand_p (gimple_phi_result (phi)))
>> +       {
>> +         arg = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
>> +         add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
>>
>> now I know what confused me - here you are looking at a loop exit PHI
>> but querying with the preheader edge index.  I think you need to walk
>> the loop header PHIs to find the PHI for the virtual operand and use that
>> to get the PHI arg from?
>>
>> The side-effect / used-outside code is still the same.  What matters
>> is side-effects outside of the loop-header protected code region, not
>> blocks excluding the inner loop.  Say,
>>
>>   for (;;)
>>     {
>>       if (invariant-guard)
>>         {
>>            printf ("Blah");
>>            for (;;)
>>              ;
>>         }
>>     }
>>
>> would still be ok to unswitch.  So instead of
>>
>> +      if (body[i]->loop_father != loop)
>> +       continue;
>>
>> it would be
>>
>>        if (dominated_by_p (CDI_DOMINATORS, body[i], header)
>>            && !dominated_by_p (CDI_DOMINATORS, body[i], fe->dest))
>>
>> with the obvious improvement to the patch to not only consider header checks
>> in the outer loop header but in the pre-header block of the inner loop.
>>
>> And I still think you should walk the exit PHIs args to see whether they
>> are defined in the non-guarded region of the outer loop instead of walking
>> all uses of all defs.
>>
>> Note that I think you miss endless loops as side-effects if that endless
>> loop occurs through an irreducible region (thus not reflected in the
>> loop tree).  Thus you should reject BB_IRREDUCIBLE_LOOP blocks
>> in the non-guarded region as well.
>>
>> It seems to me that protecting adjacent loops with a single guard is
>> also eligible for hoisting thus the restriction on loop->inner->next
>> should become a restriction on no loops (or irreducible regions)
>> in the non-guarded region.
>>
>> Most things can be improved as followup, but at least the
>> virtual PHI arg thing needs to be sorted out.
>>
>> Thanks,
>> Richard.
>>
>>
>>> ChangeLog:
>>> 2015-10-06  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>
>>> * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h" and
>>> "cfghooks.h", add prototypes for introduced new functions.
>>> (tree_ssa_unswitch_loops): Use from innermost loop iterator, move all
>>> checks on ability of loop unswitching to tree_unswitch_single_loop;
>>> invoke tree_unswitch_single_loop or tree_unswitch_outer_loop depending
>>> on innermost loop check.
>>> (tree_unswitch_single_loop): Add all required checks on ability of
>>> loop unswitching under zero recursive level guard.
>>> (tree_unswitch_outer_loop): New function.
>>> (find_loop_guard): Likewise.
>>> (empty_bb_without_guard_p): Likewise.
>>> (used_outside_loop_p): Likewise.
>>> (hoist_guard): Likewise.
>>> (check_exit_phi): Likewise.
>>>
>>>    gcc/testsuite/ChangeLog:
>>> * gcc.dg/loop-unswitch-2.c: New test.
>>> * gcc.dg/loop-unswitch-3.c: Likewise.
>>> * gcc.dg/loop-unswitch-4.c: Likewise.
>>>
>>> 2015-10-06 10:59 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Mon, Oct 5, 2015 at 3:13 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Thanks Richard.
>>>>> I'd like to respond to your last comment about using the exit edge
>>>>> argument for the edge that skips the loop.
>>>>> Let's consider the following test case:
>>>>>
>>>>> #include <stdlib.h>
>>>>> #define N 32
>>>>> float *foo(int ustride, int size, float *src)
>>>>> {
>>>>>    float *buffer, *p;
>>>>>    int i, k;
>>>>>
>>>>>    if (!src)
>>>>>     return NULL;
>>>>>
>>>>>    buffer = (float *) malloc(N * size * sizeof(float));
>>>>>
>>>>>    if(buffer)
>>>>>       for(i=0, p=buffer; i<N; i++, src+=ustride)
>>>>>         for(k=0; k<size; k++)
>>>>>           *p++ = src[k];
>>>>>
>>>>>    return buffer;
>>>>> }
>>>>>
>>>>> Before adding the new edge, the post-header bb contains:
>>>>>   <bb 9>:
>>>>>   # _6 = PHI <0B(8), buffer_20(16)>
>>>>>   return _6;
>>>>>
>>>>> It is clear that we must preserve the function's semantics and transform it to
>>>>> _6 = PHI <0B(12), buffer_19(9), buffer_19(4)>
>>>>
>>>> Ah, yeah.  I was confusing the loop exit of the inner vs. the outer loop.
>>>>
>>>> Richard.
>>>>
>>>>>
>>>>> 2015-10-05 13:57 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Wed, Sep 30, 2015 at 12:46 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> I re-designed outer loop unswitching using the basic idea of the 23855
>>>>>>> patch: hoist the invariant guard if the loop is empty without the guard.
>>>>>>> Note that this was added to the loop unswitching pass with simple
>>>>>>> modifications, such as using another loop iterator.
>>>>>>>
>>>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>>> What is your opinion?
>>>>>>
>>>>>> Overall it looks good.  Some comments below - a few more testcases would
>>>>>> be nice as well.
>>>>>>
>>>>>> +  /* Loop must not be infinite.  */
>>>>>> +  if (!finite_loop_p (loop))
>>>>>> +    return false;
>>>>>>
>>>>>> why's that?
>>>>>>
>>>>>> +  body = get_loop_body_in_dom_order (loop);
>>>>>> +  for (i = 0; i < loop->num_nodes; i++)
>>>>>> +    {
>>>>>> +      if (body[i]->loop_father != loop)
>>>>>> +       continue;
>>>>>> +      if (!empty_bb_without_guard_p (loop, body[i]))
>>>>>>
>>>>>> I wonder if there is a better way to iterate over the interesting
>>>>>> blocks and PHIs
>>>>>> we need to check for side-effects (and thus we maybe can avoid gathering
>>>>>> the loop in DOM order).
>>>>>>
>>>>>> +      FOR_EACH_SSA_TREE_OPERAND (name, stmt, op_iter, SSA_OP_DEF)
>>>>>> +       {
>>>>>> +         if (may_be_used_outside
>>>>>>
>>>>>> may_be_used_outside can be hoisted above the loop.  I wonder if we can take
>>>>>> advantage of loop-closed SSA form here (and the fact we have a single exit
>>>>>> from the loop).  Iterating over exit dest PHIs and determining whether the
>>>>>> exit edge DEF is inside the loop part it may not be should be enough.
>>>>>>
>>>>>> +  gcc_assert (single_succ_p (pre_header));
>>>>>>
>>>>>> that should be always true.
>>>>>>
>>>>>> +  gsi_remove (&gsi, false);
>>>>>> +  bb = guard->dest;
>>>>>> +  remove_edge (guard);
>>>>>> +  /* Update dominance for destination of GUARD.  */
>>>>>> +  if (EDGE_COUNT (bb->preds) == 0)
>>>>>> +    {
>>>>>> +      basic_block s_bb;
>>>>>> +      gcc_assert (single_succ_p (bb));
>>>>>> +      s_bb = single_succ (bb);
>>>>>> +      delete_basic_block (bb);
>>>>>> +      if (single_pred_p (s_bb))
>>>>>> +       set_immediate_dominator (CDI_DOMINATORS, s_bb, single_pred (s_bb));
>>>>>>
>>>>>> all this massaging should be simplified by leaving it to CFG cleanup by
>>>>>> simply adjusting the CONDs condition to always true/false.  There is
>>>>>> gimple_cond_make_{true,false} () for this (would be nice to have a variant
>>>>>> taking a bool).
>>>>>>
>>>>>> +  new_edge = make_edge (pre_header, exit->dest, flags);
>>>>>> +  if (fix_dom_of_exit)
>>>>>> +    set_immediate_dominator (CDI_DOMINATORS, exit->dest, pre_header);
>>>>>> +  update_stmt (gsi_stmt (gsi));
>>>>>>
>>>>>> the update_stmt should be not necessary, it's done by gsi_insert_after already.
>>>>>>
>>>>>> +  /* Add NEW_EDGE argument for all phi in post-header block.  */
>>>>>> +  bb = exit->dest;
>>>>>> +  for (gphi_iterator gsi = gsi_start_phis (bb);
>>>>>> +       !gsi_end_p (gsi); gsi_next (&gsi))
>>>>>> +    {
>>>>>> +      gphi *phi = gsi.phi ();
>>>>>> +      /* edge_iterator ei; */
>>>>>> +      tree arg;
>>>>>> +      if (virtual_operand_p (gimple_phi_result (phi)))
>>>>>> +       {
>>>>>> +         arg = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
>>>>>> +         add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
>>>>>> +       }
>>>>>> +      else
>>>>>> +       {
>>>>>> +         /* Use exit edge argument.  */
>>>>>> +         arg = PHI_ARG_DEF_FROM_EDGE (phi, exit);
>>>>>> +         add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
>>>>>>
>>>>>> Hum.  How is it ok to use the exit edge argument for the edge that skips
>>>>>> the loop?  Why can't you always use the pre-header edge value?
>>>>>> That is, if we have
>>>>>>
>>>>>>  for(i=0;i<m;++i)
>>>>>>    {
>>>>>>      if (n > 0)
>>>>>>        {
>>>>>>          for (;;)
>>>>>>            {
>>>>>>            }
>>>>>>        }
>>>>>>    }
>>>>>>   ... = i;
>>>>>>
>>>>>> then i is used after the loop and the correct value to use if
>>>>>> n > 0 is false is '0'.  Maybe this way we can also relax
>>>>>> what check_exit_phi does?  IMHO the only restriction is
>>>>>> if sth defined inside the loop before the header check for
>>>>>> the inner loop is used after the loop.
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> ChangeLog:
>>>>>>> 2015-09-30  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>
>>>>>>> * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h" and
>>>>>>> "cfghooks.h", add prototypes for introduced new functions.
>>>>>>> (tree_ssa_unswitch_loops): Use from innermost loop iterator, move all
>>>>>>> checks on ability of loop unswitching to tree_unswitch_single_loop;
>>>>>>> invoke tree_unswitch_single_loop or tree_unswitch_outer_loop depending
>>>>>>> on innermost loop check.
>>>>>>> (tree_unswitch_single_loop): Add all required checks on ability of
>>>>>>> loop unswitching under zero recursive level guard.
>>>>>>> (tree_unswitch_outer_loop): New function.
>>>>>>> (find_loop_guard): Likewise.
>>>>>>> (empty_bb_without_guard_p): Likewise.
>>>>>>> (used_outside_loop_p): Likewise.
>>>>>>> (hoist_guard): Likewise.
>>>>>>> (check_exit_phi): Likewise.
>>>>>>>
>>>>>>>    gcc/testsuite/ChangeLog:
>>>>>>> * gcc.dg/loop-unswitch-2.c: New test.
>>>>>>>
>>>>>>> 2015-09-16 11:26 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>> Yeah, as said, the patch wasn't fully ready and it also felt odd to do
>>>>>>>> this hoisting in loop header copying.  Integrating it
>>>>>>>> with LIM would be a better fit eventually.
>>>>>>>>
>>>>>>>> Note that we did agree to go forward with your original patch just
>>>>>>>> making it more "generically" perform outer loop
>>>>>>>> unswitching.  Did you explore that idea further?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Sep 15, 2015 at 6:00 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>> Thanks Richard.
>>>>>>>>>
>>>>>>>>> I found one more issue that could not be fixed simply. In 23855 you
>>>>>>>>> consider the following test-case:
>>>>>>>>> void foo(int *ie, int *je, double *x)
>>>>>>>>> {
>>>>>>>>>   int i, j;
>>>>>>>>>   for (j=0; j<*je; ++j)
>>>>>>>>>     for (i=0; i<*ie; ++i)
>>>>>>>>>       x[i+j] = 0.0;
>>>>>>>>> }
>>>>>>>>> and proposed hoisting the check on *ie out of the loop. That requires
>>>>>>>>> memref alias analysis, since in general x and ie can alias (if their
>>>>>>>>> types are compatible, e.g. int *ie and int *x). Such analysis is
>>>>>>>>> performed by the pre or lim passes. Without it we cannot hoist the
>>>>>>>>> non-zero test on *ie out of the loop using the 23855 patch.
>>>>>>>>>  The second concern is that the proposed copy-header algorithm changes
>>>>>>>>> the loop structure significantly, and the result is not accepted by the
>>>>>>>>> vectorizer since the latch is not empty (such a transformation assumes
>>>>>>>>> loop peeling for one iteration). So I propose implementing simple guard
>>>>>>>>> hoisting without copying header and tail blocks (if that is possible).
>>>>>>>>>
>>>>>>>>> I would appreciate any advice or help, since without such
>>>>>>>>> hoisting we are not able to perform outer loop vectorization for
>>>>>>>>> an important benchmark.
>>>>>>>>>
>>>>>>>>> 2015-09-15 14:22 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>> On Thu, Sep 3, 2015 at 6:32 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>> Hi Richard,
>>>>>>>>>>>
>>>>>>>>>>> I started learning, tuning and debugging the patch proposed in 23855 and
>>>>>>>>>>> discovered that it does not work properly.
>>>>>>>>>>> So I wonder: was that patch tested, and should it work?
>>>>>>>>>>
>>>>>>>>>> I don't remember, but as it wasn't committed it certainly wasn't ready.
>>>>>>>>>>
>>>>>>>>>>> Should it accept the following loop nest for hoisting?
>>>>>>>>>>>   for (i=0; i<n; i++) {
>>>>>>>>>>>     s = 0;
>>>>>>>>>>>     for (j=0; j<m; j++)
>>>>>>>>>>>       s += a[i] * b[j];
>>>>>>>>>>>     c[i] = s;
>>>>>>>>>>>   }
>>>>>>>>>>> Note that the i-loop will not be empty if m is equal to 0.
>>>>>>>>>>
>>>>>>>>>> if m is equal to 0 then we still have the c[i] = s store, no?  Of course
>>>>>>>>>> we could unswitch the outer loop on m == 0 but simple hoisting wouldn't work.
>>>>>>>>>>
>>>>>>>>>> Richard.
>>>>>>>>>>
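
To make the point above concrete, here is a small source-level sketch.
It is not taken from either patch, and the function names are made up
for illustration. Assuming m is loop invariant, simply hoisting the
m > 0 guard out of the i-loop drops the c[i] = s store when m <= 0,
while unswitching keeps it by duplicating the outer loop:

```c
#include <assert.h>
#include <stddef.h>

/* Original loop nest from the discussion.  */
void
rows_original (float *c, const float *a, const float *b, int n, int m)
{
  assert (c != NULL && a != NULL && b != NULL);
  for (int i = 0; i < n; i++)
    {
      float s = 0;
      for (int j = 0; j < m; j++)
        s += a[i] * b[j];
      c[i] = s;
    }
}

/* Unswitched on the invariant test m > 0: the outer loop is duplicated,
   and the m <= 0 copy still performs the store.  */
void
rows_unswitched (float *c, const float *a, const float *b, int n, int m)
{
  assert (c != NULL && a != NULL && b != NULL);
  if (m > 0)
    for (int i = 0; i < n; i++)
      {
        float s = 0;
        for (int j = 0; j < m; j++)
          s += a[i] * b[j];
        c[i] = s;
      }
  else
    for (int i = 0; i < n; i++)
      c[i] = 0;
}

/* Naive guard hoisting: NOT equivalent to the original, because the
   store to c[i] is skipped whenever m <= 0.  */
void
rows_hoisted_wrong (float *c, const float *a, const float *b, int n, int m)
{
  assert (c != NULL && a != NULL && b != NULL);
  if (m > 0)
    for (int i = 0; i < n; i++)
      {
        float s = 0;
        for (int j = 0; j < m; j++)
          s += a[i] * b[j];
        c[i] = s;
      }
}
```

Calling all three with m == 0 on an array pre-filled with a non-zero
value shows the difference: the first two clear c, the last one leaves
it untouched.
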
>>>>>>>>>>> 2015-08-03 10:27 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>>>> On Fri, Jul 31, 2015 at 1:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>>>> Hi Richard,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I studied your updated patch for 23825 and it is more general
>>>>>>>>>>>>> than mine.
>>>>>>>>>>>>> I'd like to propose a compromise: let's apply my patch only
>>>>>>>>>>>>> to force-vectorize outer loops, to allow outer-loop
>>>>>>>>>>>>> vectorization.
>>>>>>>>>>>>
>>>>>>>>>>>> I don't see why we should special-case that if the approach in 23825
>>>>>>>>>>>> is sensible.
>>>>>>>>>>>>
>>>>>>>>>>>>> Note that your approach will not hoist invariant
>>>>>>>>>>>>> guards if the loop contains anything other than the inner loop, i.e. it
>>>>>>>>>>>>> won't be empty for the taken branch.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, it does not perform unswitching but guard hoisting.  Note that this
>>>>>>>>>>>> is originally Zdenek Dvorak's patch.
>>>>>>>>>>>>
>>>>>>>>>>>>> I would also like to answer your last question: CFG cleanup is
>>>>>>>>>>>>> invoked to delete single-argument phi nodes from the tail
>>>>>>>>>>>>> block through substitution, since such phis prevent outer-loop
>>>>>>>>>>>>> vectorization. But it is clear that this transformation can be done
>>>>>>>>>>>>> by another pass.
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm, I wonder why the copy_prop pass after unswitching does not
>>>>>>>>>>>> get rid of them?
>>>>>>>>>>>>
>>>>>>>>>>>>> What is your opinion?
>>>>>>>>>>>>
>>>>>>>>>>>> My opinion is that if we want to enhance unswitching to catch this
>>>>>>>>>>>> (or similar) cases then we should make it a lot more general than
>>>>>>>>>>>> your pattern-matching approach.  I see nothing that should prevent
>>>>>>>>>>>> us from considering unswitching non-innermost loops in general.
>>>>>>>>>>>> It should be only a cost consideration to not do non-innermost loop
>>>>>>>>>>>> unswitching (in addition to maybe a --param specifying the maximum
>>>>>>>>>>>> depth of a loop nest to unswitch).
>>>>>>>>>>>>
>>>>>>>>>>>> So my first thought when seeing your patch still holds - the patch
>>>>>>>>>>>> looks very much too specific.
>>>>>>>>>>>>
>>>>>>>>>>>> Richard.
>>>>>>>>>>>>
>>>>>>>>>>>>> Yuri.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2015-07-28 13:50 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>>>>>> On Thu, Jul 23, 2015 at 4:45 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>>>>>> Hi Richard,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I checked that both test-cases from 23855 are successfully unswitched
>>>>>>>>>>>>>>> by the proposed patch. I understand that it does not catch deeper loop
>>>>>>>>>>>>>>> nests such as
>>>>>>>>>>>>>>>    for (i=0; i<10; i++)
>>>>>>>>>>>>>>>      for (j=0;j<n;j++)
>>>>>>>>>>>>>>>         for (k=0;k<20;k++)
>>>>>>>>>>>>>>>   ...
>>>>>>>>>>>>>>> but duplication of the middle loop does not look reasonable.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here is the dump for your second test-case:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> void foo(int *ie, int *je, double *x)
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>   int i, j;
>>>>>>>>>>>>>>>   for (j=0; j<*je; ++j)
>>>>>>>>>>>>>>>     for (i=0; i<*ie; ++i)
>>>>>>>>>>>>>>>       x[i+j] = 0.0;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> grep -i unswitch t6.c.119t.unswitch
>>>>>>>>>>>>>>> ;; Unswitching outer loop
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was asking why go with a limited approach when a patch (in
>>>>>>>>>>>>>> unknown state...) is available that does it more generally?  Also
>>>>>>>>>>>>>> unswitching is quite expensive compared to "moving" the invariant
>>>>>>>>>>>>>> condition.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In your patch:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +  if (!nloop->force_vectorize)
>>>>>>>>>>>>>> +    nloop->force_vectorize = true;
>>>>>>>>>>>>>> +  if (loop->safelen != 0)
>>>>>>>>>>>>>> +    nloop->safelen = loop->safelen;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I see no guard on force_vectorize so = true looks bogus here.  Please just use
>>>>>>>>>>>>>> copy_loop_info.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +  if (integer_nonzerop (cond_new))
>>>>>>>>>>>>>> +    gimple_cond_set_condition_from_tree (cond_stmt, boolean_true_node);
>>>>>>>>>>>>>> +  else if (integer_zerop (cond_new))
>>>>>>>>>>>>>> +    gimple_cond_set_condition_from_tree (cond_stmt, boolean_false_node);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> gimple_cond_make_true/false (cond_stmt);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> btw, seems odd that we have to recompute which loop is the true / false variant
>>>>>>>>>>>>>> when we just fed a guard condition to loop_version.  Can't we statically
>>>>>>>>>>>>>> determine whether loop or nloop has the in-loop condition true or false?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +  /* Clean-up cfg to remove useless one-argument phi in exit block of
>>>>>>>>>>>>>> +     outer-loop.  */
>>>>>>>>>>>>>> +  cleanup_tree_cfg ();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I know unswitching is already O(number-of-unswitched-loops * size-of-function)
>>>>>>>>>>>>>> because it updates SSA form after each individual unswitching (and it does that
>>>>>>>>>>>>>> because it invokes itself recursively on unswitched loops).  But do you really
>>>>>>>>>>>>>> need to invoke CFG cleanup here?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yuri.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2015-07-14 14:06 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>>>>>>>> On Fri, Jul 10, 2015 at 12:02 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Here is a simple transformation which tries to hoist a check for a
>>>>>>>>>>>>>>>>> zero trip count of the inner loop out of the outer loop. It is a very
>>>>>>>>>>>>>>>>> restricted transformation since it accepts only outer loops with a very
>>>>>>>>>>>>>>>>> simple cfg, for example:
>>>>>>>>>>>>>>>>>     acc = 0;
>>>>>>>>>>>>>>>>>    for (i = 1; i <= m; i++) {
>>>>>>>>>>>>>>>>>       for (j = 0; j < n; j++)
>>>>>>>>>>>>>>>>>          if (l[j] == i) { v[j] = acc; acc++; };
>>>>>>>>>>>>>>>>>       acc <<= 1;
>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Note that a degenerate outer loop (without the inner loop) will be
>>>>>>>>>>>>>>>>> completely deleted as dead code.
>>>>>>>>>>>>>>>>> The main goal of this transformation is to convert the outer loop to a
>>>>>>>>>>>>>>>>> form accepted by outer-loop vectorization (such a test-case is also
>>>>>>>>>>>>>>>>> included in the patch).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think this is
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> as well.  It has a patch adding a invariant loop guard hoisting
>>>>>>>>>>>>>>>> phase to loop-header copying.  Yeah, it needs updating to
>>>>>>>>>>>>>>>> trunk again I suppose.  It's always non-stage1 when I come
>>>>>>>>>>>>>>>> back to that patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Your patch seems to be very specific and only handles outer
>>>>>>>>>>>>>>>> loops of innermost loops.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ChangeLog:
>>>>>>>>>>>>>>>>> 2015-07-10  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> * tree-ssa-loop-unswitch.c: Include "tree-cfgcleanup.h" and
>>>>>>>>>>>>>>>>> "gimple-iterator.h", add prototype for tree_unswitch_outer_loop.
>>>>>>>>>>>>>>>>> (tree_ssa_unswitch_loops): Add invoke of tree_unswitch_outer_loop.
>>>>>>>>>>>>>>>>> (tree_unswitch_outer_loop): New function.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>>>>>> * gcc.dg/tree-ssa/unswitch-outer-loop-1.c: New test.
>>>>>>>>>>>>>>>>> * gcc.dg/vect/vect-outer-simd-3.c: New test.
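
For readers skimming the thread, the effect of the guard hoisting
discussed above can be sketched at the source level. This is only an
illustration modelled on the loop-unswitch-3.c test below, not what the
pass emits; the function names and the index-based addressing are
assumptions made for the example. The inner loop's entry check
(size > 0) is invariant in the outer loop, and when it fails the outer
loop does nothing observable, so the check can be tested once up front:

```c
#include <assert.h>
#include <stddef.h>

/* Original shape: the zero-trip check of the inner loop is evaluated on
   every iteration of the outer loop.  Returns the number of elements
   written to DST.  */
size_t
copy_rows_original (float *dst, const float *src, int rows, int size,
                    int stride)
{
  size_t n = 0;
  assert (dst != NULL && src != NULL);
  for (int i = 0; i < rows; i++)
    for (int k = 0; k < size; k++)
      dst[n++] = src[i * stride + k];
  return n;
}

/* Hoisted shape: the invariant guard is tested once before the outer
   loop, so the inner loop can assume at least one iteration.  */
size_t
copy_rows_hoisted (float *dst, const float *src, int rows, int size,
                   int stride)
{
  size_t n = 0;
  assert (dst != NULL && src != NULL);
  if (size > 0)
    for (int i = 0; i < rows; i++)
      {
        int k = 0;
        do
          dst[n++] = src[i * stride + k];
        while (++k < size);
      }
  return n;
}
```

Both functions write the same elements for any size; with size <= 0
neither writes anything, which is why no second copy of the outer loop
is needed in this case.
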

[-- Attachment #2: patch.fixed --]
[-- Type: application/octet-stream, Size: 17376 bytes --]

diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-2.c b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
new file mode 100644
index 0000000..5ebf608
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-2.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+
+void foo (float **a, float **b, float *c, int n, int m, int l)
+{
+  int i,j,k;
+  float s;
+  for (i=0; i<l; i++)
+    for (j=0; j<n; j++)
+      for (k=0; k<m; k++)
+	c[i] += a[i][k] * b[k][j];
+}
+
+/* { dg-final { scan-tree-dump-times "guard hoisted" 2 "unswitch" } } */
+
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-3.c b/gcc/testsuite/gcc.dg/loop-unswitch-3.c
new file mode 100644
index 0000000..e355286
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-3.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */
+
+#include <stdlib.h>
+#define N 32
+float *foo(int ustride, int size, float *src)
+{
+   float *buffer, *p;
+   int i, k;
+
+   if (!src)
+    return NULL;
+
+   buffer = (float *) malloc(N * size * sizeof(float));
+
+   if(buffer)
+      for(i=0, p=buffer; i<N; i++, src+=ustride)
+	for(k=0; k<size; k++)
+	  *p++ = src[k];
+
+   return buffer;
+}
+
+/* { dg-final { scan-tree-dump-times "guard hoisted" 1 "unswitch" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-4.c b/gcc/testsuite/gcc.dg/loop-unswitch-4.c
new file mode 100644
index 0000000..320a1cd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-4.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -funswitch-loops" } */
+
+#include <stdlib.h>
+__attribute__ ((noinline))
+void foo (float **a, float **b, float *c, int n, int m, int l)
+{
+  int i,j,k;
+  float s;
+  for (i=0; i<l; i++)
+    for (j=0; j<n; j++)
+      for (k=0; k<m; k++)
+	c[i] += a[i][k] * b[k][j];
+}
+
+int main()
+{
+  const int N = 32;
+  float **ar1, **ar2;
+  float *res;
+  int i, j;
+  ar1 = (float **)malloc (N * sizeof (float*));
+  ar2 = (float **)malloc (N * sizeof (float*));
+  res = (float *)malloc( N * sizeof (float));
+  for (i=0; i<N; i++)
+    {
+      ar1[i] = (float*)malloc (N * sizeof (float));
+      ar2[i] = (float*)malloc (N * sizeof (float));
+    }
+  for (i=0; i<N; i++)
+    {
+      for (j=0; j<N; j++)
+	{
+	  ar1[i][j] = 2.0f;
+	  ar2[i][j] = 1.5f;
+	}
+      res[i] = 0.0f;
+    }
+  foo (ar1, ar2, res, N, N, N);
+  for (i=0; i<N; i++)
+    if (res[i] != 3072.0f)
+      abort();
+  for (i=0; i<N; i++)
+    res[i] = 0.0f;
+  foo (ar1, ar2, res, N, 0, N);
+  for (i=0; i<N; i++)
+    if (res[i] != 0.0f)
+      abort();
+ 
+  return 0;
+}
+
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 0b54612..4328d6a 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -39,6 +39,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "tree-pass.h"
 #include "tree-inline.h"
+#include "gimple-iterator.h"
+#include "cfghooks.h"
 
 /* This file implements the loop unswitching, i.e. transformation of loops like
 
@@ -79,6 +81,13 @@ along with GCC; see the file COPYING3.  If not see
 static struct loop *tree_unswitch_loop (struct loop *, basic_block, tree);
 static bool tree_unswitch_single_loop (struct loop *, int);
 static tree tree_may_unswitch_on (basic_block, struct loop *);
+static bool tree_unswitch_outer_loop (struct loop *);
+static edge find_loop_guard (struct loop *);
+static bool empty_bb_without_guard_p (struct loop *, basic_block);
+static bool used_outside_loop_p (struct loop *, tree);
+static void hoist_guard (struct loop *, edge);
+static bool check_exit_phi (struct loop *);
+static tree get_vop_from_header (struct loop *);
 
 /* Main entry point.  Perform loop unswitching on all suitable loops.  */
 
@@ -87,42 +96,15 @@ tree_ssa_unswitch_loops (void)
 {
   struct loop *loop;
   bool changed = false;
-  HOST_WIDE_INT iterations;
 
-  /* Go through inner loops (only original ones).  */
-  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
+  /* Go through all loops starting from innermost.  */
+  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
     {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-        fprintf (dump_file, ";; Considering loop %d\n", loop->num);
-
-      /* Do not unswitch in cold regions. */
-      if (optimize_loop_for_size_p (loop))
-        {
-          if (dump_file && (dump_flags & TDF_DETAILS))
-            fprintf (dump_file, ";; Not unswitching cold loops\n");
-          continue;
-        }
-
-      /* The loop should not be too large, to limit code growth. */
-      if (tree_num_loop_insns (loop, &eni_size_weights)
-          > (unsigned) PARAM_VALUE (PARAM_MAX_UNSWITCH_INSNS))
-        {
-          if (dump_file && (dump_flags & TDF_DETAILS))
-            fprintf (dump_file, ";; Not unswitching, loop too big\n");
-          continue;
-        }
-
-      /* If the loop is not expected to iterate, there is no need
-	 for unswitching.  */
-      iterations = estimated_loop_iterations_int (loop);
-      if (iterations >= 0 && iterations <= 1)
-	{
-          if (dump_file && (dump_flags & TDF_DETAILS))
-            fprintf (dump_file, ";; Not unswitching, loop is not expected to iterate\n");
-          continue;
-	}
-
-      changed |= tree_unswitch_single_loop (loop, 0);
+      if (!loop->inner)
+	/* Unswitch innermost loop.  */
+	changed |= tree_unswitch_single_loop (loop, 0);
+      else
+	changed |= tree_unswitch_outer_loop (loop);
     }
 
   if (changed)
@@ -216,6 +198,39 @@ tree_unswitch_single_loop (struct loop *loop, int num)
   tree cond = NULL_TREE;
   gimple *stmt;
   bool changed = false;
+  HOST_WIDE_INT iterations;
+
+  /* Perform initial tests if unswitch is eligible.  */
+  if (num == 0)
+    {
+      /* Do not unswitch in cold regions. */
+      if (optimize_loop_for_size_p (loop))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, ";; Not unswitching cold loops\n");
+	  return false;
+	}
+
+      /* The loop should not be too large, to limit code growth. */
+      if (tree_num_loop_insns (loop, &eni_size_weights)
+	  > (unsigned) PARAM_VALUE (PARAM_MAX_UNSWITCH_INSNS))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, ";; Not unswitching, loop too big\n");
+	  return false;
+	}
+
+      /* If the loop is not expected to iterate, there is no need
+	 for unswitching.  */
+      iterations = estimated_loop_iterations_int (loop);
+      if (iterations >= 0 && iterations <= 1)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, ";; Not unswitching, loop is not expected"
+		     " to iterate\n");
+	  return false;
+	}
+    }
 
   i = 0;
   bbs = get_loop_body (loop);
@@ -403,6 +418,374 @@ tree_unswitch_loop (struct loop *loop,
 		       REG_BR_PROB_BASE - prob_true, false);
 }
 
+/* Unswitch outer loops by hoisting invariant guard on
+   inner loop without code duplication.  */
+static bool
+tree_unswitch_outer_loop (struct loop *loop)
+{
+  edge exit, guard;
+  HOST_WIDE_INT iterations;
+
+  gcc_assert (loop->inner);
+  if (loop->inner->next)
+    return false;
+  /* Accept loops with single exit only.  */
+  exit = single_exit (loop);
+  if (!exit)
+    return false;
+  /* Check that phi argument of exit edge is not defined inside loop.  */
+  if (!check_exit_phi (loop))
+    return false;
+  /* If the loop is not expected to iterate, there is no need
+      for unswitching.  */
+  iterations = estimated_loop_iterations_int (loop);
+  if (iterations >= 0 && iterations <= 1)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, ";; Not unswitching, loop is not expected"
+		 " to iterate\n");
+      return false;
+    }
+
+  guard = find_loop_guard (loop);
+  if (guard)
+    {
+      hoist_guard (loop, guard);
+      update_ssa (TODO_update_ssa);
+      return true;
+    }
+  return false;
+}
+
+/* Checks if the body of the LOOP is within an invariant guard.  If this
+   is the case, returns the edge that jumps over the real body of the loop,
+   otherwise returns NULL.  */
+
+static edge
+find_loop_guard (struct loop *loop)
+{
+  basic_block header = loop->header;
+  edge guard_edge, te, fe;
+  /* bitmap processed, known_invariants;*/
+  basic_block *body = NULL;
+  unsigned i;
+  tree use;
+  ssa_op_iter iter;
+
+  /* We check for the following situation:
+
+     while (1)
+       {
+	 [header]
+         loop_phi_nodes;
+	 something1;
+	 if (cond1)
+	   body;
+	 nvar = phi(orig, bvar) ... for all variables changed in body;
+	 [guard_end]
+	 something2;
+	 if (cond2)
+	   break;
+	 something3;
+       }
+
+     where:
+
+     1) cond1 is loop invariant
+     2) If cond1 is false, then the loop is essentially empty; i.e.,
+	a) nothing in something1, something2 and something3 has side
+	   effects
+	b) anything defined in something1, something2 and something3
+	   is not used outside of the loop.  */
+
+  while (single_succ_p (header))
+    header = single_succ (header);
+  if (!last_stmt (header)
+      || gimple_code (last_stmt (header)) != GIMPLE_COND)
+    return NULL;
+
+  extract_true_false_edges_from_block (header, &te, &fe);
+  if (!flow_bb_inside_loop_p (loop, te->dest)
+      || !flow_bb_inside_loop_p (loop, fe->dest))
+    return NULL;
+
+  if (just_once_each_iteration_p (loop, te->dest)
+      || (single_succ_p (te->dest)
+	  && just_once_each_iteration_p (loop, single_succ (te->dest))))
+    {
+      if (just_once_each_iteration_p (loop, fe->dest))
+	return NULL;
+      guard_edge = te;
+    }
+  else if (just_once_each_iteration_p (loop, fe->dest)
+	   || (single_succ_p (fe->dest)
+	       && just_once_each_iteration_p (loop, single_succ (fe->dest))))
+    guard_edge = fe;
+  else
+    return NULL;
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file,
+	     "Considering guard %d -> %d in loop %d\n",
+	     guard_edge->src->index, guard_edge->dest->index, loop->num);
+  /* Check that the condition operands have no definitions inside the loop,
+     since no basic block copying is performed.  */
+  FOR_EACH_SSA_TREE_OPERAND (use, last_stmt (header), iter, SSA_OP_USE)
+    {
+      gimple *def = SSA_NAME_DEF_STMT (use);
+      basic_block def_bb = gimple_bb (def);
+      if (def_bb
+          && flow_bb_inside_loop_p (loop, def_bb))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "  guard operands have definitions"
+				" inside loop\n");
+	  return NULL;
+	}
+    }
+
+  body = get_loop_body (loop);
+  for (i = 0; i < loop->num_nodes; i++)
+    {
+      basic_block bb = body[i];
+      if (bb->loop_father != loop)
+	continue;
+      if (bb->flags & BB_IRREDUCIBLE_LOOP)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "Block %d is marked as irreducible in loop\n",
+		      bb->index);
+	  guard_edge = NULL;
+	  goto end;
+	}
+      if (!empty_bb_without_guard_p (loop, bb))
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "  block %d has side effects\n", bb->index);
+	  guard_edge = NULL;
+	  goto end;
+	}
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "  suitable to hoist\n");
+end:
+  if (body)
+    free (body);
+  return guard_edge;
+}
+
+/* Returns true if
+   1) no statement in BB has side effects
+   2) assuming that the guard edge of LOOP is always taken, all definitions
+      in BB are not used outside of the loop.  */
+
+static bool
+empty_bb_without_guard_p (struct loop *loop, basic_block bb)
+{
+  basic_block exit_bb = single_exit (loop)->src;
+  bool may_be_used_outside = (bb == exit_bb
+			      || !dominated_by_p (CDI_DOMINATORS, bb, exit_bb));
+  tree name;
+  ssa_op_iter op_iter;
+
+  /* Phi nodes do not have side effects, but their results might be used
+     outside of the loop.  */
+  if (may_be_used_outside)
+    {
+      for (gphi_iterator gsi = gsi_start_phis (bb);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gphi *phi = gsi.phi ();
+	  name = PHI_RESULT (phi);
+	  if (virtual_operand_p (name))
+	    continue;
+
+	  if (used_outside_loop_p (loop, name))
+	    return false;
+	}
+    }
+
+  for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+       !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (gimple_has_side_effects (stmt))
+	return false;
+
+      if (gimple_vdef (stmt))
+	return false;
+
+      FOR_EACH_SSA_TREE_OPERAND (name, stmt, op_iter, SSA_OP_DEF)
+	{
+	  if (may_be_used_outside
+	      && used_outside_loop_p (loop, name))
+	    return false;
+	}
+    }
+  return true;
+}
+
+/* Return true if NAME is used outside of LOOP.  */
+
+static bool
+used_outside_loop_p (struct loop *loop, tree name)
+{
+  imm_use_iterator it;
+  use_operand_p use;
+
+  FOR_EACH_IMM_USE_FAST (use, it, name)
+    {
+      gimple *stmt = USE_STMT (use);
+      if (!flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
+	return true;
+    }
+
+  return false;
+}
+
+/* Return argument for loop preheader edge in header virtual phi if any.  */
+
+static tree
+get_vop_from_header (struct loop *loop)
+{
+  for (gphi_iterator gsi = gsi_start_phis (loop->header);
+       !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gphi *phi = gsi.phi ();
+      if (!virtual_operand_p (gimple_phi_result (phi)))
+	continue;
+      return PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
+    }
+  return NULL_TREE;
+}
+
+/* Move the check of GUARD outside of LOOP.  */
+
+static void
+hoist_guard (struct loop *loop, edge guard)
+{
+  edge exit = single_exit (loop);
+  edge preh = loop_preheader_edge (loop);
+  basic_block pre_header = preh->src;
+  basic_block bb;
+  edge te, fe, e, new_edge;
+  gimple *stmt;
+  basic_block guard_bb = guard->src;
+  gimple_stmt_iterator gsi;
+  int flags = 0;
+  bool fix_dom_of_exit;
+  gcond *cond_stmt, *new_cond_stmt;
+
+  bb = get_immediate_dominator (CDI_DOMINATORS, exit->dest);
+  fix_dom_of_exit = flow_bb_inside_loop_p (loop, bb);
+  gsi = gsi_last_bb (guard_bb);
+  stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_COND);
+  cond_stmt = as_a <gcond *> (stmt);
+  extract_true_false_edges_from_block (guard_bb, &te, &fe);
+  /* Insert guard to PRE_HEADER.  */
+  if (!empty_block_p (pre_header))
+    gsi = gsi_last_bb (pre_header);
+  else
+    gsi = gsi_start_bb (pre_header);
+  /* Create copy of COND_STMT.  */
+  new_cond_stmt = gimple_build_cond (gimple_cond_code (cond_stmt),
+				     gimple_cond_lhs (cond_stmt),
+				     gimple_cond_rhs (cond_stmt),
+				     NULL_TREE, NULL_TREE);
+  gsi_insert_after (&gsi, new_cond_stmt, GSI_NEW_STMT);
+  /* Convert COND_STMT to true/false conditional.  */
+  if (guard == te)
+    gimple_cond_make_false (cond_stmt);
+  else
+    gimple_cond_make_true (cond_stmt);
+  update_stmt (cond_stmt);
+  /* Create new loop pre-header.  */
+  e = split_block (pre_header, last_stmt (pre_header));
+  gcc_assert (loop_preheader_edge (loop)->src == e->dest);
+  if (guard == fe)
+    {
+      e->flags = EDGE_TRUE_VALUE;
+      flags |= EDGE_FALSE_VALUE;
+    }
+  else
+    {
+      e->flags = EDGE_FALSE_VALUE;
+      flags |= EDGE_TRUE_VALUE;
+    }
+  new_edge = make_edge (pre_header, exit->dest, flags);
+  if (fix_dom_of_exit)
+    set_immediate_dominator (CDI_DOMINATORS, exit->dest, pre_header);
+  /* Add NEW_EDGE argument for all phis in the post-header block.  */
+  bb = exit->dest;
+  for (gphi_iterator gsi = gsi_start_phis (bb);
+       !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gphi *phi = gsi.phi ();
+      tree arg;
+      if (virtual_operand_p (gimple_phi_result (phi)))
+	{
+	  arg = get_vop_from_header (loop);
+	  if (arg == NULL_TREE)
+	    /* Use exit edge argument.  */
+	    arg = PHI_ARG_DEF_FROM_EDGE (phi, exit);
+	  add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
+	}
+      else
+	{
+	  /* Use exit edge argument.  */
+	  arg = PHI_ARG_DEF_FROM_EDGE (phi, exit);
+	  add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
+	}
+    }
+
+  mark_virtual_operands_for_renaming (cfun);
+  update_ssa (TODO_update_ssa);
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "  guard hoisted.\n");
+}
+
+/* Return true if phi argument for exit edge can be used
+   for edge around loop.  */
+
+static bool
+check_exit_phi (struct loop *loop)
+{
+  edge exit = single_exit (loop);
+  basic_block pre_header = loop_preheader_edge (loop)->src;
+
+  for (gphi_iterator gsi = gsi_start_phis (exit->dest);
+       !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gphi *phi = gsi.phi ();
+      tree arg;
+      gimple *def;
+      basic_block def_bb;
+      if (virtual_operand_p (gimple_phi_result (phi)))
+	continue;
+      arg = PHI_ARG_DEF_FROM_EDGE (phi, exit);
+      if (TREE_CODE (arg) != SSA_NAME)
+	continue;
+      def = SSA_NAME_DEF_STMT (arg);
+      if (!def)
+	continue;
+      def_bb = gimple_bb (def);
+      if (!def_bb)
+	continue;
+      if (!dominated_by_p (CDI_DOMINATORS, pre_header, def_bb))
+	/* Definition inside loop!  */
+	return false;
+      /* Check loop closed phi invariant.  */
+      if (!flow_bb_inside_loop_p (def_bb->loop_father, pre_header))
+	return false;
+    }
+  return true;
+}
+
 /* Loop unswitching pass.  */
 
 namespace {
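
For readers following the patch, a source-level sketch of what hoisting an
invariant guard amounts to may help.  This is a hand-written illustration,
not output of the pass and not one of the new testsuite files: the names
scale_guarded, scale_hoisted, same_output, self_check and the parameter
flag are all hypothetical.  It assumes the shape that find_loop_guard looks
for: the guard is invariant in the outer loop, and when it is false the
outer loop has no side effects and defines nothing that is used after it.

```c
#include <assert.h>
#include <string.h>

/* Before: FLAG is invariant in the outer loop, yet it is tested on every
   outer iteration.  When FLAG is false the outer loop only counts I.  */
static void
scale_guarded (int *out, const int *in, int rows, int cols, int flag)
{
  for (int i = 0; i < rows; i++)
    {
      if (flag)
	for (int j = 0; j < cols; j++)
	  out[i * cols + j] = 2 * in[i * cols + j];
    }
}

/* After (written by hand): the guard is tested once in front of the loop
   nest, and a false guard skips the nest entirely.  No loop body is
   duplicated.  */
static void
scale_hoisted (int *out, const int *in, int rows, int cols, int flag)
{
  if (!flag)
    return;
  for (int i = 0; i < rows; i++)
    for (int j = 0; j < cols; j++)
      out[i * cols + j] = 2 * in[i * cols + j];
}

/* Run both variants on the same input and report whether the output
   buffers are identical.  */
static int
same_output (int flag)
{
  enum { ROWS = 3, COLS = 4 };
  int in[ROWS * COLS], a[ROWS * COLS], b[ROWS * COLS];

  for (int k = 0; k < ROWS * COLS; k++)
    {
      in[k] = k + 1;
      a[k] = b[k] = -1;	/* Sentinel, so untouched cells are comparable.  */
    }
  scale_guarded (a, in, ROWS, COLS, flag);
  scale_hoisted (b, in, ROWS, COLS, flag);
  return memcmp (a, b, sizeof a) == 0;
}

/* Check the equivalence for a false and a true guard.  */
static void
self_check (void)
{
  assert (same_output (0));
  assert (same_output (1));
}
```

Calling self_check () exercises both variants with the guard false and true.
Whether GCC performs this on a given function depends on the checks in
tree_unswitch_outer_loop, which operate on GIMPLE rather than on source.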

Thread overview: 17+ messages
2015-07-10 10:03 Yuri Rumyantsev
2015-07-14 11:07 ` Richard Biener
2015-07-23 15:21   ` Yuri Rumyantsev
2015-07-28 11:00     ` Richard Biener
2015-07-31 12:07       ` Yuri Rumyantsev
2015-07-31 15:54         ` Jeff Law
2015-08-03  7:27         ` Richard Biener
     [not found]           ` <CAEoMCqSorkh1WmFtVB_huC2hbcVr8uc1EYaRaNVe1g+5hVuzPw@mail.gmail.com>
     [not found]             ` <CAFiYyc1nCCyF-4BH2hPWkKpmXnaQFQ34RMM5TTuHjZxZ25crrA@mail.gmail.com>
     [not found]               ` <CAEoMCqSRsER9ZGgnX9eJgZJyN4EwkpxzWWk1FHRxWNiEW0HVCg@mail.gmail.com>
     [not found]                 ` <CAFiYyc2O9i690A0LZ0+jEOP8nkyz8Btc0YAb469aMgnRaVsmsQ@mail.gmail.com>
2015-09-30 11:40                   ` Yuri Rumyantsev
2015-10-05 10:57                     ` Richard Biener
2015-10-05 13:13                       ` Yuri Rumyantsev
2015-10-06  7:59                         ` Richard Biener
2015-10-06 11:41                           ` Yuri Rumyantsev
2015-10-06 12:21                             ` Richard Biener
2015-10-07  9:53                               ` Yuri Rumyantsev
2015-10-07 15:26                                 ` Yuri Rumyantsev [this message]
2015-10-08 12:31                                   ` Richard Biener
2015-10-09 19:05                                 ` H.J. Lu
