Re: [PATCH] Fix loop split incorrect count and probability

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Xionghu Luo <luoxhu@linux.ibm.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
	fxue@os.amperecomputing.com, wschmidt@linux.ibm.com,
	guojiufu@linux.ibm.com, linkw@gcc.gnu.org, hubicka@ucw.cz
Subject: Re: [PATCH] Fix loop split incorrect count and probability
Date: Fri, 15 Oct 2021 13:51:15 +0800	[thread overview]
Message-ID: <e96eb6f7-b919-42a2-b1e3-e678abe5eed7@linux.ibm.com> (raw)
In-Reply-To: <4rpoq32-1r80-89p-436-npo6p91q72pr@fhfr.qr>

[-- Attachment #1: Type: text/plain, Size: 12709 bytes --]



On 2021/9/23 20:17, Richard Biener wrote:
> On Wed, 22 Sep 2021, Xionghu Luo wrote:
> 
>>
>>
>> On 2021/8/11 17:16, Richard Biener wrote:
>>> On Wed, 11 Aug 2021, Xionghu Luo wrote:
>>>
>>>>
>>>>
>>>> On 2021/8/10 22:47, Richard Biener wrote:
>>>>> On Mon, 9 Aug 2021, Xionghu Luo wrote:
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On 2021/8/6 19:46, Richard Biener wrote:
>>>>>>> On Tue, 3 Aug 2021, Xionghu Luo wrote:
>>>>>>>
>>>>>>>> loop split condition is moved between loop1 and loop2, the split bb's
>>>>>>>> count and probability should also be duplicated instead of (100% vs
>>>>>>>> INV),
>>>>>>>> secondly, the original loop1 and loop2 count need be propotional from
>>>>>>>> the
>>>>>>>> original loop.
>>>>>>>>
>>>>>>>>
>>>>>>>> diff base/loop-cond-split-1.c.151t.lsplit
>>>>>>>> patched/loop-cond-split-1.c.151t.lsplit:
>>>>>>>> ...
>>>>>>>>       int prephitmp_16;
>>>>>>>>       int prephitmp_25;
>>>>>>>>
>>>>>>>>       <bb 2> [local count: 118111600]:
>>>>>>>>       if (n_7(D) > 0)
>>>>>>>>         goto <bb 4>; [89.00%]
>>>>>>>>       else
>>>>>>>>         goto <bb 3>; [11.00%]
>>>>>>>>
>>>>>>>>       <bb 3> [local count: 118111600]:
>>>>>>>>       return;
>>>>>>>>
>>>>>>>>       <bb 4> [local count: 105119324]:
>>>>>>>>       pretmp_3 = ga;
>>>>>>>>
>>>>>>>> -  <bb 5> [local count: 955630225]:
>>>>>>>> +  <bb 5> [local count: 315357973]:
>>>>>>>>       # i_13 = PHI <i_10(20), 0(4)>
>>>>>>>>       # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
>>>>>>>>       if (prephitmp_12 != 0)
>>>>>>>>         goto <bb 6>; [33.00%]
>>>>>>>>       else
>>>>>>>>         goto <bb 7>; [67.00%]
>>>>>>>>
>>>>>>>> -  <bb 6> [local count: 315357972]:
>>>>>>>> +  <bb 6> [local count: 104068130]:
>>>>>>>>       _2 = do_something ();
>>>>>>>>       ga = _2;
>>>>>>>>
>>>>>>>> -  <bb 7> [local count: 955630225]:
>>>>>>>> +  <bb 7> [local count: 315357973]:
>>>>>>>>       # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
>>>>>>>>       i_10 = inc (i_13);
>>>>>>>>       if (n_7(D) > i_10)
>>>>>>>>         goto <bb 21>; [89.00%]
>>>>>>>>       else
>>>>>>>>         goto <bb 11>; [11.00%]
>>>>>>>>
>>>>>>>>       <bb 11> [local count: 105119324]:
>>>>>>>>       goto <bb 3>; [100.00%]
>>>>>>>>
>>>>>>>> -  <bb 21> [local count: 850510901]:
>>>>>>>> +  <bb 21> [local count: 280668596]:
>>>>>>>>       if (prephitmp_12 != 0)
>>>>>>>> -    goto <bb 20>; [100.00%]
>>>>>>>> +    goto <bb 20>; [33.00%]
>>>>>>>>       else
>>>>>>>> -    goto <bb 19>; [INV]
>>>>>>>> +    goto <bb 19>; [67.00%]
>>>>>>>>
>>>>>>>> -  <bb 20> [local count: 850510901]:
>>>>>>>> +  <bb 20> [local count: 280668596]:
>>>>>>>>       goto <bb 5>; [100.00%]
>>>>>>>>
>>>>>>>> -  <bb 19> [count: 0]:
>>>>>>>> +  <bb 19> [local count: 70429947]:
>>>>>>>>       # i_23 = PHI <i_10(21)>
>>>>>>>>       # prephitmp_25 = PHI <prephitmp_5(21)>
>>>>>>>>
>>>>>>>> -  <bb 12> [local count: 955630225]:
>>>>>>>> +  <bb 12> [local count: 640272252]:
>>>>>>>>       # i_15 = PHI <i_23(19), i_22(16)>
>>>>>>>>       # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
>>>>>>>>       i_22 = inc (i_15);
>>>>>>>>       if (n_7(D) > i_22)
>>>>>>>>         goto <bb 16>; [89.00%]
>>>>>>>>       else
>>>>>>>>         goto <bb 11>; [11.00%]
>>>>>>>>
>>>>>>>> -  <bb 16> [local count: 850510901]:
>>>>>>>> +  <bb 16> [local count: 569842305]:
>>>>>>>>       goto <bb 12>; [100.00%]
>>>>>>>>
>>>>>>>>     }
>>>>>>>>
>>>>>>>> gcc/ChangeLog:
>>>>>>>>
>>>>>>>>  * tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
>>>>>>>>  (do_split_loop_on_cond): Likewise.
>>>>>>>> ---
>>>>>>>>     gcc/tree-ssa-loop-split.c | 16 ++++++++--------
>>>>>>>>     1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>>>>>>>> index 3a09bbc39e5..8e5a7ded0f7 100644
>>>>>>>> --- a/gcc/tree-ssa-loop-split.c
>>>>>>>> +++ b/gcc/tree-ssa-loop-split.c
>>>>>>>> @@ -583,10 +583,10 @@ split_loop (class loop *loop1)
>>>>>>>>      basic_block cond_bb;
>>>>>>
>>>>>>    	if (!initial_true)
>>>>>> -	  cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
>>>>>> +	  cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
>>>>>> +
>>>>>> +	edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
>>>>>> +			   ? EDGE_SUCC (bbs[i], 0)
>>>>>> +			   : EDGE_SUCC (bbs[i], 1);
>>>>>>
>>>>>>>>     
>>>>>>>>     	class loop *loop2 = loop_version (loop1, cond, &cond_bb,
>>>>>>>> -					   profile_probability::always
>>>>>>>> (),
>>>>>>>> -					   profile_probability::always
>>>>>>>> (),
>>>>>>>> -					   profile_probability::always
>>>>>>>> (),
>>>>>>>> -					   profile_probability::always
>>>>>>>> (),
>>>>>>>> +					   true_edge->probability,
>>>>>>>> +
>>>>>>>> true_edge->probability.invert (),
>>>>>>>> +					   true_edge->probability,
>>>>>>>> +
>>>>>>>> true_edge->probability.invert (),
>>>>>>>>             true);
>>>>>>>
>>>>>>> there is no 'true_edge' variable at this point.
>>>>>>
>>>>>> Sorry, missed the above hunk when split the patch.
>>>>>>
>>>>>>>
>>>>>>>>      gcc_assert (loop2);
>>>>>>>>     @@ -1486,10 +1486,10 @@ do_split_loop_on_cond (struct loop *loop1,
>>>>>>>> edge invar_branch)
>>>>>>>>       initialize_original_copy_tables ();
>>>>>>>>     
>>>>>>>>       struct loop *loop2 = loop_version (loop1, boolean_true_node,
>>>>>>>> NULL,
>>>>>>>> -				     profile_probability::always (),
>>>>>>>> -				     profile_probability::never (),
>>>>>>>> -				     profile_probability::always (),
>>>>>>>> -				     profile_probability::always (),
>>>>>>>> +				     invar_branch->probability.invert
>>>>>>>> (),
>>>>>>>> +				     invar_branch->probability,
>>>>>>>> +				     invar_branch->probability.invert
>>>>>>>> (),
>>>>>>>> +				     invar_branch->probability,
>>>>>>>>       				     true);
>>>>>>>>       if (!loop2)
>>>>>>>>         {
>>>>>>>
>>>>>>> The patch introduction seems to talk about do_split_loop_on_cond only.
>>>>>>
>>>>>> split_loop faces similar issue though it sets the two branches to 100% vs
>>>>>> 100%
>>>>>> and no scaling which seems also incorrect.
>>>>>>
>>>>>>> Since loop versioning inserts a condition with the passed probabilities
>>>>>>> but in this case a 'boolean_true_node' condition the then and else
>>>>>>> probabilities passed look correct.  It's just the scaling arguments
>>>>>>> that look wrong?  This loop_version call should get a comment as to
>>>>>>> why we are passing probabilities the way we do.
>>>>>>
>>>>>> This optimization is transforming from:
>>>>>>
>>>>>> for (i = 0; i < n; i = inc (i))
>>>>>>       {
>>>>>>         if (ga)
>>>>>>           ga = do_something ();
>>>>>> }
>>>>>>
>>>>>> to:
>>>>>>
>>>>>>     for (i = 0; i < x; i = inc (i))
>>>>>> {
>>>>>>       if (true)
>>>>>>           ga = do_something ();
>>>>>>           if (!ga)
>>>>>>             break;
>>>>>> }
>>>>>>     for (; i < n; i = inc (i))
>>>>>> {
>>>>>>       if (false)
>>>>>>            ga = do_something ();
>>>>>> }
>>>>>>
>>>>>>
>>>>>> `boolean_true_node` is passed in to make the first loop's condition
>>>>>> statement to be 'true', after returning from loop_version, there is a
>>>>>> piece of code forcing the condition in second loop to be false,
>>>>>> and the original condition is moved from *in loop* to *exit edge*
>>>>>> between loop1 and loop2.
>>>>>
>>>>> Yes, one complication is that we use loop_version but do not retain
>>>>> the CFG it creates.  Something like the vectorizers
>>>>> slpeel_tree_duplicate_loop_to_edge_cfg would be a better "fit"
>>>>> but then that code doesn't do any scaling yet.  But then
>>>>> loop_version uses cfg_hook_duplicate_loop_to_header_edge and I suppose
>>>>> we could write a variant that simply doesn't mangle the CFG
>>>>> with a new condition switching between both loops but keeps them
>>>>> executed after each other ...
>>>>>
>>>>> As said, this adds to the confusion and some awkwardness.
>>>>
>>>> Then loop_version in loop split requires two types of variant, one
>>>> is to insert condition to loop preheader for 'split_loops' usage,
>>>> another is to insert condition to loop exit for 'do_split_loop_on_cond'
>>>> usage, it needs one extra function to encapsulate these cfg alterations
>>>> from outside to inside.
>>>>
>>>> unswitching only execute one loop as it only moves the invariant condition
>>>> to first loop's pre-header.  While 'split_loops' and
>>>> 'do_split_loop_on_cond'
>>>> may execute both loops no matter the condition is moved to the first loop's
>>>> preheader or exit.
>>>>
>>>> The condition stmt in loop unswitching is invariant, but it is variant
>>>> in loop splitting, that's why loop unswitching execute only one loop
>>>> but loop splitting executes both loops.
>>>>
>>>> Seems we need two condition arguments for loop_version, one for connecting
>>>> loop1 preheader to loop2 preheader, another one for connecting loop1's exit
>>>> to loop2's header?  Then it will be more generic for both unswitching pass
>>>> and splitting pass.  Looks a bit complicated and conflicted with
>>>> loop_version's
>>>> comments:
>>>>
>>>> /* Main entry point for Loop Versioning transformation.
>>>>
>>>>     This transformation given a condition and a loop, creates
>>>>     -if (condition) { loop_copy1 } else { loop_copy2 }, ... */
>>>>
>>>>
>>>> And this only works for loop split usage, those many other places
>>>> doesn't use loop_version like this?
>>>
>>> Yes, as said if you don't want the above CFG then you probably
>>> shouldn't use loop_version but instead use its building blocks
>>> (and some refactoring of loop_version can make that easier).
>>>
>>> I think splitting wants
>>>
>>>    loop_copy1
>>>    if (condition)
>>>      loop_copy2
>>>
>>> IMHO it would be good to split 'loopify' into the actual loopification
>>> and the scaling.  Basically make the part of loop_version that
>>> copies the loop on the header edge and creates a loop structure for
>>> it separate.
>>>
>>> loop distribution uses slpeel_tree_duplicate_loop_to_edge_cfg
>>> (copy_loop_before).
>>>
>>
>> Unfortunately slpeel_tree_duplicate_loop_to_edge_cfg only supports
>> copying loops with single exit, it would cause many ICEs in it even
>> building GCC stage1 (line 1065, line 1184 due to exit or new_exit
>> is NULL returning from single_exit (loop).).  Seems loop_version is
>> more flexible for loop split.
> 
> Hmm, sure - loop_version does not need to do anything special with
> exits since control flow either enters the original or the loop copy.
> 
> But slpeel_tree_duplicate_loop_to_edge_cfg intends to create
> control flow that enters _both_ loops, so it needs to have
> an edge from the first loop exit to the second loop entry.
> 
> One could extend this to a region copy, copying eventual CFG merges
> of exits and specifying which exit of a SEME region is supposed
> to be connected to the original region entry.
> 
> After all that's what loop splitting needs in the end - though not
> sure what exactly it does with more than one exit.

In tree-ssa-loop-split.c,  split_loop only accepts single exit loop,
the recently added split_loop_on_cond could process multiple exits loop.

For example, do some changes to the loop-cond-split-1.c,

int ga;
extern int a;
extern int b;
extern int c;

void test1 (int n) {
  int i;
  for (i = 0; i < n; i = inc (i)). {
      if (a+3 > 0)
       break;

      if (ga)
        ga = do_something ();

      for (int j = 0; j < n; j++)
        {
           b+=5;
           if (b > c) break;
        }
    }
}

the "if (ga)" will be a new exit edge from loop_copy1 to loop_copy2. 
I am not sure whether it is valuable to do semi-invariant loop split to such
cases with multiple exits, but obviously the split_loop_on_cond is a special
case from split_loop both duplicate loop to 
   if (condition1) {loop_copy1} if (condition2) {loop_copy2}
The only difference is condition1 is true for split_loop_on_cond.

> 
> So there's another set of "loop" copying, gimple_duplicate_sese_region,
> which doesn't actually require a single exit but a single "important"
> exit.  That might match how you treat multiple exits.

gimple_duplicate_sese_region doesn't handle subloops and latches.  Finally,
I found that loop_version is still much better
than slpeel_tree_duplicate_loop_to_edge_cfg and gimple_duplicate_sese_region
since it could handle all cases like multiple exits/subloops, etc.  I did some
refactor to the code to introduce loop_version2 to create duplicated loops
with two input conditions as attached patch, is this reasonable enough?

I also tried to copy the code in loop_version out of it to don't call loop_version
in loop_version2, but it seems useless with many duplicate code and NOT get rid
of creating "if (condition1) {loop_copy1}" at first?


-- 
Thanks,
Xionghu

[-- Attachment #2: 0001-Add-loop_version2-to-support-loop-transformation-wit.patch --]
[-- Type: text/plain, Size: 13160 bytes --]

From cde3a3a1ab6001f340f3da5c37b416eb42febc8e Mon Sep 17 00:00:00 2001
From: Xiong Hu Luo <luoxhu@linux.ibm.com>
Date: Thu, 14 Oct 2021 04:01:50 -0500
Subject: [PATCH] Add loop_version2 to support loop transformation with two
 conditions

loop_version2 is added to support transform loop to
  - if (condition1) {loop_copy1} if(condition2) {loop_copy2},
to support both loop split and loop conditional split usage.  Actually
do_split_loop_on_cond is a specific usage from loop split when
condition1 is true. The looop_version2 implementation calls
loop_version first to create
 -if (condition1) { loop_copy1  } else { loop_copy2  }
first, then insert condition2 to loop_copy2 with connect_loops.

gcc/ChangeLog:

	* tree-ssa-loop-split.c (connect_loop_phis): Move from ...
	(connect_loops): Move from ...
	(split_loop): Call loop_version2.
	(do_split_loop_on_cond): Call loop_version2.
	* cfgloopmanip.c (connect_loop_phis): ... to this.
	(connect_loops): ... to this.
	(loop_version2): New function.
	* cfgloopmanip.h (loop_version2): New declare.
---
 gcc/cfgloopmanip.c        | 178 ++++++++++++++++++++++++++++++++++++++
 gcc/cfgloopmanip.h        |   5 ++
 gcc/tree-ssa-loop-split.c |  58 ++++---------
 3 files changed, 201 insertions(+), 40 deletions(-)

diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 82c242dd720..b5d0d99734d 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -24,6 +24,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
+#include "ssa.h"
+#include "tree-cfg.h"
 #include "cfghooks.h"
 #include "cfganal.h"
 #include "cfgloop.h"
@@ -1768,3 +1770,179 @@ loop_version (class loop *loop,
 
   return nloop;
 }
+
+/* This function updates the SSA form after connect_loops made a new
+   edge NEW_E leading from LOOP1 exit to LOOP2 (via in intermediate
+   conditional).  I.e. the second loop can now be entered either
+   via the original entry or via NEW_E, so the entry values of LOOP2
+   phi nodes are either the original ones or those at the exit
+   of LOOP1.  Insert new phi nodes in LOOP2 pre-header reflecting
+   this.  The loops need to fulfill easy_exit_values().  */
+
+static void
+connect_loop_phis (class loop *loop1, class loop *loop2, edge new_e)
+{
+  basic_block rest = loop_preheader_edge (loop2)->src;
+  gcc_assert (new_e->dest == rest);
+  edge skip_first = EDGE_PRED (rest, EDGE_PRED (rest, 0) == new_e);
+
+  edge firste = loop_preheader_edge (loop1);
+  edge seconde = loop_preheader_edge (loop2);
+  edge firstn = loop_latch_edge (loop1);
+  gphi_iterator psi_first, psi_second;
+  for (psi_first = gsi_start_phis (loop1->header),
+       psi_second = gsi_start_phis (loop2->header);
+       !gsi_end_p (psi_first);
+       gsi_next (&psi_first), gsi_next (&psi_second))
+    {
+      tree init, next, new_init;
+      use_operand_p op;
+      gphi *phi_first = psi_first.phi ();
+      gphi *phi_second = psi_second.phi ();
+
+      init = PHI_ARG_DEF_FROM_EDGE (phi_first, firste);
+      next = PHI_ARG_DEF_FROM_EDGE (phi_first, firstn);
+      op = PHI_ARG_DEF_PTR_FROM_EDGE (phi_second, seconde);
+      gcc_assert (operand_equal_for_phi_arg_p (init, USE_FROM_PTR (op)));
+
+      /* Prefer using original variable as a base for the new ssa name.
+	 This is necessary for virtual ops, and useful in order to avoid
+	 losing debug info for real ops.  */
+      if (TREE_CODE (next) == SSA_NAME
+	  && useless_type_conversion_p (TREE_TYPE (next),
+					TREE_TYPE (init)))
+	new_init = copy_ssa_name (next);
+      else if (TREE_CODE (init) == SSA_NAME
+	       && useless_type_conversion_p (TREE_TYPE (init),
+					     TREE_TYPE (next)))
+	new_init = copy_ssa_name (init);
+      else if (useless_type_conversion_p (TREE_TYPE (next),
+					  TREE_TYPE (init)))
+	new_init = make_temp_ssa_name (TREE_TYPE (next), NULL,
+				       "unrinittmp");
+      else
+	new_init = make_temp_ssa_name (TREE_TYPE (init), NULL,
+				       "unrinittmp");
+
+      gphi * newphi = create_phi_node (new_init, rest);
+      add_phi_arg (newphi, init, skip_first, UNKNOWN_LOCATION);
+      add_phi_arg (newphi, next, new_e, UNKNOWN_LOCATION);
+      SET_USE (op, new_init);
+    }
+}
+
+/* The two loops LOOP1 and LOOP2 were just created by loop versioning,
+   they are still equivalent and placed in two arms of a diamond, like so:
+
+               .------if (cond)------.
+               v                     v
+             pre1                   pre2
+              |                      |
+        .--->h1                     h2<----.
+        |     |                      |     |
+        |    ex1---.            .---ex2    |
+        |    /     |            |     \    |
+        '---l1     X            |     l2---'
+                   |            |
+                   |            |
+                   '--->join<---'
+
+   This function transforms the program such that LOOP1 is conditionally
+   falling through to LOOP2, or skipping it.  This is done by splitting
+   the ex1->join edge at X in the diagram above, and inserting a condition
+   whose one arm goes to pre2, resulting in this situation:
+
+               .------if (cond)------.
+               v                     v
+             pre1       .---------->pre2
+              |         |            |
+        .--->h1         |           h2<----.
+        |     |         |            |     |
+        |    ex1---.    |       .---ex2    |
+        |    /     v    |       |     \    |
+        '---l1   skip---'       |     l2---'
+                   |            |
+                   |            |
+                   '--->join<---'
+
+
+   The condition used is the exit condition of LOOP1, which effectively means
+   that when the first loop exits (for whatever reason) but the real original
+   exit expression is still false the second loop will be entered.
+   The function returns the new edge cond->pre2.
+
+   This doesn't update the SSA form, see connect_loop_phis for that.  */
+
+static edge
+connect_loops (class loop *loop1, class loop *loop2, void *cond_e)
+{
+  tree cond_expr = (tree) cond_e;
+  edge exit = single_exit (loop1);
+  basic_block skip_bb = split_edge (exit);
+  gimple *new_cond_expr;
+  gimple_stmt_iterator gsi;
+  edge new_e, skip_e;
+
+  new_cond_expr = gimple_build_cond_from_tree (cond_expr, NULL_TREE, NULL_TREE);
+
+  gsi = gsi_last_bb (skip_bb);
+  gsi_insert_after (&gsi, new_cond_expr, GSI_NEW_STMT);
+
+  skip_e = EDGE_SUCC (skip_bb, 0);
+  skip_e->flags &= ~EDGE_FALLTHRU;
+  new_e = make_edge (skip_bb, loop_preheader_edge (loop2)->src, 0);
+  if (exit->flags & EDGE_TRUE_VALUE)
+    {
+      skip_e->flags |= EDGE_TRUE_VALUE;
+      new_e->flags |= EDGE_FALSE_VALUE;
+    }
+  else
+    {
+      skip_e->flags |= EDGE_FALSE_VALUE;
+      new_e->flags |= EDGE_TRUE_VALUE;
+    }
+
+  new_e->probability = profile_probability::likely ();
+  skip_e->probability = new_e->probability.invert ();
+
+  return new_e;
+}
+
+/* Main entry point for Loop Versioning transformation with two conditions.
+
+   This transformation given a condition and a loop, creates
+   -if (condition1) { loop_copy1 } if (condition2) { loop_copy2 },
+   where loop_copy1 is the loop transformed in one way, and loop_copy2
+   is the loop transformed in another way (or unchanged). COND_EXPR1
+   and COND_EXPR2 may be a run time test for things that were not resolved by
+   static analysis (overlapping ranges (anti-aliasing), alignment, etc.).
+
+   If non-NULL, CONDITION_BB is set to the basic block
+   containing the condition.
+
+   THEN_PROB is the probability of the then edge of the if.  THEN_SCALE
+   is the ratio by that the frequencies in the original loop should
+   be scaled.  ELSE_SCALE is the ratio by that the frequencies in the
+   new loop should be scaled.
+
+   If PLACE_AFTER is true, we place the new loop after LOOP in the
+   instruction stream, otherwise it is placed before LOOP.  */
+
+class loop *
+loop_version2 (class loop *loop, void *cond_expr1, basic_block *condition_bb,
+	       void *cond_expr2, profile_probability then_prob,
+	       profile_probability else_prob, profile_probability then_scale,
+	       profile_probability else_scale, bool place_after)
+{
+  /* Generate if (condition1) {loop_copy1} else {loop_copy2} and propotion the
+    profile_count first.  */
+  class loop *nloop = loop_version (loop, cond_expr1, condition_bb, then_prob,
+				    else_prob, then_scale, else_scale, place_after);
+
+  /* Fix up the cfg to
+      - if (condition1) {loop_copy1}; if (condition2) {loop_copy2}.  */
+  edge new_e = connect_loops (loop, nloop, cond_expr2);
+  connect_loop_phis (loop, nloop, new_e);
+
+  return nloop;
+}
diff --git a/gcc/cfgloopmanip.h b/gcc/cfgloopmanip.h
index 07d5f925b79..23b077e94d5 100644
--- a/gcc/cfgloopmanip.h
+++ b/gcc/cfgloopmanip.h
@@ -63,4 +63,9 @@ class loop * loop_version (class loop *, void *,
 			    profile_probability, profile_probability,
 			    profile_probability, profile_probability, bool);
 
+class loop * loop_version2 (class loop *, void *,
+			    basic_block *, void *,
+			    profile_probability, profile_probability,
+			    profile_probability, profile_probability, bool);
+
 #endif /* GCC_CFGLOOPMANIP_H */
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index d30782888f3..f3096488473 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -573,9 +573,9 @@ split_loop (class loop *loop1)
 	if (stmts2)
 	  gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop1),
 					    stmts2);
-	tree cond = build2 (guard_code, boolean_type_node, guard_init, border);
+	tree cond1 = build2 (guard_code, boolean_type_node, guard_init, border);
 	if (!initial_true)
-	  cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
+	  cond1 = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond1);
 
 	edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
 			   ? EDGE_SUCC (bbs[i], 0)
@@ -586,16 +586,19 @@ split_loop (class loop *loop1)
 	initialize_original_copy_tables ();
 	basic_block cond_bb;
 
-	class loop *loop2 = loop_version (loop1, cond, &cond_bb,
+	edge exit = single_exit (loop1);
+	gimple *stmt = last_stmt (exit->src);
+	tree cond2 = build2 (gimple_cond_code (stmt), boolean_type_node,
+			     gimple_cond_lhs (stmt), gimple_cond_rhs (stmt));
+
+	class loop *loop2 = loop_version2 (loop1, cond1, &cond_bb, cond2,
 					   true_edge->probability,
 					   true_edge->probability.invert (),
 					   true_edge->probability,
 					   true_edge->probability.invert (),
 					   true);
-	gcc_assert (loop2);
 
-	edge new_e = connect_loops (loop1, loop2);
-	connect_loop_phis (loop1, loop2, new_e);
+	gcc_assert (loop2);
 
 	/* The iterations of the second loop is now already
 	   exactly those that the first loop didn't do, but the
@@ -1489,12 +1492,16 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch)
 
   initialize_original_copy_tables ();
 
-  struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
-				     invar_branch->probability.invert (),
-				     invar_branch->probability,
+  gimple *break_cond = gimple_build_cond (gimple_cond_code(cond),
+					  gimple_cond_lhs (cond),
+					  gimple_cond_rhs (cond),
+					  NULL_TREE, NULL_TREE);
+
+  class loop *loop2 = loop_version2 (loop1, boolean_true_node, &cond_bb,
+				     break_cond, profile_probability::always (),
+				     profile_probability::never (),
 				     invar_branch->probability.invert (),
-				     invar_branch->probability,
-				     true);
+				     invar_branch->probability, true);
   if (!loop2)
     {
       free_original_copy_tables ();
@@ -1513,35 +1520,6 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch)
 
   update_stmt (cond_copy);
 
-  /* Insert a new conditional statement on latch edge of loop1, its condition
-     is duplicated from the semi-invariant.  This statement acts as a switch
-     to transfer execution from loop1 to loop2, when loop1 enters into
-     invariant state.  */
-  basic_block latch_bb = split_edge (loop_latch_edge (loop1));
-  basic_block break_bb = split_edge (single_pred_edge (latch_bb));
-  gimple *break_cond = gimple_build_cond (gimple_cond_code(cond),
-					  gimple_cond_lhs (cond),
-					  gimple_cond_rhs (cond),
-					  NULL_TREE, NULL_TREE);
-
-  gimple_stmt_iterator gsi = gsi_last_bb (break_bb);
-  gsi_insert_after (&gsi, break_cond, GSI_NEW_STMT);
-
-  edge to_loop1 = single_succ_edge (break_bb);
-  edge to_loop2 = make_edge (break_bb, loop_preheader_edge (loop2)->src, 0);
-
-  to_loop1->flags &= ~EDGE_FALLTHRU;
-  to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE;
-  to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE;
-
-  to_loop1->probability = invar_branch->probability.invert ();
-  to_loop2->probability = invar_branch->probability;
-
-  /* Due to introduction of a control flow edge from loop1 latch to loop2
-     pre-header, we should update PHIs in loop2 to reflect this connection
-     between loop1 and loop2.  */
-  connect_loop_phis (loop1, loop2, to_loop2);
-
   free_original_copy_tables ();
 
   return true;
-- 
2.27.0.90.geebb51ba8c

next prev parent reply	other threads:[~2021-10-15  5:51 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-03  8:58 Xionghu Luo
2021-08-04  2:42 ` Xionghu Luo
2021-08-06 11:46 ` Richard Biener
2021-08-09  2:37   ` Xionghu Luo
2021-08-10 14:47     ` Richard Biener
2021-08-11  3:03       ` Feng Xue OS
2021-10-26 13:05         ` Jan Hubicka
2021-10-27  1:42           ` Xionghu Luo
2021-08-11  8:32       ` Xionghu Luo
2021-08-11  9:16         ` Richard Biener
2021-08-12  3:24           ` Xionghu Luo
2021-09-22  8:40           ` Xionghu Luo
2021-09-23 12:17             ` Richard Biener
2021-10-15  5:51               ` Xionghu Luo [this message]
2021-10-21  8:43                 ` Xionghu Luo
2021-10-21 10:55                   ` Richard Biener
2021-10-26  5:40                     ` Xionghu Luo
2021-10-26 11:59                       ` Richard Biener
2021-10-26 12:19                         ` Jan Hubicka
2021-08-09  4:33   ` Feng Xue OS

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e96eb6f7-b919-42a2-b1e3-e678abe5eed7@linux.ibm.com \
    --to=luoxhu@linux.ibm.com \
    --cc=fxue@os.amperecomputing.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=guojiufu@linux.ibm.com \
    --cc=hubicka@ucw.cz \
    --cc=linkw@gcc.gnu.org \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).