public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Xionghu Luo <luoxhu@linux.ibm.com>
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
	 fxue@os.amperecomputing.com, wschmidt@linux.ibm.com,
	 guojiufu@linux.ibm.com, linkw@gcc.gnu.org, hubicka@ucw.cz
Subject: Re: [PATCH] Fix loop split incorrect count and probability
Date: Tue, 10 Aug 2021 16:47:41 +0200 (CEST)	[thread overview]
Message-ID: <nycvar.YFH.7.76.2108101634250.11781@zhemvz.fhfr.qr> (raw)
In-Reply-To: <58f9e351-fc10-406e-5272-270ece73be71@linux.ibm.com>

On Mon, 9 Aug 2021, Xionghu Luo wrote:

> Thanks,
> 
> On 2021/8/6 19:46, Richard Biener wrote:
> > On Tue, 3 Aug 2021, Xionghu Luo wrote:
> > 
> >> loop split condition is moved between loop1 and loop2, the split bb's
> >> count and probability should also be duplicated instead of (100% vs INV),
> >> secondly, the original loop1 and loop2 count need be propotional from the
> >> original loop.
> >>
> >>
> >> diff base/loop-cond-split-1.c.151t.lsplit  patched/loop-cond-split-1.c.151t.lsplit:
> >> ...
> >>     int prephitmp_16;
> >>     int prephitmp_25;
> >>
> >>     <bb 2> [local count: 118111600]:
> >>     if (n_7(D) > 0)
> >>       goto <bb 4>; [89.00%]
> >>     else
> >>       goto <bb 3>; [11.00%]
> >>
> >>     <bb 3> [local count: 118111600]:
> >>     return;
> >>
> >>     <bb 4> [local count: 105119324]:
> >>     pretmp_3 = ga;
> >>
> >> -  <bb 5> [local count: 955630225]:
> >> +  <bb 5> [local count: 315357973]:
> >>     # i_13 = PHI <i_10(20), 0(4)>
> >>     # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
> >>     if (prephitmp_12 != 0)
> >>       goto <bb 6>; [33.00%]
> >>     else
> >>       goto <bb 7>; [67.00%]
> >>
> >> -  <bb 6> [local count: 315357972]:
> >> +  <bb 6> [local count: 104068130]:
> >>     _2 = do_something ();
> >>     ga = _2;
> >>
> >> -  <bb 7> [local count: 955630225]:
> >> +  <bb 7> [local count: 315357973]:
> >>     # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
> >>     i_10 = inc (i_13);
> >>     if (n_7(D) > i_10)
> >>       goto <bb 21>; [89.00%]
> >>     else
> >>       goto <bb 11>; [11.00%]
> >>
> >>     <bb 11> [local count: 105119324]:
> >>     goto <bb 3>; [100.00%]
> >>
> >> -  <bb 21> [local count: 850510901]:
> >> +  <bb 21> [local count: 280668596]:
> >>     if (prephitmp_12 != 0)
> >> -    goto <bb 20>; [100.00%]
> >> +    goto <bb 20>; [33.00%]
> >>     else
> >> -    goto <bb 19>; [INV]
> >> +    goto <bb 19>; [67.00%]
> >>
> >> -  <bb 20> [local count: 850510901]:
> >> +  <bb 20> [local count: 280668596]:
> >>     goto <bb 5>; [100.00%]
> >>
> >> -  <bb 19> [count: 0]:
> >> +  <bb 19> [local count: 70429947]:
> >>     # i_23 = PHI <i_10(21)>
> >>     # prephitmp_25 = PHI <prephitmp_5(21)>
> >>
> >> -  <bb 12> [local count: 955630225]:
> >> +  <bb 12> [local count: 640272252]:
> >>     # i_15 = PHI <i_23(19), i_22(16)>
> >>     # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
> >>     i_22 = inc (i_15);
> >>     if (n_7(D) > i_22)
> >>       goto <bb 16>; [89.00%]
> >>     else
> >>       goto <bb 11>; [11.00%]
> >>
> >> -  <bb 16> [local count: 850510901]:
> >> +  <bb 16> [local count: 569842305]:
> >>     goto <bb 12>; [100.00%]
> >>
> >>   }
> >>
> >> gcc/ChangeLog:
> >>
> >> 	* tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
> >> 	(do_split_loop_on_cond): Likewise.
> >> ---
> >>   gcc/tree-ssa-loop-split.c | 16 ++++++++--------
> >>   1 file changed, 8 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> >> index 3a09bbc39e5..8e5a7ded0f7 100644
> >> --- a/gcc/tree-ssa-loop-split.c
> >> +++ b/gcc/tree-ssa-loop-split.c
> >> @@ -583,10 +583,10 @@ split_loop (class loop *loop1)
> >>   	basic_block cond_bb;
> 
>  	if (!initial_true)
> -	  cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); 
> +	  cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
> +
> +	edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
> +			   ? EDGE_SUCC (bbs[i], 0)
> +			   : EDGE_SUCC (bbs[i], 1);
> 
> >>   
> >>   	class loop *loop2 = loop_version (loop1, cond, &cond_bb,
> >> -					   profile_probability::always (),
> >> -					   profile_probability::always (),
> >> -					   profile_probability::always (),
> >> -					   profile_probability::always (),
> >> +					   true_edge->probability,
> >> +					   true_edge->probability.invert (),
> >> +					   true_edge->probability,
> >> +					   true_edge->probability.invert (),
> >>   					   true);
> > 
> > there is no 'true_edge' variable at this point.
> 
> Sorry, missed the above hunk when split the patch. 
> 
> > 
> >>   	gcc_assert (loop2);
> >>   
> >> @@ -1486,10 +1486,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch)
> >>     initialize_original_copy_tables ();
> >>   
> >>     struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
> >> -				     profile_probability::always (),
> >> -				     profile_probability::never (),
> >> -				     profile_probability::always (),
> >> -				     profile_probability::always (),
> >> +				     invar_branch->probability.invert (),
> >> +				     invar_branch->probability,
> >> +				     invar_branch->probability.invert (),
> >> +				     invar_branch->probability,
> >>   				     true);
> >>     if (!loop2)
> >>       {
> > 
> > The patch introduction seems to talk about do_split_loop_on_cond only.
> 
> split_loop faces similar issue though it sets the two branches to 100% vs 100%
> and no scaling which seems also incorrect.
> 
> > Since loop versioning inserts a condition with the passed probabilities
> > but in this case a 'boolean_true_node' condition the then and else
> > probabilities passed look correct.  It's just the scaling arguments
> > that look wrong?  This loop_version call should get a comment as to
> > why we are passing probabilities the way we do.
> 
> This optimization is transforming from:
> 
> for (i = 0; i < n; i = inc (i))
>     {
>       if (ga)
>         ga = do_something ();
> }
> 
> to: 
> 
>   for (i = 0; i < x; i = inc (i))
> {
>     if (true)
>          ga = do_something ();
>         if (!ga)
>           break;
> }
>   for (; i < n; i = inc (i))
> {
>     if (false)
>          ga = do_something ();
> }
> 
> 
> `boolean_true_node` is passed in to make the first loop's condition
> statement to be 'true', after returning from loop_version, there is a
> piece of code forcing the condition in second loop to be false,
> and the original condition is moved from *in loop* to *exit edge*
> between loop1 and loop2. 

Yes, one complication is that we use loop_version but do not retain
the CFG it creates.  Something like the vectorizers
slpeel_tree_duplicate_loop_to_edge_cfg would be a better "fit"
but then that code doesn't do any scaling yet.  But then
loop_version uses cfg_hook_duplicate_loop_to_header_edge and I suppose
we could write a variant that simply doesn't mangle the CFG
with a new condition switching between both loops but keeps them
executed after each other ...

As said, this adds to the confusion and some awkwardness.

> @fxue may have inputs about this since he contributed it years ago.
> 
> > 
> > It does seem that scaling the loop by the invar_branch probability
> > is correct.  Since this does similar things to unswitching, I see
> > that unswitching does
> > 
> > prob_true = edge_true->probability;
> > loop_version (loop, unshare_expr (cond),
> >                         NULL, prob_true,
> >                         prob_true.invert (),
> >                         prob_true, prob_true.invert (),
> >                         false);
> > 
> > which maybe suggests that your invar_branch based passing should
> > depend on 'true_invar'?
> > 
> > Also compared to unswitching the first loop is always entered,
> > so I wonder if the scaling is correct with respect to that
> > given unswitching where only ever one loop is entered?
> 
> 
> Scaling is based on the probability calculated on "if (ga)", if it is
> 33% vs 67% before split, then it is reasonable to scale loop1 to 33%
> and loop2 to 67% suppose the branch probability is correct enough?
> 
> unswitch also scaled the two loops based on prob_true, if prob_true
> is 50%, it decreases X's count to HALF unexpectedly since it should be
> executed on both branches?  while loop split still kept execute both first
> loop and second loop, it seems even more accurate than loop unswitching?

I just was saying that both doing exactly the same looks wrong (on
either side).

> tree-ssa-loop-unswitch.c:
> 
>    while (A)
>      {
>        if (inv)
>          B;
> 
>        X;
> 
>        if (!inv)
> 	 C;
>      }
> 
>    where inv is the loop invariant, into
> 
>    if (inv)
>      {
>        while (A)
> 	 {
>            B;
> 	   X;
> 	 }
>      }
>    else
>      {
>        while (A)
> 	 {
> 	   X;
> 	   C;
> 	 }
>      }

Yes, here scaling based on the if (inv) probability looks obviously
100% correct to me.  But I might be wrong.  And thus the
splitting case must be still off (so I conclude).  Hmm, in fact
I think for the loop unswitching case the scaling of the B and
C blocks is off?  Those should be considered always executed.
Note the loop unswitching pass is altering the conditions
condition to static true/false but I don't think it performs
further adjustments.

That said, likely the profile update cannot be done uniformly
for all blocks of a loop?

Richard.

  reply	other threads:[~2021-08-10 14:47 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-03  8:58 Xionghu Luo
2021-08-04  2:42 ` Xionghu Luo
2021-08-06 11:46 ` Richard Biener
2021-08-09  2:37   ` Xionghu Luo
2021-08-10 14:47     ` Richard Biener [this message]
2021-08-11  3:03       ` Feng Xue OS
2021-10-26 13:05         ` Jan Hubicka
2021-10-27  1:42           ` Xionghu Luo
2021-08-11  8:32       ` Xionghu Luo
2021-08-11  9:16         ` Richard Biener
2021-08-12  3:24           ` Xionghu Luo
2021-09-22  8:40           ` Xionghu Luo
2021-09-23 12:17             ` Richard Biener
2021-10-15  5:51               ` Xionghu Luo
2021-10-21  8:43                 ` Xionghu Luo
2021-10-21 10:55                   ` Richard Biener
2021-10-26  5:40                     ` Xionghu Luo
2021-10-26 11:59                       ` Richard Biener
2021-10-26 12:19                         ` Jan Hubicka
2021-08-09  4:33   ` Feng Xue OS

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.76.2108101634250.11781@zhemvz.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=fxue@os.amperecomputing.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=guojiufu@linux.ibm.com \
    --cc=hubicka@ucw.cz \
    --cc=linkw@gcc.gnu.org \
    --cc=luoxhu@linux.ibm.com \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).