From: Xionghu Luo <luoxhu@linux.ibm.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
fxue@os.amperecomputing.com, wschmidt@linux.ibm.com,
guojiufu@linux.ibm.com, linkw@gcc.gnu.org, hubicka@ucw.cz
Subject: Re: [PATCH] Fix loop split incorrect count and probability
Date: Wed, 22 Sep 2021 16:40:17 +0800 [thread overview]
Message-ID: <b7fc9c5b-bc81-5688-2472-c614210378be@linux.ibm.com> (raw)
In-Reply-To: <nycvar.YFH.7.76.2108111108290.11781@zhemvz.fhfr.qr>
On 2021/8/11 17:16, Richard Biener wrote:
> On Wed, 11 Aug 2021, Xionghu Luo wrote:
>
>>
>>
>> On 2021/8/10 22:47, Richard Biener wrote:
>>> On Mon, 9 Aug 2021, Xionghu Luo wrote:
>>>
>>>> Thanks,
>>>>
>>>> On 2021/8/6 19:46, Richard Biener wrote:
>>>>> On Tue, 3 Aug 2021, Xionghu Luo wrote:
>>>>>
>>>>>> loop split condition is moved between loop1 and loop2, the split bb's
>>>>>> count and probability should also be duplicated instead of (100% vs INV),
>>>>>> secondly, the original loop1 and loop2 count need be propotional from the
>>>>>> original loop.
>>>>>>
>>>>>>
>>>>>> diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit:
>>>>>> ...
>>>>>> int prephitmp_16;
>>>>>> int prephitmp_25;
>>>>>>
>>>>>> <bb 2> [local count: 118111600]:
>>>>>> if (n_7(D) > 0)
>>>>>> goto <bb 4>; [89.00%]
>>>>>> else
>>>>>> goto <bb 3>; [11.00%]
>>>>>>
>>>>>> <bb 3> [local count: 118111600]:
>>>>>> return;
>>>>>>
>>>>>> <bb 4> [local count: 105119324]:
>>>>>> pretmp_3 = ga;
>>>>>>
>>>>>> - <bb 5> [local count: 955630225]:
>>>>>> + <bb 5> [local count: 315357973]:
>>>>>> # i_13 = PHI <i_10(20), 0(4)>
>>>>>> # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
>>>>>> if (prephitmp_12 != 0)
>>>>>> goto <bb 6>; [33.00%]
>>>>>> else
>>>>>> goto <bb 7>; [67.00%]
>>>>>>
>>>>>> - <bb 6> [local count: 315357972]:
>>>>>> + <bb 6> [local count: 104068130]:
>>>>>> _2 = do_something ();
>>>>>> ga = _2;
>>>>>>
>>>>>> - <bb 7> [local count: 955630225]:
>>>>>> + <bb 7> [local count: 315357973]:
>>>>>> # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
>>>>>> i_10 = inc (i_13);
>>>>>> if (n_7(D) > i_10)
>>>>>> goto <bb 21>; [89.00%]
>>>>>> else
>>>>>> goto <bb 11>; [11.00%]
>>>>>>
>>>>>> <bb 11> [local count: 105119324]:
>>>>>> goto <bb 3>; [100.00%]
>>>>>>
>>>>>> - <bb 21> [local count: 850510901]:
>>>>>> + <bb 21> [local count: 280668596]:
>>>>>> if (prephitmp_12 != 0)
>>>>>> - goto <bb 20>; [100.00%]
>>>>>> + goto <bb 20>; [33.00%]
>>>>>> else
>>>>>> - goto <bb 19>; [INV]
>>>>>> + goto <bb 19>; [67.00%]
>>>>>>
>>>>>> - <bb 20> [local count: 850510901]:
>>>>>> + <bb 20> [local count: 280668596]:
>>>>>> goto <bb 5>; [100.00%]
>>>>>>
>>>>>> - <bb 19> [count: 0]:
>>>>>> + <bb 19> [local count: 70429947]:
>>>>>> # i_23 = PHI <i_10(21)>
>>>>>> # prephitmp_25 = PHI <prephitmp_5(21)>
>>>>>>
>>>>>> - <bb 12> [local count: 955630225]:
>>>>>> + <bb 12> [local count: 640272252]:
>>>>>> # i_15 = PHI <i_23(19), i_22(16)>
>>>>>> # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
>>>>>> i_22 = inc (i_15);
>>>>>> if (n_7(D) > i_22)
>>>>>> goto <bb 16>; [89.00%]
>>>>>> else
>>>>>> goto <bb 11>; [11.00%]
>>>>>>
>>>>>> - <bb 16> [local count: 850510901]:
>>>>>> + <bb 16> [local count: 569842305]:
>>>>>> goto <bb 12>; [100.00%]
>>>>>>
>>>>>> }
>>>>>>
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>> * tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
>>>>>> (do_split_loop_on_cond): Likewise.
>>>>>> ---
>>>>>> gcc/tree-ssa-loop-split.c | 16 ++++++++--------
>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>
>>>>>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>>>>>> index 3a09bbc39e5..8e5a7ded0f7 100644
>>>>>> --- a/gcc/tree-ssa-loop-split.c
>>>>>> +++ b/gcc/tree-ssa-loop-split.c
>>>>>> @@ -583,10 +583,10 @@ split_loop (class loop *loop1)
>>>>>> basic_block cond_bb;
>>>>
>>>> if (!initial_true)
>>>> - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
>>>> + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
>>>> +
>>>> + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
>>>> + ? EDGE_SUCC (bbs[i], 0)
>>>> + : EDGE_SUCC (bbs[i], 1);
>>>>
>>>>>>
>>>>>> class loop *loop2 = loop_version (loop1, cond, &cond_bb,
>>>>>> - profile_probability::always (),
>>>>>> - profile_probability::always (),
>>>>>> - profile_probability::always (),
>>>>>> - profile_probability::always (),
>>>>>> + true_edge->probability,
>>>>>> + true_edge->probability.invert (),
>>>>>> + true_edge->probability,
>>>>>> + true_edge->probability.invert (),
>>>>>> true);
>>>>>
>>>>> there is no 'true_edge' variable at this point.
>>>>
>>>> Sorry, missed the above hunk when split the patch.
>>>>
>>>>>
>>>>>> gcc_assert (loop2);
>>>>>>
>>>>>> @@ -1486,10 +1486,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch)
>>>>>> initialize_original_copy_tables ();
>>>>>>
>>>>>> struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
>>>>>> - profile_probability::always (),
>>>>>> - profile_probability::never (),
>>>>>> - profile_probability::always (),
>>>>>> - profile_probability::always (),
>>>>>> + invar_branch->probability.invert (),
>>>>>> + invar_branch->probability,
>>>>>> + invar_branch->probability.invert (),
>>>>>> + invar_branch->probability,
>>>>>> true);
>>>>>> if (!loop2)
>>>>>> {
>>>>>
>>>>> The patch introduction seems to talk about do_split_loop_on_cond only.
>>>>
>>>> split_loop faces similar issue though it sets the two branches to 100% vs 100%
>>>> and no scaling which seems also incorrect.
>>>>
>>>>> Since loop versioning inserts a condition with the passed probabilities
>>>>> but in this case a 'boolean_true_node' condition the then and else
>>>>> probabilities passed look correct. It's just the scaling arguments
>>>>> that look wrong? This loop_version call should get a comment as to
>>>>> why we are passing probabilities the way we do.
>>>>
>>>> This optimization is transforming from:
>>>>
>>>> for (i = 0; i < n; i = inc (i))
>>>> {
>>>> if (ga)
>>>> ga = do_something ();
>>>> }
>>>>
>>>> to:
>>>>
>>>> for (i = 0; i < x; i = inc (i))
>>>> {
>>>> if (true)
>>>> ga = do_something ();
>>>> if (!ga)
>>>> break;
>>>> }
>>>> for (; i < n; i = inc (i))
>>>> {
>>>> if (false)
>>>> ga = do_something ();
>>>> }
>>>>
>>>>
>>>> `boolean_true_node` is passed in to make the first loop's condition
>>>> statement to be 'true', after returning from loop_version, there is a
>>>> piece of code forcing the condition in second loop to be false,
>>>> and the original condition is moved from *in loop* to *exit edge*
>>>> between loop1 and loop2.
>>>
>>> Yes, one complication is that we use loop_version but do not retain
>>> the CFG it creates. Something like the vectorizers
>>> slpeel_tree_duplicate_loop_to_edge_cfg would be a better "fit"
>>> but then that code doesn't do any scaling yet. But then
>>> loop_version uses cfg_hook_duplicate_loop_to_header_edge and I suppose
>>> we could write a variant that simply doesn't mangle the CFG
>>> with a new condition switching between both loops but keeps them
>>> executed after each other ...
>>>
>>> As said, this adds to the confusion and some awkwardness.
>>
>> Then loop_version in loop split requires two types of variant, one
>> is to insert condition to loop preheader for 'split_loops' usage,
>> another is to insert condition to loop exit for 'do_split_loop_on_cond'
>> usage, it needs one extra function to encapsulate these cfg alterations
>> from outside to inside.
>>
>> unswitching only execute one loop as it only moves the invariant condition
>> to first loop's pre-header. While 'split_loops' and 'do_split_loop_on_cond'
>> may execute both loops no matter the condition is moved to the first loop's
>> preheader or exit.
>>
>> The condition stmt in loop unswitching is invariant, but it is variant
>> in loop splitting, that's why loop unswitching execute only one loop
>> but loop splitting executes both loops.
>>
>> Seems we need two condition arguments for loop_version, one for connecting
>> loop1 preheader to loop2 preheader, another one for connecting loop1's exit
>> to loop2's header? Then it will be more generic for both unswitching pass
>> and splitting pass. Looks a bit complicated and conflicted with loop_version's
>> comments:
>>
>> /* Main entry point for Loop Versioning transformation.
>>
>> This transformation given a condition and a loop, creates
>> -if (condition) { loop_copy1 } else { loop_copy2 }, ... */
>>
>>
>> And this only works for loop split usage, those many other places
>> doesn't use loop_version like this?
>
> Yes, as said if you don't want the above CFG then you probably
> shouldn't use loop_version but instead use its building blocks
> (and some refactoring of loop_version can make that easier).
>
> I think splitting wants
>
> loop_copy1
> if (condition)
> loop_copy2
>
> IMHO it would be good to split 'loopify' into the actual loopification
> and the scaling. Basically make the part of loop_version that
> copies the loop on the header edge and creates a loop structure for
> it separate.
>
> loop distribution uses slpeel_tree_duplicate_loop_to_edge_cfg
> (copy_loop_before).
>
Unfortunately slpeel_tree_duplicate_loop_to_edge_cfg only supports
copying loops with single exit, it would cause many ICEs in it even
building GCC stage1 (line 1065, line 1184 due to exit or new_exit
is NULL returning from single_exit (loop).). Seems loop_version is
more flexible for loop split.
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/tree-vect-loop-manip.c;h=4988c93fdb61507a26430651b416ae61b217793a;hb=HEAD
--
Thanks,
Xionghu
next prev parent reply other threads:[~2021-09-22 8:40 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-03 8:58 Xionghu Luo
2021-08-04 2:42 ` Xionghu Luo
2021-08-06 11:46 ` Richard Biener
2021-08-09 2:37 ` Xionghu Luo
2021-08-10 14:47 ` Richard Biener
2021-08-11 3:03 ` Feng Xue OS
2021-10-26 13:05 ` Jan Hubicka
2021-10-27 1:42 ` Xionghu Luo
2021-08-11 8:32 ` Xionghu Luo
2021-08-11 9:16 ` Richard Biener
2021-08-12 3:24 ` Xionghu Luo
2021-09-22 8:40 ` Xionghu Luo [this message]
2021-09-23 12:17 ` Richard Biener
2021-10-15 5:51 ` Xionghu Luo
2021-10-21 8:43 ` Xionghu Luo
2021-10-21 10:55 ` Richard Biener
2021-10-26 5:40 ` Xionghu Luo
2021-10-26 11:59 ` Richard Biener
2021-10-26 12:19 ` Jan Hubicka
2021-08-09 4:33 ` Feng Xue OS
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b7fc9c5b-bc81-5688-2472-c614210378be@linux.ibm.com \
--to=luoxhu@linux.ibm.com \
--cc=fxue@os.amperecomputing.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=guojiufu@linux.ibm.com \
--cc=hubicka@ucw.cz \
--cc=linkw@gcc.gnu.org \
--cc=rguenther@suse.de \
--cc=segher@kernel.crashing.org \
--cc=wschmidt@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).