From: Xionghu Luo <luoxhu@linux.ibm.com>
To: Jan Hubicka <hubicka@kam.mff.cuni.cz>,
Richard Biener <rguenther@suse.de>
Cc: wschmidt@linux.ibm.com, dje.gcc@gmail.com,
gcc-patches@gcc.gnu.org, linkw@gcc.gnu.org,
segher@kernel.crashing.org
Subject: Re: [PATCH v3 1/4] Fix loop split incorrect count and probability
Date: Wed, 24 Nov 2021 13:11:43 +0800 [thread overview]
Message-ID: <517d9a55-adf8-3826-9d41-a9639e6091a7@linux.ibm.com> (raw)
In-Reply-To: <6ba3c6d7-4f9f-1ffe-cab5-23369bc4cf13@linux.ibm.com>
Gentle ping, thanks.
[PATCH v3] Fix loop split incorrect count and probability
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583626.html
On 2021/11/8 14:09, Xionghu Luo via Gcc-patches wrote:
>
>
> On 2021/10/27 15:44, Jan Hubicka wrote:
>>> On Wed, 27 Oct 2021, Jan Hubicka wrote:
>>>
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> * tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
>>>>> (do_split_loop_on_cond): Likewise.
>>>>> ---
>>>>> gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++---------
>>>>> 1 file changed, 16 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>>>>> index 3f6ad046623..d30782888f3 100644
>>>>> --- a/gcc/tree-ssa-loop-split.c
>>>>> +++ b/gcc/tree-ssa-loop-split.c
>>>>> @@ -575,7 +575,11 @@ split_loop (class loop *loop1)
>>>>> stmts2);
>>>>> tree cond = build2 (guard_code, boolean_type_node, guard_init, border);
>>>>> if (!initial_true)
>>>>> - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
>>>>> + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
>>>>> +
>>>>> + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
>>>>> + ? EDGE_SUCC (bbs[i], 0)
>>>>> + : EDGE_SUCC (bbs[i], 1);
>>>>>
>>>>> /* Now version the loop, placing loop2 after loop1 connecting
>>>>> them, and fix up SSA form for that. */
>>>>> @@ -583,10 +587,10 @@ split_loop (class loop *loop1)
>>>>> basic_block cond_bb;
>>>>>
>>>>> class loop *loop2 = loop_version (loop1, cond, &cond_bb,
>>>>> - profile_probability::always (),
>>>>> - profile_probability::always (),
>>>>> - profile_probability::always (),
>>>>> - profile_probability::always (),
>>>>> + true_edge->probability,
>>>>> + true_edge->probability.invert (),
>>>>> + true_edge->probability,
>>>>> + true_edge->probability.invert (),
>>>>> true);
>>>>
>>>> As discussed yesterday, for loop of form
>>>>
>>>> for (...)
>>>> if (cond)
>>>> cond = something();
>>>> else
>>>> something2
>>>>
>>>> Split as
>>>
>>> Note that you are missing to conditionalize loop1 execution
>>> on 'cond' (not sure if that makes a difference).
>> You are right - forgot to mention that.
>>
>> Entry conditional makes no difference on scaling stmts inside loop but
>> affects its header and expected trip count. We however need to set up
>> probability of this conditional (and preheader count if it exists)
>> There is no general way to read the probability of this initial
>> conditional from cfg profile. So I guess we are stuck with guessing
>> some arbitrary value. I guess common case is that cond is true first
>> iteration tough and often we can easily see that fromo PHI node
>> initializing the test variable.
>>
>> Other thing that changes is expected number of iterations of the split
>> loops, so we may want to update the exit conditinal probability
>> accordingly...
>>
> Sorry for the late reply. The below updated patch mainly solves the issues
> you pointed out:
> - profile count proportion for both original loop and copied loop
> without dropping down the true branch's count;
> - probability update in the two loops and between the two loops;
> - number of iterations update/check for split_loop.
>
>
> [PATCH v3] Fix loop split incorrect count and probability
>
> In tree-ssa-loop-split.c, split_loop and split_loop_on_cond does two
> kind of split. split_loop only works for single loop and insert edge at
> exit when split, while split_loop_on_cond is not limited to single loop
> and insert edge at latch when split. Both split behavior should consider
> loop count and probability update. For split_loop, loop split condition
> is moved in front of loop1 and loop2; But split_loop_on_cond moves the
> condition between loop1 and loop2, this patch does:
> 1) profile count proportion for both original loop and copied loop
> without dropping down the true branch's count;
> 2) probability update in and between the two loops;
> 3) number of iterations update for split_loop.
>
> Regression tested pass, OK for master?
>
> Changes diff for split_loop and split_loop_on_cond cases:
>
> 1) diff base/loop-split.c.151t.lsplit patched/loop-split.c.152t.lsplit
> ...
> <bb 2> [local count: 118111600]:
> if (beg_5(D) < end_8(D))
> goto <bb 14>; [89.00%]
> else
> goto <bb 6>; [11.00%]
>
> <bb 14> [local count: 105119324]:
> if (beg2_6(D) < c_9(D))
> - goto <bb 15>; [100.00%]
> + goto <bb 15>; [33.00%]
> else
> - goto <bb 16>; [100.00%]
> + goto <bb 16>; [67.00%]
>
> - <bb 15> [local count: 105119324]:
> + <bb 15> [local count: 34689377]:
> _25 = beg_5(D) + 1;
> _26 = end_8(D) - beg_5(D);
> _27 = beg2_6(D) + _26;
> _28 = MIN_EXPR <c_9(D), _27>;
>
> - <bb 3> [local count: 955630225]:
> + <bb 3> [local count: 315357973]:
> # i_16 = PHI <i_11(8), beg_5(D)(15)>
> # j_17 = PHI <j_12(8), beg2_6(D)(15)>
> printf ("a: %d %d\n", i_16, j_17);
> i_11 = i_16 + 1;
> j_12 = j_17 + 1;
> if (j_12 < _28)
> - goto <bb 8>; [89.00%]
> + goto <bb 8>; [29.37%]
> else
> - goto <bb 17>; [11.00%]
> + goto <bb 17>; [70.63%]
>
> - <bb 8> [local count: 850510901]:
> + <bb 8> [local count: 280668596]:
> goto <bb 3>; [100.00%]
>
> - <bb 16> [local count: 105119324]:
> + <bb 16> [local count: 70429947]:
> # i_22 = PHI <beg_5(D)(14), i_29(17)>
> # j_23 = PHI <beg2_6(D)(14), j_30(17)>
>
> <bb 10> [local count: 955630225]:
> # i_2 = PHI <i_22(16), i_20(13)>
> # j_1 = PHI <j_23(16), j_21(13)>
> i_20 = i_2 + 1;
> j_21 = j_1 + 1;
> if (end_8(D) > i_20)
> - goto <bb 13>; [89.00%]
> + goto <bb 13>; [59.63%]
> else
> - goto <bb 9>; [11.00%]
> + goto <bb 9>; [40.37%]
>
> - <bb 13> [local count: 850510901]:
> + <bb 13> [local count: 569842305]:
> goto <bb 10>; [100.00%]
>
> <bb 17> [local count: 105119324]:
> # i_29 = PHI <i_11(3)>
> # j_30 = PHI <j_12(3)>
> if (end_8(D) > i_29)
> goto <bb 16>; [80.00%]
> else
> goto <bb 9>; [20.00%]
>
> <bb 9> [local count: 105119324]:
>
> <bb 6> [local count: 118111600]:
> return 0;
>
> }
>
> 2) diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit:
> ...
> <bb 2> [local count: 118111600]:
> if (n_7(D) > 0)
> goto <bb 4>; [89.00%]
> else
> goto <bb 3>; [11.00%]
>
> <bb 3> [local count: 118111600]:
> return;
>
> <bb 4> [local count: 105119324]:
> pretmp_3 = ga;
>
> - <bb 5> [local count: 955630225]:
> + <bb 5> [local count: 315357973]:
> # i_13 = PHI <i_10(20), 0(4)>
> # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
> if (prephitmp_12 != 0)
> goto <bb 6>; [33.00%]
> else
> goto <bb 7>; [67.00%]
>
> <bb 6> [local count: 315357972]:
> _2 = do_something ();
> ga = _2;
>
> - <bb 7> [local count: 955630225]:
> + <bb 7> [local count: 315357973]:
> # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
> i_10 = inc (i_13);
> if (n_7(D) > i_10)
> goto <bb 21>; [89.00%]
> else
> goto <bb 11>; [11.00%]
>
> <bb 11> [local count: 105119324]:
> goto <bb 3>; [100.00%]
>
> - <bb 21> [local count: 850510901]:
> + <bb 21> [local count: 280668596]:
> if (prephitmp_12 != 0)
> - goto <bb 20>; [100.00%]
> + goto <bb 20>; [33.00%]
> else
> - goto <bb 19>; [INV]
> + goto <bb 19>; [67.00%]
>
> - <bb 20> [local count: 850510901]:
> + <bb 20> [local count: 280668596]:
> goto <bb 5>; [100.00%]
>
> - <bb 19> [count: 0]:
> + <bb 19> [local count: 70429947]:
> # i_23 = PHI <i_10(21)>
> # prephitmp_25 = PHI <prephitmp_5(21)>
>
> - <bb 12> [local count: 955630225]:
> + <bb 12> [local count: 640272252]:
> # i_15 = PHI <i_23(19), i_22(16)>
> # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
> i_22 = inc (i_15);
> if (n_7(D) > i_22)
> - goto <bb 16>; [89.00%]
> + goto <bb 16>; [59.63%]
> else
> - goto <bb 11>; [11.00%]
> + goto <bb 11>; [40.37%]
>
> - <bb 16> [local count: 850510901]:
> + <bb 16> [local count: 569842305]:
> goto <bb 12>; [100.00%]
>
> }
>
> gcc/ChangeLog:
>
> * tree-ssa-loop-split.c (split_loop): Fix incorrect
> profile_count and probability.
> (do_split_loop_on_cond): Likewise.
> ---
> gcc/tree-ssa-loop-split.c | 110 +++++++++++++++++++++++++++++++++++---
> 1 file changed, 102 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> index 3f6ad046623..102766241fb 100644
> --- a/gcc/tree-ssa-loop-split.c
> +++ b/gcc/tree-ssa-loop-split.c
> @@ -575,7 +575,10 @@ split_loop (class loop *loop1)
> stmts2);
> tree cond = build2 (guard_code, boolean_type_node, guard_init, border);
> if (!initial_true)
> - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
> + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
> +
> + edge true_edge, false_edge;
> + extract_true_false_edges_from_block (bbs[i], &true_edge, &false_edge);
>
> /* Now version the loop, placing loop2 after loop1 connecting
> them, and fix up SSA form for that. */
> @@ -583,11 +586,11 @@ split_loop (class loop *loop1)
> basic_block cond_bb;
>
> class loop *loop2 = loop_version (loop1, cond, &cond_bb,
> - profile_probability::always (),
> - profile_probability::always (),
> - profile_probability::always (),
> - profile_probability::always (),
> - true);
> + true_edge->probability,
> + true_edge->probability.invert (),
> + profile_probability::always (),
> + profile_probability::always (),
> + true);
> gcc_assert (loop2);
>
> edge new_e = connect_loops (loop1, loop2);
> @@ -607,6 +610,53 @@ split_loop (class loop *loop1)
> tree guard_next = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop1));
> patch_loop_exit (loop1, guard_stmt, guard_next, newend, initial_true);
>
> + /* Check first loop's number of iterations. */
> + update_ssa (TODO_update_ssa);
> + gcc_assert (number_of_iterations_exit (loop1, single_exit (loop1),
> + &niter, false, true));
> +
> + /* Proportion first loop's bb counts except those dominated by true
> + branch to avoid drop 1s down. */
> + basic_block *bbs1, *bbs2;
> + bbs1 = get_loop_body (loop1);
> + unsigned j;
> + for (j = 0; j < loop1->num_nodes; j++)
> + if (bbs1[j] == loop1->latch
> + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest))
> + bbs1[j]->count
> + = bbs1[j]->count.apply_probability (true_edge->probability);
> + free (bbs1);
> +
> + /* Fix first loop's exit probability after scaling. */
> + edge exit_to_latch1 = single_pred_edge (loop1->latch);
> + exit_to_latch1->probability = exit_to_latch1->probability.apply_scale (
> + true_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE);
> + single_exit (loop1)->probability
> + = exit_to_latch1->probability.invert ();
> +
> + /* Check second loop's number of iterations. */
> + class tree_niter_desc niter2;
> + gcc_assert (number_of_iterations_exit (loop2, single_exit (loop2),
> + &niter2, false, true));
> +
> + /* Proportion second loop's bb counts except those dominated by false
> + branch to avoid drop 1s down. */
> + basic_block bbi_copy = get_bb_copy (false_edge->dest);
> + bbs2 = get_loop_body (loop2);
> + for (j = 0; j < loop2->num_nodes; j++)
> + if (bbs2[j] == loop2->latch
> + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy))
> + bbs2[j]->count = bbs2[j]->count.apply_probability (
> + true_edge->probability.invert ());
> + free (bbs2);
> +
> + /* Fix second loop's exit probability after scaling. */
> + edge exit_to_latch2 = single_pred_edge (loop2->latch);
> + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale (
> + false_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE);
> + single_exit (loop2)->probability
> + = exit_to_latch2->probability.invert ();
> +
> /* Finally patch out the two copies of the condition to be always
> true/false (or opposite). */
> gcond *force_true = as_a<gcond *> (last_stmt (bbs[i]));
> @@ -1486,8 +1536,8 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch)
> initialize_original_copy_tables ();
>
> struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL,
> - profile_probability::always (),
> - profile_probability::never (),
> + invar_branch->probability.invert (),
> + invar_branch->probability,
> profile_probability::always (),
> profile_probability::always (),
> true);
> @@ -1535,6 +1585,50 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch)
> between loop1 and loop2. */
> connect_loop_phis (loop1, loop2, to_loop2);
>
> + update_ssa (TODO_update_ssa);
> +
> + edge true_edge, false_edge, skip_edge1, skip_edge2;
> + extract_true_false_edges_from_block (cond_bb, &true_edge, &false_edge);
> +
> + /* Proportion first loop's bb counts except those dominated by true
> + branch to avoid drop 1s down. */
> + skip_edge1 = true_invar ? false_edge : true_edge;
> + skip_edge2 = true_invar ? true_edge : false_edge;
> + basic_block *bbs1, *bbs2;
> + bbs1 = get_loop_body (loop1);
> + unsigned j;
> + for (j = 0; j < loop1->num_nodes; j++)
> + if (bbs1[j] == loop1->latch
> + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], skip_edge1->dest))
> + bbs1[j]->count
> + = bbs1[j]->count.apply_probability (skip_edge1->probability);
> + free (bbs1);
> +
> + /* Fix first loop's exit probability after scaling. */
> + to_loop1->probability = invar_branch->probability.invert ();
> + to_loop2->probability = invar_branch->probability;
> +
> + /* Proportion second loop's bb counts except those dominated by false
> + branch to avoid drop 1s down. */
> + basic_block bbi_copy = get_bb_copy (skip_edge2->dest);
> + bbs2 = get_loop_body (loop2);
> + for (j = 0; j < loop2->num_nodes; j++)
> + if (bbs2[j] == loop2->latch
> + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy))
> + bbs2[j]->count
> + = bbs2[j]->count.apply_probability (skip_edge2->probability);
> + free (bbs2);
> +
> + /* Fix second loop's exit probability after scaling. */
> + edge loop2_latch_exit;
> + edge exit_to_latch2 = single_pred_edge (loop2->latch);
> + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale (
> + skip_edge2->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE);
> + loop2_latch_exit = EDGE_SUCC (exit_to_latch2->src, 0) == exit_to_latch2
> + ? EDGE_SUCC (exit_to_latch2->src, 1)
> + : EDGE_SUCC (exit_to_latch2->src, 0);
> + loop2_latch_exit->probability = exit_to_latch2->probability.invert ();
> +
> free_original_copy_tables ();
>
> return true;
>
--
Thanks,
Xionghu
next prev parent reply other threads:[~2021-11-24 5:11 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-27 6:34 [PATCH v2 0/4] loop split fix and functions renaming Xionghu Luo
2021-10-27 6:34 ` [PATCH v2 1/4] Fix loop split incorrect count and probability Xionghu Luo
2021-10-27 7:07 ` Jan Hubicka
2021-10-27 7:23 ` Jan Hubicka
2021-10-27 7:29 ` Richard Biener
2021-10-27 7:44 ` Jan Hubicka
2021-11-08 6:09 ` [PATCH v3 " Xionghu Luo
2021-11-24 5:11 ` Xionghu Luo [this message]
2021-10-27 6:34 ` [PATCH v2 2/4] Refactor loop_version Xionghu Luo
2021-10-29 11:52 ` Richard Biener
2021-11-01 5:28 ` Xionghu Luo
2021-10-27 6:34 ` [PATCH v2 3/4] Rename loop_version to clone_loop_to_header_edge Xionghu Luo
2021-11-03 13:36 ` Richard Biener
2021-10-27 6:34 ` [PATCH v2 4/4] Rename duplicate_loop_to_header_edge to duplicate_loop_body_to_header_edge Xionghu Luo
2021-10-29 11:50 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=517d9a55-adf8-3826-9d41-a9639e6091a7@linux.ibm.com \
--to=luoxhu@linux.ibm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hubicka@kam.mff.cuni.cz \
--cc=linkw@gcc.gnu.org \
--cc=rguenther@suse.de \
--cc=segher@kernel.crashing.org \
--cc=wschmidt@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).