From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 678CF3858401; Mon, 8 Nov 2021 06:09:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 678CF3858401 Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A82Yu2i010736; Mon, 8 Nov 2021 06:09:48 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3c641s829v-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 08 Nov 2021 06:09:47 +0000 Received: from m0098399.ppops.net (m0098399.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 1A85tFvq002677; Mon, 8 Nov 2021 06:09:47 GMT Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0a-001b2d01.pphosted.com with ESMTP id 3c641s8299-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 08 Nov 2021 06:09:47 +0000 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 1A868sjc004271; Mon, 8 Nov 2021 06:09:44 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma04ams.nl.ibm.com with ESMTP id 3c5hb9t4n1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 08 Nov 2021 06:09:44 +0000 Received: from d06av24.portsmouth.uk.ibm.com (d06av24.portsmouth.uk.ibm.com [9.149.105.60]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 1A869goD63242554 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 8 Nov 2021 06:09:42 GMT Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1EFDE4204F; Mon, 8 Nov 2021 06:09:42 +0000 (GMT) Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DD70442041; Mon, 8 Nov 2021 06:09:39 +0000 (GMT) Received: from [9.200.154.17] (unknown [9.200.154.17]) by d06av24.portsmouth.uk.ibm.com (Postfix) with ESMTPS; Mon, 8 Nov 2021 06:09:39 +0000 (GMT) Message-ID: <6ba3c6d7-4f9f-1ffe-cab5-23369bc4cf13@linux.ibm.com> Date: Mon, 8 Nov 2021 14:09:35 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.2.1 Subject: Re: [PATCH v3 1/4] Fix loop split incorrect count and probability Content-Language: en-US To: Jan Hubicka , Richard Biener Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org, wschmidt@linux.ibm.com, linkw@gcc.gnu.org, dje.gcc@gmail.com References: <20211027063448.1844771-1-luoxhu@linux.ibm.com> <20211027063448.1844771-2-luoxhu@linux.ibm.com> <20211027070732.GA57414@kam.mff.cuni.cz> <55673non-925q-qq24-n1n-q8nro6q752n@fhfr.qr> <20211027074454.GC57414@kam.mff.cuni.cz> From: Xionghu Luo In-Reply-To: <20211027074454.GC57414@kam.mff.cuni.cz> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: -2wKTPGSXN6NmsBHdPGaXPzb9Mk9lf9l X-Proofpoint-ORIG-GUID: LL_IxFeQisZ3MK0SgBZhC6LnLzRNFeE- X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-11-08_01,2021-11-03_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 spamscore=0 bulkscore=0 clxscore=1015 mlxscore=0 malwarescore=0 impostorscore=0 mlxlogscore=999 priorityscore=1501 adultscore=0 lowpriorityscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111080040 X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Nov 2021 06:09:52 -0000 On 2021/10/27 15:44, Jan Hubicka wrote: >> On Wed, 27 Oct 2021, Jan Hubicka wrote: >> >>>> >>>> gcc/ChangeLog: >>>> >>>> * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. >>>> (do_split_loop_on_cond): Likewise. >>>> --- >>>> gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++--------- >>>> 1 file changed, 16 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c >>>> index 3f6ad046623..d30782888f3 100644 >>>> --- a/gcc/tree-ssa-loop-split.c >>>> +++ b/gcc/tree-ssa-loop-split.c >>>> @@ -575,7 +575,11 @@ split_loop (class loop *loop1) >>>> stmts2); >>>> tree cond = build2 (guard_code, boolean_type_node, guard_init, border); >>>> if (!initial_true) >>>> - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); >>>> + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); >>>> + >>>> + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE >>>> + ? EDGE_SUCC (bbs[i], 0) >>>> + : EDGE_SUCC (bbs[i], 1); >>>> >>>> /* Now version the loop, placing loop2 after loop1 connecting >>>> them, and fix up SSA form for that. */ >>>> @@ -583,10 +587,10 @@ split_loop (class loop *loop1) >>>> basic_block cond_bb; >>>> >>>> class loop *loop2 = loop_version (loop1, cond, &cond_bb, >>>> - profile_probability::always (), >>>> - profile_probability::always (), >>>> - profile_probability::always (), >>>> - profile_probability::always (), >>>> + true_edge->probability, >>>> + true_edge->probability.invert (), >>>> + true_edge->probability, >>>> + true_edge->probability.invert (), >>>> true); >>> >>> As discussed yesterday, for loop of form >>> >>> for (...) >>> if (cond) >>> cond = something(); >>> else >>> something2 >>> >>> Split as >> >> Note that you are missing to conditionalize loop1 execution >> on 'cond' (not sure if that makes a difference). > You are right - forgot to mention that. > > Entry conditional makes no difference on scaling stmts inside loop but > affects its header and expected trip count. We however need to set up > probability of this conditional (and preheader count if it exists) > There is no general way to read the probability of this initial > conditional from cfg profile. So I guess we are stuck with guessing > some arbitrary value. I guess common case is that cond is true first > iteration tough and often we can easily see that fromo PHI node > initializing the test variable. > > Other thing that changes is expected number of iterations of the split > loops, so we may want to update the exit conditinal probability > accordingly... > Sorry for the late reply. The below updated patch mainly solves the issues you pointed out: - profile count proportion for both original loop and copied loop without dropping down the true branch's count; - probability update in the two loops and between the two loops; - number of iterations update/check for split_loop. [PATCH v3] Fix loop split incorrect count and probability In tree-ssa-loop-split.c, split_loop and split_loop_on_cond does two kind of split. split_loop only works for single loop and insert edge at exit when split, while split_loop_on_cond is not limited to single loop and insert edge at latch when split. Both split behavior should consider loop count and probability update. For split_loop, loop split condition is moved in front of loop1 and loop2; But split_loop_on_cond moves the condition between loop1 and loop2, this patch does: 1) profile count proportion for both original loop and copied loop without dropping down the true branch's count; 2) probability update in and between the two loops; 3) number of iterations update for split_loop. Regression tested pass, OK for master? Changes diff for split_loop and split_loop_on_cond cases: 1) diff base/loop-split.c.151t.lsplit patched/loop-split.c.152t.lsplit ... [local count: 118111600]: if (beg_5(D) < end_8(D)) goto ; [89.00%] else goto ; [11.00%] [local count: 105119324]: if (beg2_6(D) < c_9(D)) - goto ; [100.00%] + goto ; [33.00%] else - goto ; [100.00%] + goto ; [67.00%] - [local count: 105119324]: + [local count: 34689377]: _25 = beg_5(D) + 1; _26 = end_8(D) - beg_5(D); _27 = beg2_6(D) + _26; _28 = MIN_EXPR ; - [local count: 955630225]: + [local count: 315357973]: # i_16 = PHI # j_17 = PHI printf ("a: %d %d\n", i_16, j_17); i_11 = i_16 + 1; j_12 = j_17 + 1; if (j_12 < _28) - goto ; [89.00%] + goto ; [29.37%] else - goto ; [11.00%] + goto ; [70.63%] - [local count: 850510901]: + [local count: 280668596]: goto ; [100.00%] - [local count: 105119324]: + [local count: 70429947]: # i_22 = PHI # j_23 = PHI [local count: 955630225]: # i_2 = PHI # j_1 = PHI i_20 = i_2 + 1; j_21 = j_1 + 1; if (end_8(D) > i_20) - goto ; [89.00%] + goto ; [59.63%] else - goto ; [11.00%] + goto ; [40.37%] - [local count: 850510901]: + [local count: 569842305]: goto ; [100.00%] [local count: 105119324]: # i_29 = PHI # j_30 = PHI if (end_8(D) > i_29) goto ; [80.00%] else goto ; [20.00%] [local count: 105119324]: [local count: 118111600]: return 0; } 2) diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit: ... [local count: 118111600]: if (n_7(D) > 0) goto ; [89.00%] else goto ; [11.00%] [local count: 118111600]: return; [local count: 105119324]: pretmp_3 = ga; - [local count: 955630225]: + [local count: 315357973]: # i_13 = PHI # prephitmp_12 = PHI if (prephitmp_12 != 0) goto ; [33.00%] else goto ; [67.00%] [local count: 315357972]: _2 = do_something (); ga = _2; - [local count: 955630225]: + [local count: 315357973]: # prephitmp_5 = PHI i_10 = inc (i_13); if (n_7(D) > i_10) goto ; [89.00%] else goto ; [11.00%] [local count: 105119324]: goto ; [100.00%] - [local count: 850510901]: + [local count: 280668596]: if (prephitmp_12 != 0) - goto ; [100.00%] + goto ; [33.00%] else - goto ; [INV] + goto ; [67.00%] - [local count: 850510901]: + [local count: 280668596]: goto ; [100.00%] - [count: 0]: + [local count: 70429947]: # i_23 = PHI # prephitmp_25 = PHI - [local count: 955630225]: + [local count: 640272252]: # i_15 = PHI # prephitmp_16 = PHI i_22 = inc (i_15); if (n_7(D) > i_22) - goto ; [89.00%] + goto ; [59.63%] else - goto ; [11.00%] + goto ; [40.37%] - [local count: 850510901]: + [local count: 569842305]: goto ; [100.00%] } gcc/ChangeLog: * tree-ssa-loop-split.c (split_loop): Fix incorrect profile_count and probability. (do_split_loop_on_cond): Likewise. --- gcc/tree-ssa-loop-split.c | 110 +++++++++++++++++++++++++++++++++++--- 1 file changed, 102 insertions(+), 8 deletions(-) diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c index 3f6ad046623..102766241fb 100644 --- a/gcc/tree-ssa-loop-split.c +++ b/gcc/tree-ssa-loop-split.c @@ -575,7 +575,10 @@ split_loop (class loop *loop1) stmts2); tree cond = build2 (guard_code, boolean_type_node, guard_init, border); if (!initial_true) - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); + + edge true_edge, false_edge; + extract_true_false_edges_from_block (bbs[i], &true_edge, &false_edge); /* Now version the loop, placing loop2 after loop1 connecting them, and fix up SSA form for that. */ @@ -583,11 +586,11 @@ split_loop (class loop *loop1) basic_block cond_bb; class loop *loop2 = loop_version (loop1, cond, &cond_bb, - profile_probability::always (), - profile_probability::always (), - profile_probability::always (), - profile_probability::always (), - true); + true_edge->probability, + true_edge->probability.invert (), + profile_probability::always (), + profile_probability::always (), + true); gcc_assert (loop2); edge new_e = connect_loops (loop1, loop2); @@ -607,6 +610,53 @@ split_loop (class loop *loop1) tree guard_next = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop1)); patch_loop_exit (loop1, guard_stmt, guard_next, newend, initial_true); + /* Check first loop's number of iterations. */ + update_ssa (TODO_update_ssa); + gcc_assert (number_of_iterations_exit (loop1, single_exit (loop1), + &niter, false, true)); + + /* Proportion first loop's bb counts except those dominated by true + branch to avoid drop 1s down. */ + basic_block *bbs1, *bbs2; + bbs1 = get_loop_body (loop1); + unsigned j; + for (j = 0; j < loop1->num_nodes; j++) + if (bbs1[j] == loop1->latch + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) + bbs1[j]->count + = bbs1[j]->count.apply_probability (true_edge->probability); + free (bbs1); + + /* Fix first loop's exit probability after scaling. */ + edge exit_to_latch1 = single_pred_edge (loop1->latch); + exit_to_latch1->probability = exit_to_latch1->probability.apply_scale ( + true_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); + single_exit (loop1)->probability + = exit_to_latch1->probability.invert (); + + /* Check second loop's number of iterations. */ + class tree_niter_desc niter2; + gcc_assert (number_of_iterations_exit (loop2, single_exit (loop2), + &niter2, false, true)); + + /* Proportion second loop's bb counts except those dominated by false + branch to avoid drop 1s down. */ + basic_block bbi_copy = get_bb_copy (false_edge->dest); + bbs2 = get_loop_body (loop2); + for (j = 0; j < loop2->num_nodes; j++) + if (bbs2[j] == loop2->latch + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) + bbs2[j]->count = bbs2[j]->count.apply_probability ( + true_edge->probability.invert ()); + free (bbs2); + + /* Fix second loop's exit probability after scaling. */ + edge exit_to_latch2 = single_pred_edge (loop2->latch); + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale ( + false_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); + single_exit (loop2)->probability + = exit_to_latch2->probability.invert (); + /* Finally patch out the two copies of the condition to be always true/false (or opposite). */ gcond *force_true = as_a (last_stmt (bbs[i])); @@ -1486,8 +1536,8 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) initialize_original_copy_tables (); struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, - profile_probability::always (), - profile_probability::never (), + invar_branch->probability.invert (), + invar_branch->probability, profile_probability::always (), profile_probability::always (), true); @@ -1535,6 +1585,50 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) between loop1 and loop2. */ connect_loop_phis (loop1, loop2, to_loop2); + update_ssa (TODO_update_ssa); + + edge true_edge, false_edge, skip_edge1, skip_edge2; + extract_true_false_edges_from_block (cond_bb, &true_edge, &false_edge); + + /* Proportion first loop's bb counts except those dominated by true + branch to avoid drop 1s down. */ + skip_edge1 = true_invar ? false_edge : true_edge; + skip_edge2 = true_invar ? true_edge : false_edge; + basic_block *bbs1, *bbs2; + bbs1 = get_loop_body (loop1); + unsigned j; + for (j = 0; j < loop1->num_nodes; j++) + if (bbs1[j] == loop1->latch + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], skip_edge1->dest)) + bbs1[j]->count + = bbs1[j]->count.apply_probability (skip_edge1->probability); + free (bbs1); + + /* Fix first loop's exit probability after scaling. */ + to_loop1->probability = invar_branch->probability.invert (); + to_loop2->probability = invar_branch->probability; + + /* Proportion second loop's bb counts except those dominated by false + branch to avoid drop 1s down. */ + basic_block bbi_copy = get_bb_copy (skip_edge2->dest); + bbs2 = get_loop_body (loop2); + for (j = 0; j < loop2->num_nodes; j++) + if (bbs2[j] == loop2->latch + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) + bbs2[j]->count + = bbs2[j]->count.apply_probability (skip_edge2->probability); + free (bbs2); + + /* Fix second loop's exit probability after scaling. */ + edge loop2_latch_exit; + edge exit_to_latch2 = single_pred_edge (loop2->latch); + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale ( + skip_edge2->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); + loop2_latch_exit = EDGE_SUCC (exit_to_latch2->src, 0) == exit_to_latch2 + ? EDGE_SUCC (exit_to_latch2->src, 1) + : EDGE_SUCC (exit_to_latch2->src, 0); + loop2_latch_exit->probability = exit_to_latch2->probability.invert (); + free_original_copy_tables (); return true; -- 2.27.0.90.geebb51ba8c