From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id AF59D385802E; Tue, 21 Dec 2021 03:57:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AF59D385802E Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1BL228Su002048; Tue, 21 Dec 2021 03:57:39 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3d2y1a8s1g-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Dec 2021 03:57:39 +0000 Received: from m0098404.ppops.net (m0098404.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 1BL3eVmD025682; Tue, 21 Dec 2021 03:57:38 GMT Received: from ppma06ams.nl.ibm.com (66.31.33a9.ip4.static.sl-reverse.com [169.51.49.102]) by mx0a-001b2d01.pphosted.com with ESMTP id 3d2y1a8s12-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Dec 2021 03:57:38 +0000 Received: from pps.filterd (ppma06ams.nl.ibm.com [127.0.0.1]) by ppma06ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 1BL3sqs5031084; Tue, 21 Dec 2021 03:57:36 GMT Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by ppma06ams.nl.ibm.com with ESMTP id 3d16wjhwga-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 21 Dec 2021 03:57:35 +0000 Received: from d06av22.portsmouth.uk.ibm.com (d06av22.portsmouth.uk.ibm.com [9.149.105.58]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 1BL3vXaK43319678 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 21 Dec 2021 03:57:33 GMT Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9B7594C04A; Tue, 21 Dec 2021 03:57:33 +0000 (GMT) Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1B8DC4C044; Tue, 21 Dec 2021 03:57:31 +0000 (GMT) Received: from [9.197.245.13] (unknown [9.197.245.13]) by d06av22.portsmouth.uk.ibm.com (Postfix) with ESMTPS; Tue, 21 Dec 2021 03:57:30 +0000 (GMT) Message-ID: <73350ab6-375a-b832-e71e-f4c319ba6536@linux.ibm.com> Date: Tue, 21 Dec 2021 11:57:28 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.4.0 Subject: Re: [PATCH 3/3] Fix loop split incorrect count and probability Content-Language: en-US To: Jeff Law , gcc-patches@gcc.gnu.org Cc: hubicka@kam.mff.cuni.cz, dje.gcc@gmail.com, wschmidt@linux.ibm.com, linkw@gcc.gnu.org, segher@kernel.crashing.org References: <20211208055416.1415283-1-luoxhu@linux.ibm.com> <20211208055416.1415283-4-luoxhu@linux.ibm.com> <6328088e-a807-059a-2a1c-8024e2c35008@gmail.com> From: Xionghu Luo In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: kdXLSFom1suME2xRsmtl6yvGbVIznvyW X-Proofpoint-GUID: YRuKd9sqjjLMJuYIvX-95dYxrq2wSnAG X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.11.62.513 definitions=2021-12-21_01,2021-12-20_01,2021-12-02_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 priorityscore=1501 impostorscore=0 mlxscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 bulkscore=0 lowpriorityscore=0 clxscore=1015 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2112210013 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Dec 2021 03:57:42 -0000 On 2021/12/13 16:57, Xionghu Luo via Gcc-patches wrote: > > > On 2021/12/9 07:47, Jeff Law wrote: >>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c >>> index 3f6ad046623..33128061aab 100644 >>> --- a/gcc/tree-ssa-loop-split.c >>> +++ b/gcc/tree-ssa-loop-split.c >>> >>> @@ -607,6 +610,38 @@ split_loop (class loop *loop1) >>>       tree guard_next = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge >>> (loop1)); >>>       patch_loop_exit (loop1, guard_stmt, guard_next, newend, >>> initial_true); >>>   +    update_ssa (TODO_update_ssa); >>> + >>> +    /* Proportion first loop's bb counts except those dominated by true >>> +       branch to avoid drop 1s down.  */ >>> +    basic_block *bbs1, *bbs2; >>> +    bbs1 = get_loop_body (loop1); >>> +    unsigned j; >>> +    for (j = 0; j < loop1->num_nodes; j++) >>> +      if (bbs1[j] == loop1->latch >>> +          || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) >>> +        bbs1[j]->count >>> +          = bbs1[j]->count.apply_probability (true_edge->probability); >>> +    free (bbs1); >> It looks like there's two copies of this code in this patch, one in >> split_loop and the other in do_split_loop_on_cond.  Would it make sense >> to factor it out into its own little function? >> >> >>> + >>> +    /* Proportion second loop's bb counts except those dominated by >>> false >>> +       branch to avoid drop 1s down.  */ >>> +    basic_block bbi_copy = get_bb_copy (false_edge->dest); >>> +    bbs2 = get_loop_body (loop2); >>> +    for (j = 0; j < loop2->num_nodes; j++) >>> +      if (bbs2[j] == loop2->latch >>> +          || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) >>> +        bbs2[j]->count = bbs2[j]->count.apply_probability ( >>> +          true_edge->probability.invert ()); >>> +    free (bbs2); >> Similarly for this block of code. >> >> If those can be reasonably factored out into two helper functions to be >> called from split_loop and do_split_loop_on_cond, then this is OK with >> the refactoring. >> >> jeff > > > Thanks for the comments, updated as below. Will commit this patchset and the > approved patch for LIM if there are no objections: This patch is committed to r12-6086. > > > [PATCH v2 3/3] Fix loop split incorrect count and probability > > In tree-ssa-loop-split.c, split_loop and split_loop_on_cond does two > kind of split. split_loop only works for single loop and insert edge at > exit when split, while split_loop_on_cond is not limited to single loop > and insert edge at latch when split. Both split behavior should consider > loop count and probability update. For split_loop, loop split condition > is moved in front of loop1 and loop2; But split_loop_on_cond moves the > condition between loop1 and loop2, this patch does: > 1) profile count proportion for both original loop and copied loop > without dropping down the true branch's count; > 2) probability update in the two loops and between the two loops. > > Regression tested pass, OK for master? > > Changes diff for split_loop and split_loop_on_cond cases: > > 1) diff base/loop-split.c.151t.lsplit patched/loop-split.c.152t.lsplit > ... > [local count: 118111600]: > _1 = end_6(D) - beg_7(D); > j_9 = _1 + beg2_8(D); > if (end_6(D) > beg_7(D)) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 105119324]: > if (j_9 >= c_11(D)) > - goto ; [100.00%] > + goto ; [33.00%] > else > - goto ; [100.00%] > + goto ; [67.00%] > > - [local count: 105119324]: > + [local count: 34689377]: > _27 = end_6(D) + -1; > _28 = beg_7(D) - end_6(D); > _29 = j_9 + _28; > _30 = _29 + 1; > _31 = MAX_EXPR ; > > - [local count: 955630225]: > + [local count: 315357973]: > # i_18 = PHI > # j_19 = PHI > printf ("a: %d %d\n", i_18, j_19); > i_13 = i_18 + -1; > j_14 = j_19 + -1; > if (j_14 >= _31) > - goto ; [89.00%] > + goto ; [29.37%] > else > - goto ; [11.00%] > + goto ; [70.63%] > > - [local count: 850510901]: > + [local count: 280668596]: > goto ; [100.00%] > > - [local count: 105119324]: > + [local count: 70429947]: > # i_24 = PHI > # j_25 = PHI > > [local count: 955630225]: > # i_3 = PHI > # j_2 = PHI > i_22 = i_3 + -1; > j_23 = j_2 + -1; > if (beg_7(D) < i_22) > goto ; [89.00%] > else > goto ; [11.00%] > > - [local count: 850510901]: > + [local count: 569842305]: > goto ; [100.00%] > > [local count: 105119324]: > # i_32 = PHI > # j_33 = PHI > if (beg_7(D) < i_32) > goto ; [80.00%] > else > goto ; [20.00%] > > [local count: 105119324]: > > [local count: 118111600]: > return 0; > > } > > 2) diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit: > ... > [local count: 118111600]: > if (n_7(D) > 0) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 118111600]: > return; > > [local count: 105119324]: > pretmp_3 = ga; > > - [local count: 955630225]: > + [local count: 315357973]: > # i_13 = PHI > # prephitmp_12 = PHI > if (prephitmp_12 != 0) > goto ; [33.00%] > else > goto ; [67.00%] > > [local count: 315357972]: > _2 = do_something (); > ga = _2; > > - [local count: 955630225]: > + [local count: 315357973]: > # prephitmp_5 = PHI > i_10 = inc (i_13); > if (n_7(D) > i_10) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 105119324]: > goto ; [100.00%] > > - [local count: 850510901]: > + [local count: 280668596]: > if (prephitmp_12 != 0) > - goto ; [100.00%] > + goto ; [33.00%] > else > - goto ; [INV] > + goto ; [67.00%] > > - [local count: 850510901]: > + [local count: 280668596]: > goto ; [100.00%] > > - [count: 0]: > + [local count: 70429947]: > # i_23 = PHI > # prephitmp_25 = PHI > > - [local count: 955630225]: > + [local count: 640272252]: > # i_15 = PHI > # prephitmp_16 = PHI > i_22 = inc (i_15); > if (n_7(D) > i_22) > goto ; [89.00%] > else > goto ; [11.00%] > > - [local count: 850510901]: > + [local count: 569842305]: > goto ; [100.00%] > } > > gcc/ChangeLog: > > * tree-ssa-loop-split.c (Fix_loop_bb_probability): New function. > (split_loop): Fix incorrect profile_count and probability. > (do_split_loop_on_cond): Likewise. > --- > gcc/tree-ssa-loop-split.c | 72 ++++++++++++++++++++++++++++++++++----- > 1 file changed, 64 insertions(+), 8 deletions(-) > > diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c > index 3f6ad046623..55011aeed97 100644 > --- a/gcc/tree-ssa-loop-split.c > +++ b/gcc/tree-ssa-loop-split.c > @@ -484,6 +484,39 @@ compute_new_first_bound (gimple_seq *stmts, class tree_niter_desc *niter, > return newend; > } > > +/* Fix the two loop's bb count after split based on the split edge probability, > + don't adjust the bbs dominated by true branches of that loop to avoid > + dropping 1s down. */ > +static void > +fix_loop_bb_probability (class loop *loop1, class loop *loop2, edge true_edge, > + edge false_edge) > +{ > + update_ssa (TODO_update_ssa); > + > + /* Proportion first loop's bb counts except those dominated by true > + branch to avoid drop 1s down. */ > + basic_block *bbs1, *bbs2; > + bbs1 = get_loop_body (loop1); > + unsigned j; > + for (j = 0; j < loop1->num_nodes; j++) > + if (bbs1[j] == loop1->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) > + bbs1[j]->count > + = bbs1[j]->count.apply_probability (true_edge->probability); > + free (bbs1); > + > + /* Proportion second loop's bb counts except those dominated by false > + branch to avoid drop 1s down. */ > + basic_block bbi_copy = get_bb_copy (false_edge->dest); > + bbs2 = get_loop_body (loop2); > + for (j = 0; j < loop2->num_nodes; j++) > + if (bbs2[j] == loop2->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) > + bbs2[j]->count > + = bbs2[j]->count.apply_probability (true_edge->probability.invert ()); > + free (bbs2); > +} > + > /* Checks if LOOP contains an conditional block whose condition > depends on which side in the iteration space it is, and if so > splits the iteration space into two loops. Returns true if the > @@ -575,7 +608,10 @@ split_loop (class loop *loop1) > stmts2); > tree cond = build2 (guard_code, boolean_type_node, guard_init, border); > if (!initial_true) > - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + > + edge true_edge, false_edge; > + extract_true_false_edges_from_block (bbs[i], &true_edge, &false_edge); > > /* Now version the loop, placing loop2 after loop1 connecting > them, and fix up SSA form for that. */ > @@ -583,11 +619,11 @@ split_loop (class loop *loop1) > basic_block cond_bb; > > class loop *loop2 = loop_version (loop1, cond, &cond_bb, > - profile_probability::always (), > - profile_probability::always (), > - profile_probability::always (), > - profile_probability::always (), > - true); > + true_edge->probability, > + true_edge->probability.invert (), > + profile_probability::always (), > + profile_probability::always (), > + true); > gcc_assert (loop2); > > edge new_e = connect_loops (loop1, loop2); > @@ -607,6 +643,15 @@ split_loop (class loop *loop1) > tree guard_next = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop1)); > patch_loop_exit (loop1, guard_stmt, guard_next, newend, initial_true); > > + fix_loop_bb_probability (loop1, loop2, true_edge, false_edge); > + > + /* Fix first loop's exit probability after scaling. */ > + edge exit_to_latch1 = single_pred_edge (loop1->latch); > + exit_to_latch1->probability = exit_to_latch1->probability.apply_scale ( > + true_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); > + single_exit (loop1)->probability > + = exit_to_latch1->probability.invert (); > + > /* Finally patch out the two copies of the condition to be always > true/false (or opposite). */ > gcond *force_true = as_a (last_stmt (bbs[i])); > @@ -1486,8 +1531,8 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > initialize_original_copy_tables (); > > struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, > - profile_probability::always (), > - profile_probability::never (), > + invar_branch->probability.invert (), > + invar_branch->probability, > profile_probability::always (), > profile_probability::always (), > true); > @@ -1535,6 +1580,17 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > between loop1 and loop2. */ > connect_loop_phis (loop1, loop2, to_loop2); > > + edge true_edge, false_edge, skip_edge1, skip_edge2; > + extract_true_false_edges_from_block (cond_bb, &true_edge, &false_edge); > + > + skip_edge1 = true_invar ? false_edge : true_edge; > + skip_edge2 = true_invar ? true_edge : false_edge; > + fix_loop_bb_probability (loop1, loop2, skip_edge1, skip_edge2); > + > + /* Fix first loop's exit probability after scaling. */ > + to_loop1->probability = invar_branch->probability.invert (); > + to_loop2->probability = invar_branch->probability; > + > free_original_copy_tables (); > > return true; -- Thanks, Xionghu