From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by sourceware.org (Postfix) with ESMTPS id 5D9113858D35; Wed, 8 Dec 2021 23:47:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5D9113858D35 Received: by mail-pl1-x62a.google.com with SMTP id k4so2614104plx.8; Wed, 08 Dec 2021 15:47:06 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=z1u9w/JUaz8PA5S9ETcCxKq1gKdqXG05JSgbVQz1E3Y=; b=FXXe5nidM/4cv/vkO3F86iOMlJtbBuvskPJBNfr497wG9uhpyAMpbOBl4lh5HPn6+n n+YR2dHAkcBj0XECBHM5rz58eb/RYkild8pm1IubSVV9V31VAvVfoHSIYNefUxO/MjNR +dXMqeNPQqf0Gy7rfCOPYXCHFvrFkabGt+sQbkhufJO2rpSzEKD2SwzuqtkKn/2cKI4j pTJwjEH/VK3W+4NYmsNmycPFE6ZxzZjG2h8qkK83VrDcyLv1TryetFyIGVbTcPVc28YO EOeB2LCijjVj5dR6QgBEJeOGOxSj0Ai4ur5K03+AYoda1HaL1HndjQt+6yDZ+Vv5L5i0 bXqg== X-Gm-Message-State: AOAM531ybVnHoq4KY9k+eI+ukikVKSJsLqQu8vxjI1qT2WTxCVo/ZDW+ /EtSIabIywYZnRU+ba2WPVI= X-Google-Smtp-Source: ABdhPJwltMIUmy8StZQRXttpzOZNFQp3kj6nGqNP6gt+ZD9s0b1f7lJ6V5Iuq/OgPUnHww+qlfdhuQ== X-Received: by 2002:a17:90a:6906:: with SMTP id r6mr11099763pjj.118.1639007225276; Wed, 08 Dec 2021 15:47:05 -0800 (PST) Received: from [192.168.1.15] (65-130-80-28.slkc.qwest.net. [65.130.80.28]) by smtp.gmail.com with ESMTPSA id qe2sm3647755pjb.42.2021.12.08.15.47.04 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 08 Dec 2021 15:47:04 -0800 (PST) Message-ID: <6328088e-a807-059a-2a1c-8024e2c35008@gmail.com> Date: Wed, 8 Dec 2021 16:47:02 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.3.2 Subject: Re: [PATCH 3/3] Fix loop split incorrect count and probability Content-Language: en-US To: Xionghu Luo , gcc-patches@gcc.gnu.org Cc: segher@kernel.crashing.org, hubicka@kam.mff.cuni.cz, wschmidt@linux.ibm.com, linkw@gcc.gnu.org, dje.gcc@gmail.com References: <20211208055416.1415283-1-luoxhu@linux.ibm.com> <20211208055416.1415283-4-luoxhu@linux.ibm.com> From: Jeff Law In-Reply-To: <20211208055416.1415283-4-luoxhu@linux.ibm.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Dec 2021 23:47:08 -0000 On 12/7/2021 10:54 PM, Xionghu Luo via Gcc-patches wrote: > In tree-ssa-loop-split.c, split_loop and split_loop_on_cond does two > kind of split. split_loop only works for single loop and insert edge at > exit when split, while split_loop_on_cond is not limited to single loop > and insert edge at latch when split. Both split behavior should consider > loop count and probability update. For split_loop, loop split condition > is moved in front of loop1 and loop2; But split_loop_on_cond moves the > condition between loop1 and loop2, this patch does: > 1) profile count proportion for both original loop and copied loop > without dropping down the true branch's count; > 2) probability update in the two loops and between the two loops. > > Regression tested pass, OK for master? > > Changes diff for split_loop and split_loop_on_cond cases: > > 1) diff base/loop-split.c.151t.lsplit patched/loop-split.c.152t.lsplit > ... > [local count: 118111600]: > if (beg_5(D) < end_8(D)) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 105119324]: > if (beg2_6(D) < c_9(D)) > - goto ; [100.00%] > + goto ; [33.00%] > else > - goto ; [100.00%] > + goto ; [67.00%] > > - [local count: 105119324]: > + [local count: 34689377]: > _25 = beg_5(D) + 1; > _26 = end_8(D) - beg_5(D); > _27 = beg2_6(D) + _26; > _28 = MIN_EXPR ; > > - [local count: 955630225]: > + [local count: 315357973]: > # i_16 = PHI > # j_17 = PHI > printf ("a: %d %d\n", i_16, j_17); > i_11 = i_16 + 1; > j_12 = j_17 + 1; > if (j_12 < _28) > - goto ; [89.00%] > + goto ; [29.37%] > else > - goto ; [11.00%] > + goto ; [70.63%] > > - [local count: 850510901]: > + [local count: 280668596]: > goto ; [100.00%] > > - [local count: 105119324]: > + [local count: 70429947]: > # i_22 = PHI > # j_23 = PHI > > [local count: 955630225]: > # i_2 = PHI > # j_1 = PHI > i_20 = i_2 + 1; > j_21 = j_1 + 1; > if (end_8(D) > i_20) > - goto ; [89.00%] > + goto ; [59.63%] > else > - goto ; [11.00%] > + goto ; [40.37%] > > - [local count: 850510901]: > + [local count: 569842305]: > goto ; [100.00%] > > [local count: 105119324]: > # i_29 = PHI > # j_30 = PHI > if (end_8(D) > i_29) > goto ; [80.00%] > else > goto ; [20.00%] > > [local count: 105119324]: > > [local count: 118111600]: > return 0; > > } > [local count: 118111600]: > - if (beg_5(D) < end_8(D)) > + _1 = end_6(D) - beg_7(D); > + j_9 = _1 + beg2_8(D); > + if (end_6(D) > beg_7(D)) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 105119324]: > - if (beg2_6(D) < c_9(D)) > - goto ; [100.00%] > + if (j_9 >= c_11(D)) > + goto ; [33.00%] > else > - goto ; [100.00%] > + goto ; [67.00%] > > - [local count: 105119324]: > - _25 = beg_5(D) + 1; > - _26 = end_8(D) - beg_5(D); > - _27 = beg2_6(D) + _26; > - _28 = MIN_EXPR ; > - > - [local count: 955630225]: > - # i_16 = PHI > - # j_17 = PHI > - printf ("a: %d %d\n", i_16, j_17); > - i_11 = i_16 + 1; > - j_12 = j_17 + 1; > - if (j_12 < _28) > - goto ; [89.00%] > + [local count: 34689377]: > + _27 = end_6(D) + -1; > + _28 = beg_7(D) - end_6(D); > + _29 = j_9 + _28; > + _30 = _29 + 1; > + _31 = MAX_EXPR ; > + > + [local count: 315357973]: > + # i_18 = PHI > + # j_19 = PHI > + printf ("a: %d %d\n", i_18, j_19); > + i_13 = i_18 + -1; > + j_14 = j_19 + -1; > + if (j_14 >= _31) > + goto ; [29.37%] > else > - goto ; [11.00%] > + goto ; [70.63%] > > - [local count: 850510901]: > + [local count: 280668596]: > goto ; [100.00%] > > - [local count: 105119324]: > - # i_22 = PHI > - # j_23 = PHI > + [local count: 70429947]: > + # i_24 = PHI > + # j_25 = PHI > > [local count: 955630225]: > - # i_2 = PHI > - # j_1 = PHI > - i_20 = i_2 + 1; > - j_21 = j_1 + 1; > - if (end_8(D) > i_20) > + # i_3 = PHI > + # j_2 = PHI > + i_22 = i_3 + -1; > + j_23 = j_2 + -1; > + if (beg_7(D) < i_22) > goto ; [89.00%] > else > goto ; [11.00%] > > - [local count: 850510901]: > + [local count: 569842305]: > goto ; [100.00%] > > [local count: 105119324]: > - # i_29 = PHI > - # j_30 = PHI > - if (end_8(D) > i_29) > + # i_32 = PHI > + # j_33 = PHI > + if (beg_7(D) < i_32) > goto ; [80.00%] > else > goto ; [20.00%] > > [local count: 105119324]: > > [local count: 118111600]: > return 0; > > } > > 2) diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit: > ... > [local count: 118111600]: > if (n_7(D) > 0) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 118111600]: > return; > > [local count: 105119324]: > pretmp_3 = ga; > > - [local count: 955630225]: > + [local count: 315357973]: > # i_13 = PHI > # prephitmp_12 = PHI > if (prephitmp_12 != 0) > goto ; [33.00%] > else > goto ; [67.00%] > > [local count: 315357972]: > _2 = do_something (); > ga = _2; > > - [local count: 955630225]: > + [local count: 315357973]: > # prephitmp_5 = PHI > i_10 = inc (i_13); > if (n_7(D) > i_10) > goto ; [89.00%] > else > goto ; [11.00%] > > [local count: 105119324]: > goto ; [100.00%] > > - [local count: 850510901]: > + [local count: 280668596]: > if (prephitmp_12 != 0) > - goto ; [100.00%] > + goto ; [33.00%] > else > - goto ; [INV] > + goto ; [67.00%] > > - [local count: 850510901]: > + [local count: 280668596]: > goto ; [100.00%] > > - [count: 0]: > + [local count: 70429947]: > # i_23 = PHI > # prephitmp_25 = PHI > > - [local count: 955630225]: > + [local count: 640272252]: > # i_15 = PHI > # prephitmp_16 = PHI > i_22 = inc (i_15); > if (n_7(D) > i_22) > goto ; [89.00%] > else > goto ; [11.00%] > > - [local count: 850510901]: > + [local count: 569842305]: > goto ; [100.00%] > > } > > gcc/ChangeLog: > > * tree-ssa-loop-split.c (split_loop): Fix incorrect > profile_count and probability. > (do_split_loop_on_cond): Likewise. > --- > gcc/tree-ssa-loop-split.c | 85 +++++++++++++++++++++++++++++++++++---- > 1 file changed, 77 insertions(+), 8 deletions(-) > > diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c > index 3f6ad046623..33128061aab 100644 > --- a/gcc/tree-ssa-loop-split.c > +++ b/gcc/tree-ssa-loop-split.c > > @@ -607,6 +610,38 @@ split_loop (class loop *loop1) > tree guard_next = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop1)); > patch_loop_exit (loop1, guard_stmt, guard_next, newend, initial_true); > > + update_ssa (TODO_update_ssa); > + > + /* Proportion first loop's bb counts except those dominated by true > + branch to avoid drop 1s down. */ > + basic_block *bbs1, *bbs2; > + bbs1 = get_loop_body (loop1); > + unsigned j; > + for (j = 0; j < loop1->num_nodes; j++) > + if (bbs1[j] == loop1->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) > + bbs1[j]->count > + = bbs1[j]->count.apply_probability (true_edge->probability); > + free (bbs1); It looks like there's two copies of this code in this patch, one in split_loop and the other in do_split_loop_on_cond.  Would it make sense to factor it out into its own little function? > + > + /* Proportion second loop's bb counts except those dominated by false > + branch to avoid drop 1s down. */ > + basic_block bbi_copy = get_bb_copy (false_edge->dest); > + bbs2 = get_loop_body (loop2); > + for (j = 0; j < loop2->num_nodes; j++) > + if (bbs2[j] == loop2->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) > + bbs2[j]->count = bbs2[j]->count.apply_probability ( > + true_edge->probability.invert ()); > + free (bbs2); Similarly for this block of code. If those can be reasonably factored out into two helper functions to be called from split_loop and do_split_loop_on_cond, then this is OK with the refactoring. jeff