From: Jiufu Guo
To: Jiufu Guo via Gcc-patches, Jan Hubicka
Cc: wschmidt@linux.ibm.com, segher@kernel.crashing.org
Subject: Re: [PATCH V2] PING^2 correct COUNT and PROB for unrolled loop
References: <1580717822-6073-1-git-send-email-guojiufu@linux.ibm.com> <20200203162337.GK22868@kam.mff.cuni.cz>
Date: Thu, 02 Jul 2020 10:35:43 +0800
In-Reply-To: (Jiufu Guo's message of "Thu, 18 Jun 2020 09:22:18 +0800")

Jiufu Guo writes:

I would like to reping this patch.
Since this is correcting COUNT and PROB for hot blocks, it helps some
cases.

https://gcc.gnu.org/legacy-ml/gcc-patches/2020-02/msg00927.html

Thanks,
Jiufu Guo

> Jiufu Guo writes:
>
> Gentle ping.
> https://gcc.gnu.org/legacy-ml/gcc-patches/2020-02/msg00927.html
>
> BR,
> Jiufu Guo
>
>> Jiufu Guo via Gcc-patches writes:
>>
>> Hi,
>>
>> I would like to reping this, hope to get approval for this patch.
>> https://gcc.gnu.org/legacy-ml/gcc-patches/2020-02/msg00927.html
>>
>> BR,
>> Jiufu Guo
>>
>>> Jiufu Guo writes:
>>>
>>> Hi,
>>>
>>> I'd like to ping this patch for trunk on stage 1.
>>>
>>> This patch could fix the issue of incorrect COUNT/FREQUENCIES of loop
>>> unrolled blocks, and could also help improve the cold/hot issue of
>>> the unrolled loops.
>>>
>>> The patch is also at
>>> https://gcc.gnu.org/legacy-ml/gcc-patches/2020-02/msg00927.html
>>>
>>> Thanks,
>>> Jiufu
>>>
>>>> Jiufu Guo writes:
>>>>
>>>> Hi!
>>>>
>>>> I'd like to ping the following patch. As we are near the end of GCC 10
>>>> stage 4, I would like to ask for approval for GCC 11 trunk.
>>>>
>>>> Thanks,
>>>> Jiufu Guo
>>>>
>>>>> Hi Honza and all,
>>>>>
>>>>> I updated the patch a little as below. Bootstrap and regtest are ok
>>>>> on powerpc64le.
>>>>>
>>>>> Is it OK for trunk?
>>>>>
>>>>> Thanks for comments.
>>>>> Jiufu
>>>>>
>>>>> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
>>>>> index 727e951..ded0046 100644
>>>>> --- a/gcc/cfgloopmanip.c
>>>>> +++ b/gcc/cfgloopmanip.c
>>>>> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see
>>>>>  #include "gimplify-me.h"
>>>>>  #include "tree-ssa-loop-manip.h"
>>>>>  #include "dumpfile.h"
>>>>> +#include "cfgrtl.h"
>>>>>
>>>>>  static void copy_loops_to (class loop **, int,
>>>>>                             class loop *);
>>>>> @@ -1258,14 +1259,30 @@ duplicate_loop_to_header_edge (class loop *loop, edge e,
>>>>>            /* If original loop is executed COUNT_IN times, the unrolled
>>>>>               loop will account SCALE_MAIN_DEN times.  */
>>>>>            scale_main = count_in.probability_in (scale_main_den);
>>>>> +
>>>>> +          /* If we are guessing at the number of iterations and count_in
>>>>> +             becomes unrealistically small, reset probability.  */
>>>>> +          if (!(count_in.reliable_p () || loop->any_estimate))
>>>>> +            {
>>>>> +              profile_count new_count_in = count_in.apply_probability (scale_main);
>>>>> +              profile_count preheader_count = loop_preheader_edge (loop)->count ();
>>>>> +              if (new_count_in.apply_scale (1, 10) < preheader_count)
>>>>> +                scale_main = profile_probability::likely ();
>>>>> +            }
>>>>> +
>>>>>            scale_act = scale_main * prob_pass_main;
>>>>>          }
>>>>>        else
>>>>>          {
>>>>> +          profile_count new_loop_count;
>>>>>            profile_count preheader_count = e->count ();
>>>>> -          for (i = 0; i < ndupl; i++)
>>>>> -            scale_main = scale_main * scale_step[i];
>>>>>            scale_act = preheader_count.probability_in (count_in);
>>>>> +          /* Compute final preheader count after peeling NDUPL copies.  */
>>>>> +          for (i = 0; i < ndupl; i++)
>>>>> +            preheader_count = preheader_count.apply_probability (scale_step[i]);
>>>>> +          /* Subtract out exit(s) from peeled copies.  */
>>>>> +          new_loop_count = count_in - (e->count () - preheader_count);
>>>>> +          scale_main = new_loop_count.probability_in (count_in);
>>>>>          }
>>>>>      }
>>>>>
>>>>> @@ -1381,6 +1398,38 @@ duplicate_loop_to_header_edge (class loop *loop, edge e,
>>>>>                scale_bbs_frequencies (new_bbs, n, scale_act);
>>>>>                scale_act = scale_act * scale_step[j];
>>>>>              }
>>>>> +
>>>>> +          /* Need to update PROB of exit edge and corresponding COUNT.  */
>>>>> +          if (orig && is_latch && (!bitmap_bit_p (wont_exit, j + 1))
>>>>> +              && bbs_to_scale)
>>>>> +            {
>>>>> +              edge new_exit = new_spec_edges[SE_ORIG];
>>>>> +              profile_count new_count_in = new_exit->src->count;
>>>>> +              profile_count preheader_count = loop_preheader_edge (loop)->count ();
>>>>> +              edge e;
>>>>> +              edge_iterator ei;
>>>>> +
>>>>> +              FOR_EACH_EDGE (e, ei, new_exit->src->succs)
>>>>> +                if (e != new_exit)
>>>>> +                  break;
>>>>> +
>>>>> +              gcc_assert (e && e != new_exit);
>>>>> +
>>>>> +              new_exit->probability = preheader_count.probability_in (new_count_in);
>>>>> +              e->probability = new_exit->probability.invert ();
>>>>> +
>>>>> +              profile_count new_latch_count
>>>>> +                = new_exit->src->count.apply_probability (e->probability);
>>>>> +              profile_count old_latch_count = e->dest->count;
>>>>> +
>>>>> +              EXECUTE_IF_SET_IN_BITMAP (bbs_to_scale, 0, i, bi)
>>>>> +                scale_bbs_frequencies_profile_count (new_bbs + i, 1,
>>>>> +                                                     new_latch_count,
>>>>> +                                                     old_latch_count);
>>>>> +
>>>>> +              if (current_ir_type () != IR_GIMPLE)
>>>>> +                update_br_prob_note (e->src);
>>>>> +            }
>>>>>          }
>>>>>        free (new_bbs);
>>>>>        free (orig_loops);
>>>>> diff --git a/gcc/testsuite/gcc.dg/pr68212.c b/gcc/testsuite/gcc.dg/pr68212.c
>>>>> new file mode 100644
>>>>> index 0000000..f3b7c22
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/pr68212.c
>>>>> @@ -0,0 +1,13 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-O2 -fno-tree-vectorize -funroll-loops --param max-unroll-times=4 -fdump-rtl-alignments" } */
>>>>> +
>>>>> +void foo(long int *a, long int *b, long int n)
>>>>> +{
>>>>> +  long int i;
>>>>> +
>>>>> +  for (i = 0; i < n; i++)
>>>>> +    a[i] = *b;
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-rtl-dump-times "internal loop alignment added" 1 "alignments"} } */
>>>>> +
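
For readers following the arithmetic in the peeling branch of the patch (the "else" arm), here is a minimal sketch of how scale_act and scale_main come out. It is not GCC code: plain doubles stand in for profile_count and profile_probability, and the names PeelScales, peel_scales and entry_count are hypothetical, introduced only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the peeling branch of the patch. Plain doubles
// replace GCC's profile_count / profile_probability classes.
struct PeelScales {
  double scale_act;   // scale applied to the first peeled copy
  double scale_main;  // scale applied to the remaining loop body
};

// count_in:    executions of the original loop header
// entry_count: count of the entry edge (e->count () in the patch)
// scale_step:  per-copy probability of continuing to the next copy
PeelScales peel_scales(double count_in, double entry_count,
                       const std::vector<double>& scale_step) {
  assert(count_in > 0.0);
  PeelScales r;
  r.scale_act = entry_count / count_in;

  // Compute the final preheader count after peeling all copies.
  double preheader_count = entry_count;
  for (double p : scale_step) preheader_count *= p;

  // Subtract the executions that left through exits of the peeled copies.
  double new_loop_count = count_in - (entry_count - preheader_count);
  r.scale_main = new_loop_count / count_in;
  return r;
}
```

With count_in = 1000, an entry count of 100, and two peeled copies that each continue with probability 0.5, 25 executions still reach the loop and 75 leave through the peeled copies, so the loop body is scaled by 925/1000 and the first peeled copy by 100/1000.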