Message-ID: <8af589d9-13c1-2ff8-08d3-7caf98fc037a@linux.ibm.com>
Date: Wed, 15 Dec 2021 14:40:12 +0800
Subject: Re: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]
To: Jan Hubicka
Cc: wschmidt@linux.ibm.com, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org,
 linkw@gcc.gnu.org, segher@kernel.crashing.org
References: <20211208055416.1415283-1-luoxhu@linux.ibm.com>
 <20211208055416.1415283-3-luoxhu@linux.ibm.com>
 <20211213092548.GA91590@kam.mff.cuni.cz>
 <5a057da8-677c-b5e9-48b3-2cb434e68505@linux.ibm.com>
From: Xionghu Luo
In-Reply-To: <5a057da8-677c-b5e9-48b3-2cb434e68505@linux.ibm.com>
List-Id: Gcc-patches mailing list

On 2021/12/14 17:27, Xionghu Luo via Gcc-patches wrote:
>
>
> On 2021/12/13 17:25, Jan Hubicka wrote:
>>> r12-4526 cancelled jump thread path rotates loop.  It exposes an issue in
>>> profile-estimate when predict_extra_loop_exits, outer loop's exit edge
>>> is marked as inner loop's extra loop exit and set with incorrect
>>> prediction, then a hot inner loop will become cold loop finally through
>>> optimizations, this patch adds a loop check when searching extra exit edges
>>> to avoid unexpected predict_edge from predict_paths_for_bb.
>>>
>>> Regression tested on P8LE, OK for master?
>>>
>>> gcc/ChangeLog:
>>>
>>> 	PR middle-end/103270
>>> 	* predict.c (predict_extra_loop_exits): Add loop parameter.
>>> 	(predict_loops): Call with loop argument.
>>
>> With changes to branch predictors it is useful to re-test their
>> effectivity on spec and see if their hitrates are still matching
>> reality.  You can do it by building spec with -fprofile-generate, train
>> it and then build with -fprofile-use -fdump-tree-ipa-profile-details
>> and use contrib/analyze_brprob.py that will collect info on how they
>> work.
>>
>> This patch looks good to me, but it would be nice to have things reality
>> checked (and since we did not do the stats for some time, there may be
>> surprises) so if you could run the specs and post results of
>> analyze_brprob, it would be great.  I will also try to get to that soon,
>> but currently I am a bit swamped by other problems I noticed on clang
>> builds.
>>
>> Thanks a lot for working on profile fixes - I am trying now to get
>> things into shape.
>> With Martin we added basic testing infrastructure
>> for keeping track of profile updates and I am trying to see how it works
>> in practice now.  Hopefully it will make it easier to judge on profile
>> updating patches.  I would welcome a list of patches I should look at.
>>
>> I will write a separate mail on this.
>> Honza
>
>
> With the patch, analyze_brprob.py outputs the data below with a PGO build.
> There is no verification code in the script, so how can I check whether it
> is correct?  Run it again without the patch and compare the "extra loop exit"
> field?
>
>
> ./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all
> HEURISTICS                        BRANCHES  (REL) BR. HITRATE           HITRATE      COVERAGE COVERAGE  (REL) predict.def (REL) HOT branches (>10%)
> noreturn call                            1   0.0%     100.00%   50.00% / 50.00%             2     2.00   0.0%                   100%:1
> Fortran zero-sized array                 3   0.0%      66.67%   41.71% / 60.50%           362   362.00   0.0%                   100%:3
> loop iv compare                         16   0.0%      93.75%   98.26% / 98.76%        279847  279.85k   0.0%                   93%:4
> __builtin_expect                        35   0.0%      97.14%   78.09% / 78.35%      17079558   17.08M   0.0%
> loop guard with recursion               45   0.1%      86.67%   85.13% / 85.14%    6722424412    6.72G   1.3%                   74%:4
> extra loop exit                         80   0.1%      58.75%   81.49% / 89.21%     438470261  438.47M   0.1%                   86%:3
> guess loop iv compare                  235   0.3%      80.85%   52.83% / 73.97%     148558247  148.56M   0.0%                   47%:3
> negative return                        241   0.3%      71.37%   25.33% / 92.61%     250402383  250.40M   0.0%                   69%:2
> loop exit with recursion               315   0.4%      74.60%   85.07% / 85.71%    9403136858    9.40G   1.8%                   59%:4
> const return                           320   0.4%      51.88%   90.45% / 95.63%     925341727  925.34M   0.2%                   76%:5
> indirect call                          377   0.5%      51.46%   84.72% / 91.14%    2133772848    2.13G   0.4%                   69%:1
> polymorphic call                       410   0.5%      44.15%   31.26% / 79.37%    3272688244    3.27G   0.6%                   53%:2
> recursive call                         506   0.7%      39.53%   44.97% / 83.92%    1211036806    1.21G   0.2%                   10%:1
> goto                                   618   0.8%      64.24%   65.37% / 83.57%     702446178  702.45M   0.1%                   20%:1
> null return                            800   1.1%      64.62%   56.59% / 77.70%     603952067  603.95M   0.1%                   28%:2
> continue                               956   1.3%      63.70%   65.65% / 79.97%    3780303799    3.78G   0.7%                   52%:3
> loop guard                            1177   1.6%      56.33%   42.54% / 80.32%    7373601457    7.37G   1.4%                   50%:2
> opcode values positive (on trees)     2020   2.7%      62.38%   64.16% / 84.44%   31695571761   31.70G   6.0%                   21%:2
> loop exit                             3293   4.4%      76.19%   87.18% / 88.35%   50377138963   50.38G   9.6%                   18%:1
> loop iterations                       4761   6.3%      99.98%   84.27% / 84.27%   73463634555   73.46G  13.9%
> pointer (on trees)                    8076  10.7%      56.23%   69.36% / 83.15%   12322099991   12.32G   2.3%
> call                                 11396  15.1%      64.14%   74.13% / 89.82%   25197949198   25.20G   4.8%                   34%:1
> opcode values nonequal (on trees)    12237  16.3%      70.70%   70.86% / 83.54%   36638772333   36.64G   6.9%
> guessed loop iterations              16760  22.3%      99.78%   91.49% / 91.49%  162952747918  162.95G  30.9%
>
> HEURISTICS                        BRANCHES  (REL) BR. HITRATE           HITRATE      COVERAGE COVERAGE  (REL) predict.def (REL) HOT branches (>10%)
> no prediction                        12730  16.9%      39.29%   33.32% / 79.93%  121106031835  121.11G  23.0%
> first match                          25261  33.6%      92.17%   88.33% / 88.98%  296652487962  296.65G  56.3%
> DS theory                            28333  37.7%      63.03%   72.05% / 85.00%  109563734005  109.56G  20.8%
> combined                             75232 100.0%      73.17%   72.32% / 86.08%  527351738575  527.35G 100.0%
>
> Loop count: 37870
>   avg. # of iter: 8444.77
>   median # of iter: 7.00
>   avg. (1% cutoff) # of iter: 174.68
>   avg. (5% cutoff) # of iter: 55.14
>   avg. (10% cutoff) # of iter: 35.21
>   avg. (20% cutoff) # of iter: 26.23
>   avg. (30% cutoff) # of iter: 21.70

This is the output collected without the patch.  As can be seen, there is
no difference in the "extra loop exit" row.  Still, this issue should be fixed.

./contrib/analyze_brprob_spec.py ~/workspace/tests/spec2017/ benchspec
HEURISTICS                        BRANCHES  (REL) BR. HITRATE           HITRATE      COVERAGE COVERAGE  (REL) predict.def (REL) HOT branches (>10%)
noreturn call                            1   0.0%     100.00%   50.00% / 50.00%             2     2.00   0.0%                   100%:1
Fortran zero-sized array                 3   0.0%      66.67%   41.71% / 60.50%           362   362.00   0.0%                   100%:3
loop iv compare                         16   0.0%      93.75%   98.26% / 98.76%        279847  279.85k   0.0%                   93%:4
__builtin_expect                        35   0.0%      97.14%   78.09% / 78.35%      17079558   17.08M   0.0%
loop guard with recursion               45   0.1%      86.67%   85.13% / 85.14%    6722424412    6.72G   1.3%                   74%:4
extra loop exit                         80   0.1%      58.75%   81.49% / 89.21%     438470261  438.47M   0.1%                   86%:3
guess loop iv compare                  235   0.3%      80.85%   52.83% / 73.97%     148558247  148.56M   0.0%                   47%:3
negative return                        241   0.3%      71.37%   25.33% / 92.61%     250402383  250.40M   0.0%                   69%:2
loop exit with recursion               315   0.4%      74.60%   85.07% / 85.71%    9403136858    9.40G   1.8%                   59%:4
const return                           320   0.4%      51.88%   90.45% / 95.63%     925341727  925.34M   0.2%                   76%:5
indirect call                          377   0.5%      51.46%   84.72% / 91.14%    2133772848    2.13G   0.4%                   69%:1
polymorphic call                       410   0.5%      44.15%   31.26% / 79.37%    3272688238    3.27G   0.6%                   53%:2
recursive call                         506   0.7%      39.53%   44.97% / 83.92%    1211036806    1.21G   0.2%                   10%:1
goto                                   618   0.8%      64.24%   65.37% / 83.57%     702446178  702.45M   0.1%                   20%:1
null return                            800   1.1%      64.62%   56.59% / 77.70%     603952067  603.95M   0.1%                   28%:2
continue                               956   1.3%      63.70%   65.65% / 79.97%    3780303795    3.78G   0.7%                   52%:3
loop guard                            1178   1.6%      56.37%   42.54% / 80.32%    7373601533    7.37G   1.4%                   50%:2
opcode values positive (on trees)     2020   2.7%      62.38%   64.16% / 84.44%   31695571761   31.70G   5.9%                   21%:2
loop exit                             3293   4.4%      76.19%   87.18% / 88.35%   50377138963   50.38G   9.4%                   18%:1
loop iterations                       4772   6.3%      99.98%   84.27% / 84.27%   74045982111   74.05G  13.8%
pointer (on trees)                    8076  10.7%      56.23%   69.36% / 83.15%   12322099991   12.32G   2.3%
call                                 11396  15.1%      64.14%   74.13% / 89.82%   25197949198   25.20G   4.7%                   34%:1
opcode values nonequal (on trees)    12240  16.2%      70.71%   70.86% / 83.54%   36638772682   36.64G   6.9%
guessed loop iterations              16854  22.4%      99.78%   91.21% / 91.22%  169765264401  169.77G  31.7%

HEURISTICS                        BRANCHES  (REL) BR. HITRATE           HITRATE      COVERAGE COVERAGE  (REL) predict.def (REL) HOT branches (>10%)
no prediction                        12731  16.9%      39.30%   33.32% / 79.93%  121106031963  121.11G  22.6%
first match                          25366  33.7%      92.20%   88.24% / 88.88%  304047352001  304.05G  56.9%
DS theory                            28337  37.6%      63.03%   72.05% / 85.00%  109563734430  109.56G  20.5%
combined                             75342 100.0%      73.21%   72.49% / 86.06%  534746603167  534.75G 100.0%

Loop count: 38058
  avg. # of iter: 8403.32
  median # of iter: 7.00
  avg. (1% cutoff) # of iter: 173.72
  avg. (5% cutoff) # of iter: 54.90
  avg. (10% cutoff) # of iter: 35.20
  avg. (20% cutoff) # of iter: 26.35
  avg. (30% cutoff) # of iter: 21.87

-- 
Thanks,
Xionghu
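
On the "run it again and compare" question: rather than reading the two
tables by eye, a short script can compare them row by row.  A minimal
sketch, assuming the row layout printed in this mail (heuristic name,
branch count, percentages, raw coverage count).  The names Row,
parse_report and diff_reports are hypothetical helpers for illustration
and are not part of contrib/analyze_brprob.py:

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class Row(NamedTuple):
    """The per-heuristic fields this sketch compares (assumed layout)."""
    branches: int
    br_hitrate: float
    hitrate: float
    coverage: int


# name, BRANCHES, (REL)%, BR. HITRATE%, HITRATE% / max%, raw COVERAGE, ...
# Header lines and the "Loop count" block do not match and are skipped.
ROW_RE = re.compile(
    r"^(?P<name>.+?)\s+(?P<branches>\d+)\s+[\d.]+%\s+"
    r"(?P<br_hitrate>[\d.]+)%\s+(?P<hitrate>[\d.]+)%\s*/\s*[\d.]+%\s+"
    r"(?P<coverage>\d+)\s"
)


def parse_report(text: str) -> Dict[str, Row]:
    """Map heuristic name to Row; leading '>' quote markers are ignored."""
    rows: Dict[str, Row] = {}
    for line in text.splitlines():
        m = ROW_RE.match(line.lstrip("> \t"))
        if m:
            rows[m["name"]] = Row(
                int(m["branches"]),
                float(m["br_hitrate"]),
                float(m["hitrate"]),
                int(m["coverage"]),
            )
    return rows


def diff_reports(
    old: Dict[str, Row], new: Dict[str, Row]
) -> List[Tuple[str, Optional[Row], Optional[Row]]]:
    """Return (name, old_row, new_row) for every heuristic that differs."""
    changes = []
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a != b:
            changes.append((name, a, b))
    return changes


if __name__ == "__main__":
    # Two rows copied from each report in this mail, as sample input.
    without_patch = (
        "extra loop exit 80 0.1% 58.75% 81.49% / 89.21% 438470261 438.47M 0.1% 86%:3\n"
        "loop guard 1178 1.6% 56.37% 42.54% / 80.32% 7373601533 7.37G 1.4% 50%:2\n"
    )
    with_patch = (
        "> extra loop exit 80 0.1% 58.75% 81.49% / 89.21% 438470261 438.47M 0.1% 86%:3\n"
        "> loop guard 1177 1.6% 56.33% 42.54% / 80.32% 7373601457 7.37G 1.4% 50%:2\n"
    )
    for name, old, new in diff_reports(parse_report(without_patch),
                                       parse_report(with_patch)):
        print(name, old, new)
```

Fed with the two complete reports, this would list rows such as
"loop guard", "loop iterations" and "guessed loop iterations" as changed
and omit "extra loop exit", whose fields are identical in both.  It only
shows which rows moved; it does not say whether a prediction is correct.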