From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from NAM02-SN1-obe.outbound.protection.outlook.com (mail-sn1anam02on2113.outbound.protection.outlook.com [40.107.96.113]) by sourceware.org (Postfix) with ESMTPS id AA06B384C830; Wed, 11 Aug 2021 03:03:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AA06B384C830 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=iTbZ+uvRDbpUYv8RyLA0Lc9FNO//SlaSJM8qqbCLoFyzOm/8BiqCv5B7rxfZ1bfUVU4Clj2J0k1rmDVqN6TzYIFwTYFQVpCD1DoJ6Ik6nRPv8vuuctG02P/yX/aKXJm+jMFPnr/asDnYwoLlKVxu36oxod1rogAyGrGL+mlG8QCyMzd4OJ+FyuxQG3a90WVjFf5W+fS4njn0MnMhQ4rUrf1BvrLAyIaSIJUXJ4zuCYSybz1ghtFdp4HYTMdcZ/oEeaXuRdSf3ZVdqcJpG8MGJZKGt+U4f8GTdLgbHrlmujww1uIGMxdKjV2l4rBaXY/cpdo/p7jIhCCQFVs+LunXfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=MbdNeL0VSkULUytqOEC4WHtU8h2m/shUimKvxp0ctqg=; b=fCVaf0WYTdvu1LFeF6Re2QIXBw3bDgopSEZ6H1yIWOBj1fDxcjxA0+3xKnmH6Ez01SbUW3GYNQ/QayMyt9SHIWYDk0GBNzphZc7ztZjk2KmC353pMaOhZDON2DZ574tlWvqpqDplBcph5Hxd6YuNBDlMX9vx3GOq9djAhkPafOG/f3nLd5GQj/OlWz3jhquyHjTpNazcaENwWbzn+vdONpHVyD4UQS1VJE63ouPlCfRs3s8zE/u+1oeycx0YwUu5kj6FmE5KcesTnollEDV8I8we9FU4GCOXVJcl5lkn1LQ7jTsmTFpkFH+FIT7I0U2vxOEqv7LxL/QH2YwaarofXA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=os.amperecomputing.com; dmarc=pass action=none header.from=os.amperecomputing.com; dkim=pass header.d=os.amperecomputing.com; arc=none Received: from SN6PR01MB4958.prod.exchangelabs.com (2603:10b6:805:c1::15) by SA0PR01MB6362.prod.exchangelabs.com (2603:10b6:806:ec::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4415.14; Wed, 11 Aug 2021 03:03:56 +0000 Received: from SN6PR01MB4958.prod.exchangelabs.com ([fe80::c49e:d520:c095:a631]) by SN6PR01MB4958.prod.exchangelabs.com ([fe80::c49e:d520:c095:a631%3]) with mapi id 15.20.4394.025; Wed, 11 Aug 2021 03:03:56 +0000 From: Feng Xue OS To: Richard Biener , Xionghu Luo CC: "gcc-patches@gcc.gnu.org" , "segher@kernel.crashing.org" , "wschmidt@linux.ibm.com" , "guojiufu@linux.ibm.com" , "linkw@gcc.gnu.org" , "hubicka@ucw.cz" Subject: Re: [PATCH] Fix loop split incorrect count and probability Thread-Topic: [PATCH] Fix loop split incorrect count and probability Thread-Index: AQHXiEXTWKpj+O9eGEaGVcY8GBLnfatmYNQAgAQdmACAAl5fgIAAxter Date: Wed, 11 Aug 2021 03:03:56 +0000 Message-ID: References: <20210803085813.2217766-1-luoxhu@linux.ibm.com> <58f9e351-fc10-406e-5272-270ece73be71@linux.ibm.com>, In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: e9bdb52a-8790-4d76-a258-08d95c74a8b1 x-ms-traffictypediagnostic: SA0PR01MB6362: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: GpdQ6UiKLTg0nXzUGfS0bSge656Kt0a0KjIjUiAKSYK53/X6pA9bp1PnwMtmB01YAu3yL/gizP3MIWOTyVe0oZETcJ6lj/ztPcJnXYiL7hKVveX18nqoR4SgQHFBHQzE/Tu5eBRYumws2K84JDJ5+5L0GBfw3WHOLms/Nk5pvPNNLP03kevHdTXziCcYxebWYsueW7tEF7xLCjbQXrO0Ftdtz58AkbjOp8ArG73UZrCxKogOec1j5W1/R08v/Nin3YaBqGE2dL/hOIIqWkIa5TPgDFO1UH+Ju7CM8/zIDVBlBRtnIaeOqOEQgw4JJmkxy9fDm8ZXKKPajPwyv1t623fvclgoc5exXWT1glfP60Rp0tvudAq1zj5+sbf8oIbIOM0RMNlWZU7LNDS2txNpMm3RBZFQXUGnYlV3hQ8pkVVV4s3BL1ngt2H1kQKeZRLEItXFssRmnvmOFTmLoPKTtj/dihnoUXJ8q21R4LOPnikqZ9KRqu+pZm/GsHyEWczwLhV5/PfzJkvhT/HZfniF8bpzZHDGi58EB/nC5z/prOdyd6T0FRG03MkzMsInScwQAz7z2eg8urZHaE10Ls13+bIVUQClDvDFlzXLm1jVtkFEiwraV2cOl1X6op7ayjrc7LFOe/TwUwLQIxbzFvlVVHNc3EGF+X3oPWuVL4XFT5v6WzKH6pwpUUFFNm+lTAKaUo0fH5jN9XJW9Ju/d4cJbQ== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SN6PR01MB4958.prod.exchangelabs.com; PTR:; CAT:NONE; SFS:(4636009)(39850400004)(396003)(346002)(136003)(366004)(376002)(86362001)(110136005)(54906003)(7696005)(71200400001)(83380400001)(316002)(38070700005)(478600001)(26005)(33656002)(8936002)(8676002)(4326008)(122000001)(9686003)(186003)(53546011)(6506007)(66556008)(76116006)(91956017)(66476007)(66946007)(66446008)(64756008)(2906002)(5660300002)(38100700002)(55016002)(52536014); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?dtxRe3LvTUToVKHFvVPM+wnphiqLuomMGK8kmNQMEbp59e0CXZiIF9UHZERi?= =?us-ascii?Q?4fs509MiKs/oYw2gV2IXE4c+7W8GRNl1vOk+HvcbA5ILcnBuxZk8/O0StI6g?= =?us-ascii?Q?Ajg1gmoq5yk6FdmiDw+uwG/tdDeDOgbGkvLL4PAa/RqEAh0ft7kGXLcvpHet?= =?us-ascii?Q?B+OBQq7t5vANiaHGqCu5Q9UXVUro7xyEHEhVCZ95ZzQFeu0SKCBLktfy+9ia?= =?us-ascii?Q?SRJ9BkMc0qX7tXmoU6ToL+/mbSUArXP0lSq9bqbSNRe64pFVhWB+dSBwQIoZ?= =?us-ascii?Q?iudHQA6i4YNc5xFjzjrGlI+Y+Lyqy97q1+1eU0Nen7T5locZW9+AVouTdyGn?= =?us-ascii?Q?j3SJsDmLOI1jaIr+pyR4+tVQrqgmtewF0ZqiKnG6/KuIONwGqiN2i/VNiYY3?= =?us-ascii?Q?LwtUdNu74nNfRWZ0Piw4kBNALosZzIEW4tbRCWGZTIVx/zGFkZdo5Vw7bXPB?= =?us-ascii?Q?I+wRigrT7xk8Knn1VRNBdfV+0jDYRbNJKZQofq7xJ3qk1ztZNR1/EKLiRUB+?= =?us-ascii?Q?VT6JjrSiSRo4cQlpbCqeqTCNHrBB2bnN3mJEGJeK/SMdB9FFdzL5Ip1qGr6q?= =?us-ascii?Q?QpAL6j6ccX5mP0dMDsRz5NUjEc96FMXEhP4onNMFY4BgruuU2FhIuXvprYaR?= =?us-ascii?Q?PS8dTuD9yS83QbyovlPEbS0nAjYRPLJyp1eIXjVDXwig+qIx+3cp05iitI0e?= =?us-ascii?Q?DS145oNm42HojmjM77WNVQ8AEbuJh7T67CCau6RXtzD+M/Qm3E694mvqDoh1?= =?us-ascii?Q?5dkYI5VY5VsKZAn3b0hZXgmzHukErMPQ2a065W0HNCU7XrAm4sCNLnq02LKv?= =?us-ascii?Q?zyAlhVREF2jG1mz3sKvKWGdeg5pXUrZ9LF09gpS5W1oRdlzSngQarouWq3TN?= =?us-ascii?Q?f00OO4Jhf/8NMvohePgFHN4dlYWYzn1HQ5L/NRBN1wxC+NkJzY+0rG6nLeC+?= =?us-ascii?Q?VMKBsreqGxmu699e0AVRKL8Q/VJt+mREclirhH2ixms9LxZrj0Nrs46kgwpR?= =?us-ascii?Q?jM+IN/9cRi8wdPyQTLMEyQdUirs6LoJPQDeGzBTgpFYST9JKM0Vl2EEFgZYd?= =?us-ascii?Q?Ur5ais8eVBPCncKWRrjoe7Hp8hEWKdswI2pWzWWVYUyRlsnwgWPut8pOB47z?= =?us-ascii?Q?3v/gm1y+0qstzLprNaE6r5hOOrlOREeMsoG/1baKV0uBe01ZukZdIslYU27A?= =?us-ascii?Q?QV2mRNVWyv1p+JWNQ4Q6VBJNv0MKxi5m5wfDJIH4rMYKdIZzHM3lIFddPTJh?= =?us-ascii?Q?VU56QWT/HyhT+GVvzlWepWjmBRA0/cuOwYrgGLj87K5ig+DnEhRpbWpDtm1I?= =?us-ascii?Q?hKI=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: os.amperecomputing.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: SN6PR01MB4958.prod.exchangelabs.com X-MS-Exchange-CrossTenant-Network-Message-Id: e9bdb52a-8790-4d76-a258-08d95c74a8b1 X-MS-Exchange-CrossTenant-originalarrivaltime: 11 Aug 2021 03:03:56.4482 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 3bc2b170-fd94-476d-b0ce-4229bdc904a7 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: +84IgQWwdpkBEGPkfGFpMke/exlMC93JmtIqAsNpeWvr/qU6tvl3LOMW87jyoIR3QWcnTv3CapLQ6JpvzUpfZOk28g3OFX7fSYcnCqOjCy4= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR01MB6362 X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 11 Aug 2021 03:04:09 -0000 Any transformation involving cfg alteration would face same problem, it is not that easy to update new cfg with reasonable and seemly-correct profile count. We can adjust probability for impacted condition bbs, but lack of a utility like what static profile estimating pass does, and only propagates count partially. Thanks, Feng ________________________________________ From: Richard Biener Sent: Tuesday, August 10, 2021 10:47 PM To: Xionghu Luo Cc: gcc-patches@gcc.gnu.org; segher@kernel.crashing.org; Feng Xue OS; wschm= idt@linux.ibm.com; guojiufu@linux.ibm.com; linkw@gcc.gnu.org; hubicka@ucw.c= z Subject: Re: [PATCH] Fix loop split incorrect count and probability On Mon, 9 Aug 2021, Xionghu Luo wrote: > Thanks, > > On 2021/8/6 19:46, Richard Biener wrote: > > On Tue, 3 Aug 2021, Xionghu Luo wrote: > > > >> loop split condition is moved between loop1 and loop2, the split bb's > >> count and probability should also be duplicated instead of (100% vs IN= V), > >> secondly, the original loop1 and loop2 count need be propotional from = the > >> original loop. > >> > >> > >> diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c= .151t.lsplit: > >> ... > >> int prephitmp_16; > >> int prephitmp_25; > >> > >> [local count: 118111600]: > >> if (n_7(D) > 0) > >> goto ; [89.00%] > >> else > >> goto ; [11.00%] > >> > >> [local count: 118111600]: > >> return; > >> > >> [local count: 105119324]: > >> pretmp_3 =3D ga; > >> > >> - [local count: 955630225]: > >> + [local count: 315357973]: > >> # i_13 =3D PHI > >> # prephitmp_12 =3D PHI > >> if (prephitmp_12 !=3D 0) > >> goto ; [33.00%] > >> else > >> goto ; [67.00%] > >> > >> - [local count: 315357972]: > >> + [local count: 104068130]: > >> _2 =3D do_something (); > >> ga =3D _2; > >> > >> - [local count: 955630225]: > >> + [local count: 315357973]: > >> # prephitmp_5 =3D PHI > >> i_10 =3D inc (i_13); > >> if (n_7(D) > i_10) > >> goto ; [89.00%] > >> else > >> goto ; [11.00%] > >> > >> [local count: 105119324]: > >> goto ; [100.00%] > >> > >> - [local count: 850510901]: > >> + [local count: 280668596]: > >> if (prephitmp_12 !=3D 0) > >> - goto ; [100.00%] > >> + goto ; [33.00%] > >> else > >> - goto ; [INV] > >> + goto ; [67.00%] > >> > >> - [local count: 850510901]: > >> + [local count: 280668596]: > >> goto ; [100.00%] > >> > >> - [count: 0]: > >> + [local count: 70429947]: > >> # i_23 =3D PHI > >> # prephitmp_25 =3D PHI > >> > >> - [local count: 955630225]: > >> + [local count: 640272252]: > >> # i_15 =3D PHI > >> # prephitmp_16 =3D PHI > >> i_22 =3D inc (i_15); > >> if (n_7(D) > i_22) > >> goto ; [89.00%] > >> else > >> goto ; [11.00%] > >> > >> - [local count: 850510901]: > >> + [local count: 569842305]: > >> goto ; [100.00%] > >> > >> } > >> > >> gcc/ChangeLog: > >> > >> * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > >> (do_split_loop_on_cond): Likewise. > >> --- > >> gcc/tree-ssa-loop-split.c | 16 ++++++++-------- > >> 1 file changed, 8 insertions(+), 8 deletions(-) > >> > >> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c > >> index 3a09bbc39e5..8e5a7ded0f7 100644 > >> --- a/gcc/tree-ssa-loop-split.c > >> +++ b/gcc/tree-ssa-loop-split.c > >> @@ -583,10 +583,10 @@ split_loop (class loop *loop1) > >> basic_block cond_bb; > > if (!initial_true) > - cond =3D fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + cond =3D fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + > + edge true_edge =3D EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE > + ? EDGE_SUCC (bbs[i], 0) > + : EDGE_SUCC (bbs[i], 1); > > >> > >> class loop *loop2 =3D loop_version (loop1, cond, &cond_bb, > >> - profile_probability::always (), > >> - profile_probability::always (), > >> - profile_probability::always (), > >> - profile_probability::always (), > >> + true_edge->probability, > >> + true_edge->probability.invert ()= , > >> + true_edge->probability, > >> + true_edge->probability.invert ()= , > >> true); > > > > there is no 'true_edge' variable at this point. > > Sorry, missed the above hunk when split the patch. > > > > >> gcc_assert (loop2); > >> > >> @@ -1486,10 +1486,10 @@ do_split_loop_on_cond (struct loop *loop1, edg= e invar_branch) > >> initialize_original_copy_tables (); > >> > >> struct loop *loop2 =3D loop_version (loop1, boolean_true_node, NUL= L, > >> - profile_probability::always (), > >> - profile_probability::never (), > >> - profile_probability::always (), > >> - profile_probability::always (), > >> + invar_branch->probability.invert (), > >> + invar_branch->probability, > >> + invar_branch->probability.invert (), > >> + invar_branch->probability, > >> true); > >> if (!loop2) > >> { > > > > The patch introduction seems to talk about do_split_loop_on_cond only. > > split_loop faces similar issue though it sets the two branches to 100% vs= 100% > and no scaling which seems also incorrect. > > > Since loop versioning inserts a condition with the passed probabilities > > but in this case a 'boolean_true_node' condition the then and else > > probabilities passed look correct. It's just the scaling arguments > > that look wrong? This loop_version call should get a comment as to > > why we are passing probabilities the way we do. > > This optimization is transforming from: > > for (i =3D 0; i < n; i =3D inc (i)) > { > if (ga) > ga =3D do_something (); > } > > to: > > for (i =3D 0; i < x; i =3D inc (i)) > { > if (true) > ga =3D do_something (); > if (!ga) > break; > } > for (; i < n; i =3D inc (i)) > { > if (false) > ga =3D do_something (); > } > > > `boolean_true_node` is passed in to make the first loop's condition > statement to be 'true', after returning from loop_version, there is a > piece of code forcing the condition in second loop to be false, > and the original condition is moved from *in loop* to *exit edge* > between loop1 and loop2. Yes, one complication is that we use loop_version but do not retain the CFG it creates. Something like the vectorizers slpeel_tree_duplicate_loop_to_edge_cfg would be a better "fit" but then that code doesn't do any scaling yet. But then loop_version uses cfg_hook_duplicate_loop_to_header_edge and I suppose we could write a variant that simply doesn't mangle the CFG with a new condition switching between both loops but keeps them executed after each other ... As said, this adds to the confusion and some awkwardness. > @fxue may have inputs about this since he contributed it years ago. > > > > > It does seem that scaling the loop by the invar_branch probability > > is correct. Since this does similar things to unswitching, I see > > that unswitching does > > > > prob_true =3D edge_true->probability; > > loop_version (loop, unshare_expr (cond), > > NULL, prob_true, > > prob_true.invert (), > > prob_true, prob_true.invert (), > > false); > > > > which maybe suggests that your invar_branch based passing should > > depend on 'true_invar'? > > > > Also compared to unswitching the first loop is always entered, > > so I wonder if the scaling is correct with respect to that > > given unswitching where only ever one loop is entered? > > > Scaling is based on the probability calculated on "if (ga)", if it is > 33% vs 67% before split, then it is reasonable to scale loop1 to 33% > and loop2 to 67% suppose the branch probability is correct enough? > > unswitch also scaled the two loops based on prob_true, if prob_true > is 50%, it decreases X's count to HALF unexpectedly since it should be > executed on both branches? while loop split still kept execute both firs= t > loop and second loop, it seems even more accurate than loop unswitching? I just was saying that both doing exactly the same looks wrong (on either side). > tree-ssa-loop-unswitch.c: > > while (A) > { > if (inv) > B; > > X; > > if (!inv) > C; > } > > where inv is the loop invariant, into > > if (inv) > { > while (A) > { > B; > X; > } > } > else > { > while (A) > { > X; > C; > } > } Yes, here scaling based on the if (inv) probability looks obviously 100% correct to me. But I might be wrong. And thus the splitting case must be still off (so I conclude). Hmm, in fact I think for the loop unswitching case the scaling of the B and C blocks is off? Those should be considered always executed. Note the loop unswitching pass is altering the conditions condition to static true/false but I don't think it performs further adjustments. That said, likely the profile update cannot be done uniformly for all blocks of a loop? Richard.