From: Dimitrije Milosevic
To: Richard Biener
CC: "gcc-patches@gcc.gnu.org", Djordje Todorovic
Subject: Re: [PATCH 2/2] ivopts: Consider number of invariants when calculating register pressure.
Date: Fri, 28 Oct 2022 13:39:54 +0000
References: <20221021135203.626255-1-dimitrije.milosevic@syrmia.com> <20221021135203.626255-3-dimitrije.milosevic@syrmia.com>

Hi Richard,

> It's n_invs + 2 * n_cands?

Correct, n_invs + 2 * n_cands, my apologies.

> The comment says we want to prefer eliminating IVs over invariants.  Your patch
> undoes that by weighting invariants the same so it does no longer have
> the effect
> of the comment.

I see how my patch may have confused you.
My concern is the "If we have enough registers." case - if we do have
enough registers to store both the invariants and induction variables, I think the cost
should be equal to the sum of those.

I understand that adding another n_cands could be used as a tie-breaker for the two
cases where we do have enough registers and the sum of n_invs and n_cands is equal,
however I think there are two problems with that:
- How often does it happen that we have two cases where we do have enough registers,
n_invs + n_cands sums are equal, and n_cands differ?
I think that's pretty rare.
- Bumping up the cost by another n_cands may make the cost for the "If we do have
enough registers." case higher than for other cases, which doesn't make sense.
I can refer to the test case that I presented in [0] for the second point.
Also worth noting is that the estimate_reg_pressure_cost function (used before c18101f)
follows this:

  /* If we have enough registers, we should use them and not restrict
     the transformations unnecessarily.  */
  if (regs_needed + target_res_regs <= available_regs)
    return 0;

As for preferring to eliminate induction variables where possible, don't we already
do that? For example:

  /* If the number of candidates runs out available registers, we penalize
     extra candidate registers using target_spill_cost * 2.  Because it is
     more expensive to spill induction variable than invariant.  */
  else
    cost = target_reg_cost [speed] * available_regs
           + target_spill_cost [speed] * (n_cands - available_regs) * 2
           + target_spill_cost [speed] * (regs_needed - n_cands);

To clarify, what my patch did was give every case a base cost of
n_invs + n_cands. This base cost gets bumped up accordingly for each
one of the cases (by the amount computed in the "cost = ..." statement
preceding the return statement in the ivopts_estimate_reg_pressure function).
I agree that my patch isn't clear on my intention, and that it also does
not correspond to the comment.
What I could do is just return n_new as the cost for the
"If we do have enough registers."
case, but I would love to hear your
thoughts, if I clarified my intention a little bit.

[0] https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604304.html

Regards,
Dimitrije

From: Richard Biener
Sent: Friday, October 28, 2022 9:38 AM
To: Dimitrije Milosevic
Cc: gcc-patches@gcc.gnu.org; Djordje Todorovic
Subject: Re: [PATCH 2/2] ivopts: Consider number of invariants when calculating register pressure.

On Tue, Oct 25, 2022 at 3:00 PM Dimitrije Milosevic wrote:
>
> Hi Richard,
>
> > don't you add n_invs twice now given
> >
> >   unsigned n_old = data->regs_used, n_new = n_invs + n_cands;
> >   unsigned regs_needed = n_new + n_old, available_regs = target_avail_regs;
> >
> > ?
>
> If you are referring to the "If we have enough registers." case, correct. After c18101f,
> for that case, the returned cost is equal to 2 * n_invs + n_cands.

It's n_invs + 2 * n_cands?  And the comment states the reasoning.

> Before c18101f, for
> that case, the returned cost is equal to n_invs + n_cands. Another solution would be
> to just return n_invs + n_cands if we have enough registers.

The comment says we want to prefer eliminating IVs over invariants.  Your patch
undoes that by weighting invariants the same so it does no longer have
the effect
of the comment.

> Regards,
> Dimitrije
>
>
> From: Richard Biener
> Sent: Tuesday, October 25, 2022 1:07 PM
> To: Dimitrije Milosevic
> Cc: gcc-patches@gcc.gnu.org; Djordje Todorovic
> Subject: Re: [PATCH 2/2] ivopts: Consider number of invariants when calculating register pressure.
>
> On Fri, Oct 21, 2022 at 3:57 PM Dimitrije Milosevic wrote:
> >
> > From: Dimitrije Milošević
> >
> > This patch slightly modifies register pressure model function to consider
> > both the number of invariants and the number of candidates, rather than
> > just the number of candidates. This used to be the case before c18101f.
>
> don't you add n_invs twice now given
>
>   unsigned n_old = data->regs_used, n_new = n_invs + n_cands;
>   unsigned regs_needed = n_new + n_old, available_regs = target_avail_regs;
>
> ?
>
> > gcc/ChangeLog:
> >
> >         * tree-ssa-loop-ivopts.cc (ivopts_estimate_reg_pressure): Adjust.
> >
> > Signed-off-by: Dimitrije Milosevic
> > ---
> >  gcc/tree-ssa-loop-ivopts.cc | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> > index d53ba05a4f6..9d0b669d671 100644
> > --- a/gcc/tree-ssa-loop-ivopts.cc
> > +++ b/gcc/tree-ssa-loop-ivopts.cc
> > @@ -6409,9 +6409,9 @@ ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
> >            + target_spill_cost [speed] * (n_cands - available_regs) * 2
> >            + target_spill_cost [speed] * (regs_needed - n_cands);
> >
> > -  /* Finally, add the number of candidates, so that we prefer eliminating
> > -     induction variables if possible.  */
> > -  return cost + n_cands;
> > +  /* Finally, add the number of invariants and the number of candidates,
> > +     so that we prefer eliminating induction variables if possible.  */
> > +  return cost + n_invs + n_cands;
> >  }
> >
> >  /* For each size of the induction variable set determine the penalty.  */
> > --
> > 2.25.1
> >
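
For reference, the three variants discussed in this thread can be compared with a minimal sketch. It models only the "enough registers" case and assumes, as the figures quoted above imply, that the cost in that case is n_new before the final addition. The function names are hypothetical and do not exist in GCC.

```c
/* Illustrative sketch only, not GCC code.  The function names are
   hypothetical, and only the "enough registers" case is modelled.  */

/* After c18101f: cost = n_new, then n_cands is added on return.  */
static unsigned
cost_after_c18101f (unsigned n_invs, unsigned n_cands)
{
  unsigned n_new = n_invs + n_cands;
  return n_new + n_cands;            /* n_invs + 2 * n_cands */
}

/* With the patch quoted above: n_invs + n_cands is added on return.  */
static unsigned
cost_with_patch (unsigned n_invs, unsigned n_cands)
{
  unsigned n_new = n_invs + n_cands;
  return n_new + n_invs + n_cands;   /* 2 * n_invs + 2 * n_cands */
}

/* Alternative floated in the reply: return n_new alone.  */
static unsigned
cost_n_new_only (unsigned n_invs, unsigned n_cands)
{
  return n_invs + n_cands;
}
```

For the pairs (n_invs = 3, n_cands = 1) and (n_invs = 1, n_cands = 3), the first variant yields 5 and 7, so the set with fewer candidates wins. The other two variants yield equal costs for both pairs (8 and 8, 4 and 4), which is the loss of the tie-breaker that Richard points out.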