From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from outbound.mail.eo.outlook.com (mail-oln040093003012.outbound.protection.outlook.com [40.93.3.12]) by sourceware.org (Postfix) with ESMTPS id 8ED153858002 for ; Mon, 23 Aug 2021 00:44:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 8ED153858002 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JWeQrxHRpsQ6cCVgErPgdkWCU2SC1Wh/XdSdE8OqVQMrdk/sF55mo/BNc3xVf2L/1sqHhcp/mtbwY8SWgZ1WaN/1VpVt9six/p4UK/rxHF+fFkq0NOmx0ZdetU5F1Yxx7nNu/EoHgJ48ZN1qQaE/bFBEsmUm3toFSSMIpang27cY7hCLcmG3FqtK9Xw0JDBRPGrc6LjHD09zeZtaJKtMNvAxKlUU3ATdG9Ep8PO3upOSO7rS/u8Ebo46Or3quTH21nEX3yBZAtvkkKvBPfteuY3+o5TuStTgCliO16aV4o5LLVZAzIREomITfnnbpzlmZhiVRBx7l/ui2ThpUvkW3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=deiuO3hrU0TAJHaTk04uVT5frdfCRjnSyCkhtzboSTI=; b=PiGu0ryXCDhUNVxdbX6OcTQ4RvshSPCLCJAPRVpFZX/TsCQ3zpRKmseV6HTXrCFbFEAx3ZERwoRA0SPcG7Aev6LswpQ64Oh9GEQggG3yCSepmi6bIWZkQuhtsymHJa7lbnbyVtpJtYVEbMmku16a6YlpCWF+14Gd87eMbw72lfSvVHT9xzv9bzERPj4oFs7JY42un3i5VgNeCcgUowoG7U9UVYeosPgMuWtMGD44UYA3TDII6bIsC5vg02l9MmOoaDcq5uQpZsvWDldi4CEC6JOtxWxB7vAThyM1ZV/RMki2WNJ7ZOufMHU3sUVw5TatRrl4krBdDx8IV26pku+pTw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=microsoft.com; dmarc=pass action=none header.from=microsoft.com; dkim=pass header.d=microsoft.com; arc=none Received: from MWHPR2101MB0811.namprd21.prod.outlook.com (2603:10b6:301:7b::39) by MW4PR21MB1940.namprd21.prod.outlook.com (2603:10b6:303:72::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4457.1; Mon, 23 Aug 2021 00:44:00 +0000 Received: from MWHPR2101MB0811.namprd21.prod.outlook.com ([fe80::ddce:fd89:895a:ad37]) by MWHPR2101MB0811.namprd21.prod.outlook.com ([fe80::ddce:fd89:895a:ad37%8]) with mapi id 15.20.4457.003; Mon, 23 Aug 2021 00:44:00 +0000 From: Victor Tong To: Richard Biener , Marc Glisse CC: "gcc-patches@gcc.gnu.org" Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176] Thread-Topic: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176] Thread-Index: AQHXJoDrR4fklL4IYku64pH3Fv0p3qrIMcqAgDljMP+ABwtigIAO0uy/gAKMnwCAAg2QAIACfg6AgAwMpn6ALklBAIAO4VisgAAa8ICAA99yAIAVZQA5 Date: Mon, 23 Aug 2021 00:44:00 +0000 Message-ID: References: <1e409dd-c5a8-a59f-d7ba-86ae805ab54f@hippo.saclay.inria.fr> <688987e-6fa7-9526-5068-5fae14b0c56b@hippo.saclay.inria.fr> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: msip_labels: MSIP_Label_f42aa342-8706-4288-bd11-ebb85995028c_Enabled=True; MSIP_Label_f42aa342-8706-4288-bd11-ebb85995028c_SiteId=72f988bf-86f1-41af-91ab-2d7cd011db47; MSIP_Label_f42aa342-8706-4288-bd11-ebb85995028c_SetDate=2021-08-23T00:43:59.195Z; MSIP_Label_f42aa342-8706-4288-bd11-ebb85995028c_Name=General; MSIP_Label_f42aa342-8706-4288-bd11-ebb85995028c_ContentBits=0; MSIP_Label_f42aa342-8706-4288-bd11-ebb85995028c_Method=Standard; x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 8e778671-f979-4cd4-2807-08d965cf191e x-ms-traffictypediagnostic: MW4PR21MB1940: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: 3dfsyHy3eQbZP+SBrujlQP30htej0f80MTF1Xo7h8Zbg3j5L8NI98MWkzHkbT/CcuxpxT2m+6l0Dsmr1Qpot8qJSeTR4M6wlITkOVcpQ6Clx5sogetTaMlD7Vbw0tV7Un+WmqLlSyn0f1DMV0qr03GxWwIu33EavFeaKouxkjyy8/178oM9qv1VuGzptXIcvO3AE9AJia1+nN/xEzyIE2yXXeV0W4lsyAKZ+lVptOsUOiTL2RUdMQzuG4ccNDfm+ShfCi3QnkgpIajehVeKemp8Ew7/5fiZCXsCNLBJphktGbzuo341OS1kHbCrtsCNMQgfJhNneNz1I1ItFkGOurCNUooXQ741bYWDkgfz27W/czb64vDZ/F6LcRb3peqeDzwNRYexaKS7tykC5drKL03DMBu4U6rchUjSkIina6y0QJE2D8z9a0V1Oh8N5Wqx7kGuAazEpMW8JnuDJt6NyivoHAJ+SVugNGtKmpdOZ4vmWp8R1sPZT4+fUYu3Mt0alow/tFCh8QV2CVDR1sTjPMWpRXAnrrIPYuPL3C/3+Znz4C1SiJHN7l1F0HEjFxwOpUcrDii6bXpnq+c/b1TITFg5WCRi1fZpb4NSJys1dD8RvTQsZAXZU60nRc8MbOHqJxMtb+ZUxjibBk7617DdWSPQejED+a2OkIfGdxhF6TIt6wTXRPV8ONTmNBKKIUgAvnUdHIC5VGKHREjjdzORho/o09edOCB/mN2HHODZ7k5QtUnTgXpn6m6bz0BuzF0ujbvA2LQ9EyHBio1ZQYZVSSoH3sq5mTtER7N7vwoinyqY= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:MWHPR2101MB0811.namprd21.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(366004)(10290500003)(4326008)(508600001)(66946007)(2906002)(55016002)(8676002)(30864003)(86362001)(33656002)(7696005)(52536014)(64756008)(316002)(8936002)(38070700005)(66446008)(82960400001)(122000001)(26005)(66556008)(966005)(76116006)(8990500004)(71200400001)(82950400001)(6506007)(53546011)(38100700002)(186003)(9686003)(66476007)(5660300002)(110136005)(83380400001); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?iso-8859-1?Q?gK5SUvEiGZmZdTWAyuH3/9phEl/vT21nE4NvHQGTpSBM4K5bMUeiTyv7TN?= =?iso-8859-1?Q?sFYIPmxvPmR8Qa1iQ7p8EBGBwWbjf1ad1atnNO7xiHoLkCdGnxTkoozGYn?= =?iso-8859-1?Q?Z80jQpKofGCOptqkB8Udrvs6jrEhATpp2vvPi6KetXilo4f2DtIhMxZxhr?= =?iso-8859-1?Q?pTUUdLLPbnZBiK35Gbbi/ypXg++h7ui6BbZo8v0NWy2mq8iQ9+QibBAmQ3?= =?iso-8859-1?Q?uBVlQPTuIlz+I+Zuwp4pgDh0bx4oRzNBYqvj7SrUnbajqBomJicFo6xnSV?= =?iso-8859-1?Q?qSgON1/gyaHrIWJILIZCnWazvHNidF9advm3rIFsA4kXDWA/lRodtjqsjZ?= =?iso-8859-1?Q?Sp1Fz13G8V4g08M7BX0zhKljoL1cQfrYJ3M3gDF9tJurWBwwHi8HJBQ7oV?= =?iso-8859-1?Q?xFFPq84m0MkXMUg/K6j1f/Rr41YxNUaxqvgpqXQrJO6EiW1U28G3ndQ0mU?= =?iso-8859-1?Q?cjUcOECQpoRz+OEIY6VcSSBTrdtNZH63GLn/y0xZvnF3GkpgVjsilVyOzY?= =?iso-8859-1?Q?QrzPxns9EQe7VVplLEX+exS+NeF8JVWludvggAYWpzMQTYxi5cMwS/jXXC?= =?iso-8859-1?Q?kiF7PC8czF5c8DTPwz4U1ddRQ1uGEB91VTL34uE22YxxTFKSLFXVwMU0Wz?= =?iso-8859-1?Q?OvpowYre7W4P3txQXWW/5kBK/JMJk0WE7GTIFbmzpce+JkdGVSS3RdgYD3?= =?iso-8859-1?Q?Kbp+e4fi5NGKvENLU64jmbLKgAI/2pOXfjyWpMeqnXQTwzRQit8LUdY3iN?= =?iso-8859-1?Q?JJCCdOr37hTgByZA94nxiKSV8WjYrlCvn3+2NVrmJCW7ShaE6zX8sqEXOz?= =?iso-8859-1?Q?rF7FAe3iNc3PrhNSsI4zJKPOpdc9AlTk2rQcIDjZw8OTqmiBrWNpQdvg3C?= =?iso-8859-1?Q?5xUpyYlQBGMzq+DDcYUkIHQhqnc4oeeGq0lWco1hk4iv18r/tPTJlzVTn7?= =?iso-8859-1?Q?3Gpsq3e8KV8AidHAIyu5kzFA1AE+gkAziG6IhArRkmAV3oTVoSK1VE4tKn?= =?iso-8859-1?Q?lt0lrbeepcJjElSPOO36n5kJzlj/W1fifpwh38nRQ2P1WHQZQiZKjhMf8G?= =?iso-8859-1?Q?6IkUQnrpG0rmAJTrTJEX5cXShNr74MqX81tC4bWpWlNu2wmpyIh17H51Vf?= =?iso-8859-1?Q?qNx/tdA+F5KZStTwx6NvE2GF7gnrk3wQeazUeK2Y18GbfUh5DSViYjofUb?= =?iso-8859-1?Q?fi5NhexZpo0iDRiftlffWlVYhLSK+22W7WR1+osvBhNb7hU0rUMLIqaS+D?= =?iso-8859-1?Q?oRUav5W5az7Hm4rwk2qpoFT8FrPykh9O0vknMRJVRoPtYNKkAqzGqs4Scb?= =?iso-8859-1?Q?57ulhg40pVRo5MBCbCuQJ3U30ygWb8HdUld0/FxDiyP9QS0AjMHb4IdnF2?= =?iso-8859-1?Q?tkzAxGb59v?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: microsoft.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: MWHPR2101MB0811.namprd21.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8e778671-f979-4cd4-2807-08d965cf191e X-MS-Exchange-CrossTenant-originalarrivaltime: 23 Aug 2021 00:44:00.3128 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 72f988bf-86f1-41af-91ab-2d7cd011db47 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: B/un0JyjjTKswXPgmxDLcx5vuf/yyiRFhgNUz/KDqn/zUdWqYpU2zUGgv14XwwMw1YeFrBLTPTnl/4CXSxLoVQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW4PR21MB1940 X-Spam-Status: No, score=-1.2 required=5.0 tests=BAYES_00, BODY_8BITS, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_NONE, TXREP, T_SPF_HELO_TEMPERROR autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Aug 2021 00:44:07 -0000 Thanks for the feedback. I updated the pattern and it passes all tests (exi= sting and the new ones I wrote). I added some brackets since there were som= e warnings about missing brackets on the || and &&. Here's the updated patt= ern:=0A= =0A= (simplify=0A= (minus (convert1? @0) (convert2? (minus@2 (convert3? @@0) @1)))=0A= (if (INTEGRAL_TYPE_P (type)=0A= && !TYPE_OVERFLOW_SANITIZED(type)=0A= && INTEGRAL_TYPE_P (TREE_TYPE(@2))=0A= && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@2))=0A= && INTEGRAL_TYPE_P (TREE_TYPE(@0))=0A= && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))=0A= && =0A= (TYPE_PRECISION (type) <=3D TYPE_PRECISION (TREE_TYPE (@2)) ||=0A= (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@2)) &&=0A= (TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (TREE_TYPE (= @2)) ||=0A= (TYPE_PRECISION (TREE_TYPE (@0)) =3D=3D TYPE_PRECISION (TREE_T= YPE (@2)) &&=0A= !TYPE_UNSIGNED (TREE_TYPE (@0)))=0A= ))))=0A= (convert @1)))=0A= =0A= =0A= >>> TYPE_PRECISION (type) <=3D TYPE_PRECISION (TREE_TYPE (@2))=0A= =0A= Did you mean > instead of <=3D? With the condition you proposed, that would= trigger the optimization in cases where values may get truncated which I t= hink should be avoided for this optimization.=0A= =0A= >>> Maybe the new transform could be about scalars, and we could restrict t= he old one to vectors, to simplify the code,=0A= =0A= I tried limiting the existing pattern to vector types by changing it to the= following:=0A= =0A= (simplify=0A= (minus @0 (nop_convert1? (minus (nop_convert2? @0) @1)))=0A= (if (VECTOR_TYPE_P(type))=0A= (view_convert @1)))=0A= =0A= I found that the new pattern doesn't cover some cases. Specifically, it doe= sn't cover a case in pr92734-2.c:=0A= =0A= unsigned=0A= f10 (unsigned x, int y)=0A= {=0A= unsigned a =3D (int) x - y;=0A= return x - a;=0A= }=0A= =0A= I think the pattern isn't triggering because of the !TYPE_UNSIGNED (TREE_TY= PE (@0)) check. I'm slightly concerned that changing the new pattern to cov= er the existing cases would add complexity to the new pattern, making it di= fficult to understand.=0A= =0A= I also think the new pattern could be simplified by removing the convert on= @0. I don't think it's needed for the regression pattern that I was seeing= , but I had added it to be more thorough so the pattern covers more cases. = =0A= =0A= From: Richard Biener =0A= Sent: Monday, August 9, 2021 2:58 AM=0A= To: Marc Glisse =0A= Cc: Victor Tong ; gcc-patches@gcc.gnu.org =0A= Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division fo= llowed by multiply [PR95176] =0A= =A0=0A= On Sat, Aug 7, 2021 at 12:49 AM Marc Glisse wrote:= =0A= >=0A= > On Fri, 6 Aug 2021, Victor Tong wrote:=0A= >=0A= > > Thanks for the feedback. Here's the updated pattern:=0A= > >=0A= > >=A0 /* X - (X - Y) --> Y */=0A= > >=A0 (simplify=0A= > >=A0=A0=A0 (minus (convert1? @0) (convert2? (minus@2 (convert3? @@0) @1))= )=0A= > >=A0=A0=A0 (if (ANY_INTEGRAL_TYPE_P (type)=0A= > >=A0=A0=A0=A0=A0=A0=A0 && TYPE_OVERFLOW_UNDEFINED(type)=0A= > >=A0=A0=A0=A0=A0=A0=A0 && !TYPE_OVERFLOW_SANITIZED(type)=0A= > >=A0=A0=A0=A0=A0=A0=A0 && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@2))=0A= > >=A0=A0=A0=A0=A0=A0=A0 && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@2))=0A= > >=A0=A0=A0=A0=A0=A0=A0 && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@2))=0A= > >=A0=A0=A0=A0=A0=A0=A0 && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@0))=0A= > >=A0=A0=A0=A0=A0=A0=A0 && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0))=0A= > >=A0=A0=A0=A0=A0=A0=A0 && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))=0A= > >=A0=A0=A0=A0=A0=A0=A0 && TYPE_PRECISION (TREE_TYPE (@2)) <=3D TYPE_PRECI= SION (type)=0A= > >=A0=A0=A0=A0=A0=A0=A0 && TYPE_PRECISION (TREE_TYPE (@0)) <=3D TYPE_PRECI= SION (type))=0A= > >=A0=A0=A0 (convert @1)))=0A= > >=0A= > > I kept the TYPE_OVERFLOW_SANITIZED checks because I saw other patterns = that leverage undefined overflows check for it. I think this new pattern sh= ouldn't be applied if overflow sanitizer checks are enabled.=0A= > >=0A= > >>> why is this testing TREE_TYPE (@0)?=0A= > >=0A= > > I'm checking the type of @0 because I'm concerned that there could be a= case where @0's type isn't an integer type with undefined overflow. I trie= d creating a test case and couldn't seem to create one where this is violat= ed but I kept the checks to avoid causing a regression. If I'm being overca= utious and you feel that the type checks on @0 aren't needed, I can remove = them. I think the precision check on TREE_TYPE(@0) is needed to avoid trunc= ation cases though.=0A= >=0A= > It doesn't matter if @0 has undefined overflow, but it can matter that it= =0A= > be signed (yes, the 2 are correlated...) if it has the same precision as= =0A= > @2. Otherwise (int64_t)(-1u)-(int64_t)((int)(-1u)-0) is definitely not 0= =0A= > and it has type:int64_t, @2:int, @0:unsigned.=0A= >=0A= > Ignoring the sanitizer, the confusing double matching of constants, and= =0A= > restricting to scalars, I think the tightest condition (without vrp) that= =0A= > works is=0A= >=0A= > TYPE_PRECISION (type) <=3D TYPE_PRECISION (TREE_TYPE (@2)) ||=0A= >=A0=A0 TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@2)) &&=0A= >=A0=A0=A0 (TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (TREE_TYPE (@2= )) ||=0A= >=A0=A0=A0=A0 TYPE_PRECISION (TREE_TYPE (@0)) =3D=3D TYPE_PRECISION (TREE_T= YPE (@2)) &&=0A= >=A0=A0=A0=A0 !TYPE_UNSIGNED (TREE_TYPE (@0))=0A= >=A0=A0=A0 )=0A= >=0A= > (where implicitly undefined =3D> signed) but of course it is ok to handle= =0A= > only a subset. It is too late for me to think about @0 vs @@0.=0A= >=0A= > The patch you posted seems to accept the wrong=0A= > (int128t)LLONG_MAX-(int128_t)((int)LLONG_MAX-0) -> (int128_t)0=0A= >=0A= > Using ANY_INTEGRAL_TYPE_P allows vectors, and TYPE_PRECISION mustn't be= =0A= > used for vectors (maybe we should add more checking to catch such uses),= =0A= > and they may not be very happy with convert either...=0A= >=0A= > >>> Once we'd "inline" nop_convert genmatch would complain about this.=0A= >=0A= > Maybe the new transform could be about scalars, and we could restrict the= =0A= > old one to vectors, to simplify the code,=0A= =0A= Hmm, I guess that might indeed be a good idea.=0A= =0A= > but that wouldn't help genmatch=0A= > with the fact that the pattern x-(x-y) would appear twice. Is it really= =0A= > that bad? It is suspicious, but can be justified.=0A= =0A= I think genmatch will either complain or might end up generating=0A= unreachable code.=0A= Note with the present patch there's probably order forcing patterns inbetwe= en so=0A= it could simply work.=A0 Otherwise whether it works or not (as expected)=0A= depends on placing=0A= positions of the patterns ...=0A= =0A= So no, it's not inherently "bad" but it's not designed to work.=0A= =0A= > > Is someone working on inlining nop_convert? I'd like to avoid breaking = someone else's work if that's being worked on right now.=0A= > >=0A= > > I've also added some extra tests to cover this new pattern. I've attach= ed a patch with my latest changes.=0A= > >=0A= > >=0A= > > From: Richard Biener =0A= > > Sent: Wednesday, July 28, 2021 2:59 AM=0A= > > To: Victor Tong =0A= > > Cc: Marc Glisse ; gcc-patches@gcc.gnu.org =0A= > > Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize divisio= n followed by multiply [PR95176]=0A= > >=0A= > > On Tue, Jun 29, 2021 at 1:10 AM Victor Tong wrot= e:=0A= > >>=0A= > >> Thanks Richard and Marc.=0A= > >>=0A= > >> I wrote the following test case to compare the outputs of fn1() and fn= 1NoOpt() below with my extra pattern being applied. I tested the two functi= ons with all of the integers from INT_MIN to INT_MAX.=0A= > >>=0A= > >> long=0A= > >> fn1 (int x)=0A= > >> {=0A= > >>=A0=A0=A0 return 42L - (long)(42 - x);=0A= > >> }=0A= > >>=0A= > >> #pragma GCC push_options=0A= > >> #pragma GCC optimize ("O0")=0A= > >> long=0A= > >> fn1NoOpt (int x)=0A= > >> {=0A= > >>=A0=A0=A0 volatile int y =3D (42 - x);=0A= > >>=A0=A0=A0 return 42L - (long)y;=0A= > >> }=0A= > >> #pragma GCC pop_options=0A= > >>=0A= > >> int main ()=0A= > >> {=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 for (long i=3DINT_MIN; i<=3DINT_MAX;i++)=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 {=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 auto valNoOpt =3D f= n1NoOpt(i);=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 auto valOpt =3D fn1= (i);=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 if (valNoOpt !=3D v= alOpt)=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0= =A0=A0 printf("valOpt=3D%ld, valNoOpt=3D%ld\n", valOpt, valNoOpt);=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 }=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 return 0;=0A= > >> }=0A= > >>=0A= > >> I saw that the return values of fn1() and fn1NoOpt() differed when the= input was between INT_MIN and INT_MIN+42 inclusive. When passing values in= this range to fn1NoOpt(), a signed overflow is triggered which causes the = value to differ (undefined behavior). This seems to go in line with what Ma= rc described and I think the transformation is correct in the scenario abov= e. I do think that type casts that result in truncation (i.e. from a higher= precision to a lower one) or with unsigned types will result in an incorre= ct transformation so those scenarios need to be avoided.=0A= > >>=0A= > >> Given that the extra pattern I'm adding is taking advantage the undefi= ned behavior of signed integer overflow, I'm considering keeping the existi= ng nop_convert pattern in place and adding a new pattern to cover these new= cases. I'd also like to avoid touching nop_convert given that it's used in= a number of other patterns.=0A= > >>=0A= > >> This is the pattern I have currently:=0A= > >>=0A= > >>=A0=A0=A0 (simplify=0A= > >>=A0=A0=A0=A0=A0 (minus (convert1? @0) (convert2? (minus (convert3? @2) = @1)))=0A= > >>=A0=A0=A0=A0=A0 (if (operand_equal_p(@0, @2, 0)=0A= > >=0A= > > The operand_equal_p should be reflected by using @0 in place of @2.=0A= > >=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && INTEGRAL_TYPE_P (type)=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && TYPE_OVERFLOW_UNDEFINED(type)=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && !TYPE_OVERFLOW_SANITIZED(type)=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && INTEGRAL_TYPE_P (TREE_TYPE(@1))=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@1))= =0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@1))= =0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && !TYPE_UNSIGNED (TREE_TYPE (@1))=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && !TYPE_UNSIGNED (type)=0A= > >=0A= > > please group checks on common argument.=A0 I think a single test=0A= > > on INTEGRAL_TYPE_P (type) is enough, it could be ANY_INTEGRAL_TYPE_P=0A= > > to include vector and complex types.=0A= > >=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && TYPE_PRECISION (TREE_TYPE (@1)) <=3D TYP= E_PRECISION (type)=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && INTEGRAL_TYPE_P (TREE_TYPE(@0))=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0))= =0A= > >=0A= > > why is this testing TREE_TYPE (@0)?=A0 The outer subtract is using 'typ= e',=0A= > > the inner subtract uses TREE_TYPE (@1) though you could place=0A= > > a capture on the minus to make what you test more obvious, like=0A= > >=0A= > >=A0=A0 (minus (convert1? @0) (convert2? (minus@3 (convert3? @2) @1)))=0A= > >=0A= > > TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@3))=0A= > >=0A= > > there's one set of checks too much I think.=0A= > >=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))= =0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && !TYPE_UNSIGNED (TREE_TYPE (@0))=0A= > >=0A= > > we only ever have TYPE_OVERFLOW_UNDEFINED on singed types so=0A= > > the !TYPE_UNSIGNED checks are superfluous.=0A= > >=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && TYPE_PRECISION (TREE_TYPE (@0)) <=3D TYP= E_PRECISION (type)=0A= > >>=A0=A0=A0=A0=A0=A0=A0=A0=A0 && TREE_TYPE(@1) =3D=3D TREE_TYPE(@2))=0A= > >=0A= > > This check means that convert3 is never present since a MINUS requires= =0A= > > compatible types.=0A= > >=0A= > > Sorry for the late reply.=0A= > >=0A= > > Note the pattern will necessarily be partly redundant with the=0A= > >=0A= > >=A0=A0 (simplify=0A= > >=A0=A0=A0 (minus (nop_convert1? (minus (nop_convert2? @0) @1)) @0)=0A= > >=A0=A0=A0 (if (!ANY_INTEGRAL_TYPE_P (type)=0A= > >=A0=A0=A0=A0=A0=A0=A0=A0 || TYPE_OVERFLOW_WRAPS (type))=0A= > >=A0=A0=A0 (negate (view_convert @1))=0A= > >=A0=A0=A0 (view_convert (negate @1))))=0A= > >=0A= > > pattern.=A0 Once we'd "inline" nop_convert genmatch would complain=0A= > > about this.=0A= > >=0A= > >>=A0=A0=A0=A0=A0 (convert @1)))=0A= > >>=0A= > >> Is there a more concise/better way of writing the pattern? I was looki= ng for similar checks in match.pd and I couldn't find anything that I could= leverage.=0A= > >>=0A= > >> I also kept my pattern to the specific scenario I'm seeing with the re= gression to lower the risk of something breaking. I've limited @1 and @2 to= have the same type.=0A= > >>=0A= > >> I'm also in favor of adding/running computer verification to make sure= the transformation is legal. I've written some tests to verify that the pa= ttern is being applied in the right scenarios and not being applied in othe= rs, but I think there are too many possibilities to manually write them all= . Is there anything in GCC that can be used to verify that match.pd transfo= rmations are correct? I'm thinking of something like Alive https://nam06.sa= felinks.protection.outlook.com/?url=3Dhttps%3A%2F%2Fgithub.com%2FAliveToolk= it%2Falive2&data=3D04%7C01%7Cvitong%40microsoft.com%7C1abb4ed5e6e745eb7= 56a08d95b1c3d72%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63764099913359= 4608%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik= 1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=3DolTfA5yAjtKs9Sckdnwn4iRGSynb7wX0Xx6= uKrUNHhs%3D&reserved=3D0.=0A= > >>=0A= > >> Thanks,=0A= > >> Victor=0A= > >>=0A= > >>=0A= > >>=0A= > >> From: Richard Biener =0A= > >> Sent: Monday, June 21, 2021 12:08 AM=0A= > >> To: Marc Glisse =0A= > >> Cc: Victor Tong ; gcc-patches@gcc.gnu.org =0A= > >> Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize divisi= on followed by multiply [PR95176]=0A= > >>=0A= > >> On Sat, Jun 19, 2021 at 7:05 PM Marc Glisse wro= te:=0A= > >> >=0A= > >> > On Fri, 18 Jun 2021, Richard Biener wrote:=0A= > >> >=0A= > >> > >> Option 2: Add a new pattern to support scenarios that the existin= g nop_convert pattern bails out on.=0A= > >> > >>=0A= > >> > >> Existing pattern:=0A= > >> > >>=0A= > >> > >> (simplify=0A= > >> > >>=A0=A0=A0 (minus (nop_convert1? @0) (nop_convert2? (minus (nop_con= vert3? @@0) @1)))=0A= > >> > >>=A0=A0=A0 (view_convert @1))=0A= > >> >=0A= > >> > I tried to check with a program when=0A= > >> >=0A= > >> > T3 x;=0A= > >> > T1 y;=0A= > >> > (T2)x-(T2)((T1)x-y)=0A= > >> >=0A= > >> > can be safely replaced with=0A= > >> >=0A= > >> > (T2)y=0A= > >> >=0A= > >> > From the output, it looks like this is safe when T1 is at least as l= arge=0A= > >> > as T2. It is wrong when T1 is unsigned and smaller than T2. And when= T1 is=0A= > >> > signed and smaller than T2, it is ok if T3 is the same type as T1 (s= igned=0A= > >> > then) or has strictly less precision (any sign), and not in other ca= ses.=0A= > >> >=0A= > >> > Note that this is when signed implies undefined overflow and unsigne= d=0A= > >> > implies wrapping, and I wouldn't put too much faith in this recently= =0A= > >> > dusted program. And it doesn't say how to write the match.pd pattern= with=0A= > >> > '?', "@@", disabling it if TYPE_OVERFLOW_SANITIZED, etc.=0A= > >> >=0A= > >> > Mostly, I wanted to say that if we are going to go handle more than= =0A= > >> > nop_convert for more than just 1 or 2 easy transformations, I think = some=0A= > >> > kind of computer verification would be useful, it would save a lot o= f time=0A= > >> > and headaches.=0A= > >>=0A= > >> True.=A0 I wonder if auto-generating such tests from match.pd rules wo= uld=0A= > >> be a good project to work on (GSoC?).=A0 When there's define_match=0A= > >> involved things get a little tricky, but then one item on the TODO lis= t=0A= > >> is "inlining" those at the use site (exploding the decision tree even = more).=0A= > >>=0A= > >> Richard.=0A= > >>=0A= > >> > (I just check by brute force all possible precisions (from 1 to 6) a= nd=0A= > >> > signedness for T1, T2 and T3, all possible values for x and y, compu= te=0A= > >> > the before and after formulas, accepting if there is UB before, reje= cting=0A= > >> > if there is UB after (and not before), and manually try to see a pat= tern=0A= > >> > in the list of types that work)=0A= > >> >=0A= > >> > --=0A= > >> > Marc Glisse=0A= >=0A= > --=0A= > Marc Glisse=