From: Tamar Christina
To: Richard Biener
CC: "gcc-patches@gcc.gnu.org", nd
Subject: RE: [PATCH]middle-end Add an RPO pass after successful vectorization
Date: Tue, 2 Nov 2021 14:36:15 +0000
References: <9nnp8so9-p3nq-r26-3098-s96334191030@fhfr.qr>
In-Reply-To: <9nnp8so9-p3nq-r26-3098-s96334191030@fhfr.qr>
List-Id: Gcc-patches mailing list

> -----Original Message-----
> From: Richard Biener
> Sent: Tuesday, November 2, 2021 2:24 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [PATCH]middle-end Add an RPO pass after successful
> vectorization
>
> On Tue, 2 Nov 2021, Tamar Christina wrote:
>
> > Hi All,
> >
> > Following my current SVE predicate optimization series a problem has
> > presented itself in that the way vector masks are generated for masked
> > operations relies on CSE to share masks efficiently.
> >
> > The issue however is that masking is done using the & operand and & is
> > associative and so reassoc decides to reassociate the masked operations.
>
> But it does this for the purpose of canonicalization and thus CSE.

Yes, but it turns something like

  (a & b) & mask

into

  a & (b & mask)

When (a & b) is used somewhere else you now lose the CSE. So it's actually
hurting in this case.

>
> > This makes CSE then unable to CSE an unmasked and a masked operation
> > leading to duplicate operations being performed.
> >
> > To counter this we want to add an RPO pass over the vectorized loop
> > body when vectorization succeeds. This makes it then no longer
> > reliant on the RTL level CSE.
> >
> > I have not added a testcase for this as it requires the changes in my
> > patch series, however the entire series relies on this patch to work
> > so all the tests there cover it.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-linux-gnu and
> > no issues.
> >
> > Ok for master?
>
> You are running VN over _all_ loop bodies rather only those vectorized.
> We loop over vectorized loops earlier for optimizing masked store sequences.
> I suppose you could hook in there. I'll also notice that we have
> pass_pre_slp_scalar_cleanup which eventually runs plus we have a late FRE.
> So I don't understand why it doesn't work to CSE later.
>

At the moment, say you have the conditions a > b, and a > b & a > c.

We generate

  mask1 = (a > b) & loop_mask
  mask2 = (a > b & a > c) & loop_mask

with the intention that mask1 can be re-used in mask2.

Reassoc changes this to

  mask2 = a > b & (a > c & loop_mask)

which has now unmasked (a > b) in mask2, and that leaves us unable to combine
mask1 and mask2. It doesn't generate incorrect code, just inefficient code.
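To make the point above concrete, here is a minimal standalone C sketch. It is not GCC code: small integer bitmasks stand in for vector predicates, every function name is made up for the illustration, and the "shared" form is only an assumption about how mask1 could be reused. It shows that the emitted form, the reassociated form, and a form that reuses mask1 all compute the same value, so the only difference is which subexpression is left available for CSE.

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: 4-bit values held in uint8_t stand in for vector
   predicates.  None of these names exist in GCC.  */

/* mask1 = (a > b) & loop_mask  */
uint8_t mask1 (uint8_t a_gt_b, uint8_t loop_mask)
{
  return a_gt_b & loop_mask;
}

/* Form as generated: ((a > b) & (a > c)) & loop_mask  */
uint8_t mask2_emitted (uint8_t a_gt_b, uint8_t a_gt_c, uint8_t loop_mask)
{
  return (a_gt_b & a_gt_c) & loop_mask;
}

/* Form after reassociation: (a > b) & ((a > c) & loop_mask).
   The unmasked (a > b) is used directly, so mask1 no longer appears.  */
uint8_t mask2_reassoc (uint8_t a_gt_b, uint8_t a_gt_c, uint8_t loop_mask)
{
  return a_gt_b & (a_gt_c & loop_mask);
}

/* Assumed sharing-friendly form: mask1 & (a > c), reusing mask1.  */
uint8_t mask2_shared (uint8_t a_gt_b, uint8_t a_gt_c, uint8_t loop_mask)
{
  return mask1 (a_gt_b, loop_mask) & a_gt_c;
}

/* Checks every 4-bit combination; asserts that all three forms agree
   and returns the number of combinations checked.  */
unsigned check_all_forms (void)
{
  unsigned checked = 0;
  for (unsigned ab = 0; ab < 16; ab++)
    for (unsigned ac = 0; ac < 16; ac++)
      for (unsigned lm = 0; lm < 16; lm++)
        {
          uint8_t want = mask2_emitted (ab, ac, lm);
          assert (mask2_reassoc (ab, ac, lm) == want);
          assert (mask2_shared (ab, ac, lm) == want);
          checked++;
        }
  return checked;
}
```

Calling check_all_forms () from a test driver exercises all combinations. The values always match because & is associative and commutative, which is why the reassociated code is correct but can end up computing (a > b) & loop_mask a second time.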

>   for (i = 1; i < number_of_loops (cfun); i++)
>     {
>       loop_vec_info loop_vinfo;
>       bool has_mask_store;
>
>       loop = get_loop (cfun, i);
>       if (!loop || !loop->aux)
>         continue;
>       loop_vinfo = (loop_vec_info) loop->aux;
>       has_mask_store = LOOP_VINFO_HAS_MASK_STORE (loop_vinfo);
>       delete loop_vinfo;
>       if (has_mask_store
>           && targetm.vectorize.empty_mask_is_expensive (IFN_MASK_STORE))
>         optimize_mask_stores (loop);
>       loop->aux = NULL;
>     }
>

Ah thanks, I'll make the changes.

Thanks,
Tamar

>
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >     * tree-vectorizer.c (vectorize_loops): Do local CSE through RPVN upon
> >     successful vectorization.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> > index 4712dc6e7f907637774482a71036a0bd381c2bd2..1e370d60fb19b03c3b6bce45c660af4b6d32dc51 100644
> > --- a/gcc/tree-vectorizer.c
> > +++ b/gcc/tree-vectorizer.c
> > @@ -81,7 +81,7 @@ along with GCC; see the file COPYING3. If not see
> >  #include "gimple-pretty-print.h"
> >  #include "opt-problem.h"
> >  #include "internal-fn.h"
> > -
> > +#include "tree-ssa-sccvn.h"
> >
> >  /* Loop or bb location, with hotness information. */
> >  dump_user_location_t vect_location;
> > @@ -1323,6 +1323,27 @@ vectorize_loops (void)
> >       ??? Also while we try hard to update loop-closed SSA form we fail
> >       to properly do this in some corner-cases (see PR56286). */
> >    rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa_only_virtuals);
> > +
> > +  for (i = 1; i < number_of_loops (cfun); i++)
> > +    {
> > +      loop = get_loop (cfun, i);
> > +      if (!loop || !single_exit (loop))
> > +        continue;
> > +
> > +      bitmap exit_bbs;
> > +      /* Perform local CSE, this esp. helps because we emit code for
> > +         predicates that need to be shared for optimal predicate usage.
> > +         However reassoc will re-order them and prevent CSE from working
> > +         as it should. CSE only the loop body, not the entry. */
> > +      exit_bbs = BITMAP_ALLOC (NULL);
> > +      bitmap_set_bit (exit_bbs, single_exit (loop)->dest->index);
> > +      bitmap_set_bit (exit_bbs, loop->latch->index);
> > +
> > +      do_rpo_vn (cfun, loop_preheader_edge (loop), exit_bbs);
> > +
> > +      BITMAP_FREE (exit_bbs);
> > +    }
> > +
> >    return TODO_cleanup_cfg;
> >  }
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)