From: Tamar Christina
To: Richard Biener
CC: Andrew Pinski, gcc-patches@gcc.gnu.org, nd, jlaw@ventanamicro.com
Subject: RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]
Date: Wed, 27 Sep 2023 07:56:43 +0000

> -----Original Message-----
> From: Richard Biener
> Sent: Wednesday, September 27, 2023 8:12 AM
> To: Tamar Christina
> Cc: Andrew Pinski; gcc-patches@gcc.gnu.org; nd; jlaw@ventanamicro.com
> Subject: RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to
> x | (1 << signbit(x)) [PR109154]
>
> On Wed, 27 Sep 2023, Tamar Christina wrote:
>
> > > -----Original Message-----
> > > From: Andrew Pinski
> > > Sent: Wednesday, September 27, 2023 2:17 AM
> > > To: Tamar Christina
> > > Cc: gcc-patches@gcc.gnu.org; nd; rguenther@suse.de;
> > > jlaw@ventanamicro.com
> > > Subject: Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to
> > > x | (1 << signbit(x)) [PR109154]
> > >
> > > On Tue, Sep 26, 2023 at 5:51 PM Tamar Christina wrote:
> > > >
> > > > Hi All,
> > > >
> > > > For targets that allow conversion between int and float modes this
> > > > adds a new optimization transforming fneg (fabs (x)) into
> > > > x | (1 << signbit(x)).  Such sequences are common in scientific
> > > > code working with gradients.
> > > >
> > > > If the target has an inclusive-OR that takes an immediate, the
> > > > transformed sequence is both shorter and faster.  For targets that
> > > > don't, the immediate has to be constructed separately, but this
> > > > still ends up being faster as the immediate construction is not on
> > > > the critical path.
> > > >
> > > > Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> > > >
> > > > Ok for master?
> > >
> > > I think this should be part of isel instead of match.
> > > Maybe we could use genmatch to generate the code that does the
> > > transformations but this does not belong as part of match really.
> >
> > I disagree.. I don't think this belongs in isel. Isel is for
> > structural transformations.
> > If there is a case for something else I'd imagine backwardprop is a
> > better choice.
> >
> > But I don't see why it doesn't belong here considering it *is* a
> > mathematical optimization and the file has plenty of transformations
> > such as mask optimizations and vector conditional rewriting.
>
> But the mathematical transform would more generally be fneg (fabs (x)) ->
> copysign (x, -1.) and that can be optimally expanded at RTL expansion time?

Ah sure, atm I did copysign (x, -1) -> x | 1 << signbits.  I can do it the
other way around.  And I guess since copysign (-x, y), copysign(|x|, y) ->
copysign (x, y) that should solve the trigonometry problem too.

Cool will do that instead, thanks!
Tamar

>
> Richard.
>
> > Regards,
> > Tamar
> >
> > >
> > > Thanks,
> > > Andrew
> > >
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR tree-optimization/109154
> > > >         * match.pd: Add new neg+abs rule.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         PR tree-optimization/109154
> > > >         * gcc.target/aarch64/fneg-abs_1.c: New test.
> > > >         * gcc.target/aarch64/fneg-abs_2.c: New test.
> > > >         * gcc.target/aarch64/fneg-abs_3.c: New test.
> > > >         * gcc.target/aarch64/fneg-abs_4.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.
> > > >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > index 39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d694826cffad0fb17e1136600a 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -9476,3 +9476,57 @@ and,
> > > >     }
> > > >     (if (full_perm_p)
> > > >      (vec_perm (op@3 @0 @1) @3 @2))))))
> > > > +
> > > > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */
> > > > +
> > > > +(simplify
> > > > + (negate (abs @0))
> > > > + (if (FLOAT_TYPE_P (type)
> > > > +      /* We have to delay this rewriting till after forward prop because
> > > > +         otherwise it's harder to do trigonometry optimizations.  e.g.
> > > > +         cos(-fabs(x)) is not matched in one go.  Instead cos (-x) is
> > > > +         matched first followed by cos(|x|).  The bottom op approach makes
> > > > +         this rule match first and it's not until fwdprop that we match
> > > > +         top down.  There are many such simplifications so we delay this
> > > > +         optimization till later on.  */
> > > > +      && canonicalize_math_after_vectorization_p ())
> > > > +  (with {
> > > > +    tree itype = unsigned_type_for (type);
> > > > +    machine_mode mode = TYPE_MODE (type);
> > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);
> > > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }
> > > > +   (if (float_fmt
> > > > +        && float_fmt->signbit_rw >= 0
> > > > +        && targetm.can_change_mode_class (TYPE_MODE (itype),
> > > > +                                          TYPE_MODE (type), ALL_REGS)
> > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))
> > > > +    (with { wide_int wone = wi::one (element_precision (type));
> > > > +            int sbit = float_fmt->signbit_rw;
> > > > +            auto stype = VECTOR_TYPE_P (type) ? TREE_TYPE (itype) : itype;
> > > > +            tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}
> > > > +     (view_convert:type
> > > > +      (bit_ior (view_convert:itype @0)
> > > > +               { build_uniform_cst (itype, sign_bit); } )))))))
> > > > +
> > > > +/* Repeat the same but for conditional negate.  */
> > > > +
> > > > +(simplify
> > > > + (IFN_COND_NEG @1 (abs @0) @2)
> > > > + (if (FLOAT_TYPE_P (type))
> > > > +  (with {
> > > > +    tree itype = unsigned_type_for (type);
> > > > +    machine_mode mode = TYPE_MODE (type);
> > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);
> > > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }
> > > > +   (if (float_fmt
> > > > +        && float_fmt->signbit_rw >= 0
> > > > +        && targetm.can_change_mode_class (TYPE_MODE (itype),
> > > > +                                          TYPE_MODE (type), ALL_REGS)
> > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))
> > > > +    (with { wide_int wone = wi::one (element_precision (type));
> > > > +            int sbit = float_fmt->signbit_rw;
> > > > +            auto stype = VECTOR_TYPE_P (type) ? TREE_TYPE (itype) : itype;
> > > > +            tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}
> > > > +     (view_convert:type
> > > > +      (IFN_COND_IOR @1 (view_convert:itype @0)
> > > > +               { build_uniform_cst (itype, sign_bit); }
> > > > +               (view_convert:itype @2) )))))))
> > > > \ No newline at end of file
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c
> > > > @@ -0,0 +1,39 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** t1:
> > > > +** orr v[0-9]+.2s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x2_t t1 (float32x2_t a)
> > > > +{
> > > > +  return vneg_f32 (vabs_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t2:
> > > > +** orr v[0-9]+.4s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x4_t t2 (float32x4_t a)
> > > > +{
> > > > +  return vnegq_f32 (vabsq_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t3:
> > > > +** adrp x0, .LC[0-9]+
> > > > +** ldr q[0-9]+, \[x0, #:lo12:.LC0\]
> > > > +** orr v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > +** ret
> > > > +*/
> > > > +float64x2_t t3 (float64x2_t a)
> > > > +{
> > > > +  return vnegq_f64 (vabsq_f64 (a));
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> > > > @@ -0,0 +1,31 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float32_t f1 (float32_t a)
> > > > +{
> > > > +  return -fabsf (a);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float64_t f2 (float64_t a)
> > > > +{
> > > > +  return -fabs (a);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c
> > > > @@ -0,0 +1,36 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** ...
> > > > +** ldr q[0-9]+, \[x0\]
> > > > +** orr v[0-9]+.4s, #128, lsl #24
> > > > +** str q[0-9]+, \[x0\], 16
> > > > +** ...
> > > > +*/
> > > > +void f1 (float32_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabsf (a[i]);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** ...
> > > > +** ldr q[0-9]+, \[x0\]
> > > > +** orr v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > +** str q[0-9]+, \[x0\], 16
> > > > +** ...
> > > > +*/
> > > > +void f2 (float64_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabs (a[i]);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> > > > @@ -0,0 +1,39 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#pragma GCC target "+nosve"
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** negabs:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +double negabs (double x)
> > > > +{
> > > > +  unsigned long long y;
> > > > +  memcpy (&y, &x, sizeof(double));
> > > > +  y = y | (1UL << 63);
> > > > +  memcpy (&x, &y, sizeof(double));
> > > > +  return x;
> > > > +}
> > > > +
> > > > +/*
> > > > +** negabsf:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float negabsf (float x)
> > > > +{
> > > > +  unsigned int y;
> > > > +  memcpy (&y, &x, sizeof(float));
> > > > +  y = y | (1U << 31);
> > > > +  memcpy (&x, &y, sizeof(float));
> > > > +  return x;
> > > > +}
> > > > +
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> > > > @@ -0,0 +1,37 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** t1:
> > > > +** orr v[0-9]+.2s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x2_t t1 (float32x2_t a)
> > > > +{
> > > > +  return vneg_f32 (vabs_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t2:
> > > > +** orr v[0-9]+.4s, #128, lsl #24
> > > > +** ret
> > > > +*/
> > > > +float32x4_t t2 (float32x4_t a)
> > > > +{
> > > > +  return vnegq_f32 (vabsq_f32 (a));
> > > > +}
> > > > +
> > > > +/*
> > > > +** t3:
> > > > +** adrp x0, .LC[0-9]+
> > > > +** ldr q[0-9]+, \[x0, #:lo12:.LC0\]
> > > > +** orr v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > +** ret
> > > > +*/
> > > > +float64x2_t t3 (float64x2_t a)
> > > > +{
> > > > +  return vnegq_f64 (vabsq_f64 (a));
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> > > > @@ -0,0 +1,29 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float32_t f1 (float32_t a)
> > > > +{
> > > > +  return -fabsf (a);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float64_t f2 (float64_t a)
> > > > +{
> > > > +  return -fabs (a);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> > > > @@ -0,0 +1,34 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +#include
> > > > +
> > > > +/*
> > > > +** f1:
> > > > +** ...
> > > > +** ld1w z[0-9]+.s, p[0-9]+/z, \[x0, x2, lsl 2\]
> > > > +** orr z[0-9]+.s, z[0-9]+.s, #0x80000000
> > > > +** st1w z[0-9]+.s, p[0-9]+, \[x0, x2, lsl 2\]
> > > > +** ...
> > > > +*/
> > > > +void f1 (float32_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabsf (a[i]);
> > > > +}
> > > > +
> > > > +/*
> > > > +** f2:
> > > > +** ...
> > > > +** ld1d z[0-9]+.d, p[0-9]+/z, \[x0, x2, lsl 3\]
> > > > +** orr z[0-9]+.d, z[0-9]+.d, #0x8000000000000000
> > > > +** st1d z[0-9]+.d, p[0-9]+, \[x0, x2, lsl 3\]
> > > > +** ...
> > > > +*/
> > > > +void f2 (float64_t *a, int n)
> > > > +{
> > > > +  for (int i = 0; i < (n & -8); i++)
> > > > +    a[i] = -fabs (a[i]);
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> > > > new file mode 100644
> > > > index 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> > > > @@ -0,0 +1,37 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O3" } */
> > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > +
> > > > +#include
> > > > +
> > > > +/*
> > > > +** negabs:
> > > > +** mov x0, -9223372036854775808
> > > > +** fmov d[0-9]+, x0
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +double negabs (double x)
> > > > +{
> > > > +  unsigned long long y;
> > > > +  memcpy (&y, &x, sizeof(double));
> > > > +  y = y | (1UL << 63);
> > > > +  memcpy (&x, &y, sizeof(double));
> > > > +  return x;
> > > > +}
> > > > +
> > > > +/*
> > > > +** negabsf:
> > > > +** movi v[0-9]+.2s, 0x80, lsl 24
> > > > +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > +** ret
> > > > +*/
> > > > +float negabsf (float x)
> > > > +{
> > > > +  unsigned int y;
> > > > +  memcpy (&y, &x, sizeof(float));
> > > > +  y = y | (1U << 31);
> > > > +  memcpy (&x, &y, sizeof(float));
> > > > +  return x;
> > > > +}
> > > > +
> > > >
> > > >
> > > > --
> > >
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
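
For reference, a minimal standalone sketch of the equivalence discussed in
this thread: -fabsf (x), copysignf (x, -1.0f) and "OR in the sign bit"
should all yield the same value.  It assumes float is IEEE 754 binary32 with
the sign in bit 31, which is what the negabsf test above also relies on.
The helper names (float_bits, negabs_arith, negabs_copysign, negabs_bitor,
check_negabs) are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern.
   Assumption: float is IEEE 754 binary32.  */
static uint32_t float_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Form 1: the source-level idiom the patch targets.  */
static float negabs_arith (float x)
{
  return -fabsf (x);
}

/* Form 2: the copysign form suggested in the review.  */
static float negabs_copysign (float x)
{
  return copysignf (x, -1.0f);
}

/* Form 3: force the sign bit on with an integer OR.
   Assumption: the sign lives in bit 31.  */
static float negabs_bitor (float x)
{
  uint32_t u = float_bits (x) | (UINT32_C (1) << 31);
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Hypothetical self-check: all three forms must agree bit for bit.
   NaN inputs are deliberately left out of scope here.  */
static void check_negabs (float x)
{
  uint32_t expected = float_bits (negabs_arith (x));
  assert (float_bits (negabs_copysign (x)) == expected);
  assert (float_bits (negabs_bitor (x)) == expected);
}
```

Calling check_negabs on a few ordinary values, both zeros and an infinity
from a small driver is enough to see the three forms agree.  Comparing bit
patterns rather than values keeps -0.0 and +0.0 distinct, which is the case
where the sign bit is the only difference.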