From: Tamar Christina
To: Tamar Christina, Richard Biener
CC: Richard Sandiford, nd, Tamar Christina via Gcc-patches, "juzhe.zhong@rivai.ai"
Subject: RE: [PATCH]middle-end Add optimized float addsub without needing VEC_PERM_EXPR.
Date: Fri, 23 Sep 2022 13:54:53 +0000
References: <1C4185AB-6EE6-4B8B-838C-465098DAFD3B@suse.de> <997q6no-qsqp-1oro-52sp-899sr075p4po@fhfr.qr>

> -----Original Message-----
> From: Gcc-patches bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Tamar Christina via Gcc-patches
> Sent: Friday, September 23, 2022 9:14 AM
> To: Richard Biener
> Cc: Richard Sandiford; nd; Tamar Christina via Gcc-patches;
> juzhe.zhong@rivai.ai
> Subject: RE: [PATCH]middle-end Add optimized float addsub without
> needing VEC_PERM_EXPR.
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Friday, September 23, 2022 8:54 AM
> > To: Tamar Christina
> > Cc: Richard Sandiford; Tamar Christina via Gcc-patches; nd;
> > juzhe.zhong@rivai.ai
> > Subject: RE: [PATCH]middle-end Add optimized float addsub without
> > needing VEC_PERM_EXPR.
> >
> > On Fri, 23 Sep 2022, Tamar Christina wrote:
> >
> > > Hi,
> > >
> > > Attached is the respun version of the patch,
> > >
> > > > >>
> > > > >> Wouldn't a target need to re-check if lanes are NaN or denormal
> > > > >> if after a SFmode lane operation a DFmode lane operation follows?
> > > > >> IIRC that is what usually makes punning "integer" vectors as FP
> > > > >> vectors costly.
> > >
> > > I don't believe this is a problem, due to NANs not being a single
> > > value and according to the standard the sign bit doesn't change the
> > > meaning of a NAN.
> > >
> > > That's why specifically for negates generally no check is performed
> > > and it's assumed that if a value is a NaN going in, it's a NaN
> > > coming out, and this optimization doesn't change that. Also under
> > > fast-math we don't guarantee a stable representation for NaN (or
> > > zeros, etc.) afaik.
> > >
> > > So if that is still a concern I could add && !HONOR_NANS () to the
> > > constraints.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >     * match.pd: Add fneg/fadd rule.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >     * gcc.target/aarch64/simd/addsub_1.c: New test.
> > >     * gcc.target/aarch64/sve/addsub_1.c: New test.
> > >
> > > --- inline version of patch ---
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index 1bb936fc4010f98f24bb97671350e8432c55b347..2617d56091dfbd41ae49f980ee0af3757f5ec1cf 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -7916,6 +7916,59 @@ and,
> > >  (simplify (reduc (op @0 VECTOR_CST@1))
> > >    (op (reduc:type @0) (reduc:type @1))))
> > >
> > > +/* Simplify vector floating point operations of alternating sub/add pairs
> > > +   into using an fneg of a wider element type followed by a normal add.
> > > +   under IEEE 754 the fneg of the wider type will negate every even entry
> > > +   and when doing an add we get a sub of the even and add of every odd
> > > +   elements.  */
> > > +(simplify
> > > + (vec_perm (plus:c @0 @1) (minus @0 @1) VECTOR_CST@2)
> > > + (if (!VECTOR_INTEGER_TYPE_P (type) && !BYTES_BIG_ENDIAN)
> >
> > shouldn't this be FLOAT_WORDS_BIG_ENDIAN instead?
> >
> > I'm still concerned what
> >
> >   (neg:V2DF (subreg:V2DF (reg:V4SF) 0))
> >
> > means for architectures like RISC-V. Can one "reformat" FP values in
> > vector registers so that two floats overlap a double (and then back)?
> >
> > I suppose you rely on target_can_change_mode_class to tell you that.
>
> Indeed, the documentation says:
>
> "This hook returns true if it is possible to bitcast values held in registers of
> class rclass from mode from to mode to and if doing so preserves the low-
> order bits that are common to both modes. The result is only meaningful if
> rclass has registers that can hold both from and to."
>
> This implies to me that if the bitcast shouldn't be possible the hook should
> reject it.
> Of course you always have cases where something is possible, but perhaps
> not cheap to do.
>
> The specific implementation for RISC-V seems to imply to me that they
> disallow any FP conversions. So it seems to be ok.
>
> >
> > > +  (with
> > > +   {
> > > +     /* Build a vector of integers from the tree mask.  */
> > > +     vec_perm_builder builder;
> > > +     if (!tree_to_vec_perm_builder (&builder, @2))
> > > +       return NULL_TREE;
> > > +
> > > +     /* Create a vec_perm_indices for the integer vector.  */
> > > +     poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
> > > +     vec_perm_indices sel (builder, 2, nelts);
> > > +   }
> > > +   (if (sel.series_p (0, 2, 0, 2))
> > > +    (with
> > > +     {
> > > +       machine_mode vec_mode = TYPE_MODE (type);
> > > +       auto elem_mode = GET_MODE_INNER (vec_mode);
> > > +       auto nunits = exact_div (GET_MODE_NUNITS (vec_mode), 2);
> > > +       tree stype;
> > > +       switch (elem_mode)
> > > +         {
> > > +         case E_HFmode:
> > > +           stype = float_type_node;
> > > +           break;
> > > +         case E_SFmode:
> > > +           stype = double_type_node;
> > > +           break;
> > > +         default:
> > > +           return NULL_TREE;
> > > +         }
> >
> > Can't you use GET_MODE_WIDER_MODE and double-check the mode-size
> > doubles? I mean you obviously miss DFmode -> TFmode.
>
> Problem is I need the type, not the mode, but even
> build_pointer_type_for_mode requires the new scalar type. So I couldn't
> find anything to help here given that there's no inverse relationship between
> modes and types.
>

Correction to my reply above: I meant build_vector_type_for_mode here, not
build_pointer_type_for_mode.

> >
> > > +       tree ntype = build_vector_type (stype, nunits);
> > > +       if (!ntype)
> >
> > You want to check that the above results in a vector mode.
>
> Does it? Technically you can cast a V2SF to either a V1DF or a DF, can't you?
> Both seem equally valid here.
>
> > > +         return NULL_TREE;
> > > +
> > > +       /* The format has to have a simple sign bit.  */
> > > +       const struct real_format *fmt = FLOAT_MODE_FORMAT (vec_mode);
> > > +       if (fmt == NULL)
> > > +         return NULL_TREE;
> > > +     }
> > > +     (if (fmt->signbit_rw == GET_MODE_UNIT_BITSIZE (vec_mode) - 1
> >
> > shouldn't this be a check on the component mode? I think you'd want
> > to check that the bigger format signbit_rw is equal to the smaller
> > format mode size plus its signbit_rw or so?
>
> Tbh, both are somewhat weak guarantees. In a previous patch of mine I'd
> added a new field "is_ieee" to the real formats to denote that they are
> an IEEE type. Maybe I should revive that instead?
>
> Regards,
> Tamar
>
> >
> > > +          && fmt->signbit_rw == fmt->signbit_ro
> > > +          && targetm.can_change_mode_class (TYPE_MODE (ntype), TYPE_MODE (type), ALL_REGS)
> > > +          && (optimize_vectors_before_lowering_p ()
> > > +              || target_supports_op_p (ntype, NEGATE_EXPR, optab_vector)))
> > > +      (plus (view_convert:type (negate (view_convert:ntype @1))) @0)))))))
> > > +
> > >  (simplify
> > >   (vec_perm @0 @1 VECTOR_CST@2)
> > >   (with
> > > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c b/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..1fb91a34c421bbd2894faa0dbbf1b47ad43310c4
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c
> > > @@ -0,0 +1,56 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
> > > +/* { dg-options "-Ofast" } */
> > > +/* { dg-add-options arm_v8_2a_fp16_neon } */
> > > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > > +
> > > +#pragma GCC target "+nosve"
> > > +
> > > +/*
> > > +** f1:
> > > +** ...
> > > +** fneg v[0-9]+.2d, v[0-9]+.2d
> > > +** fadd v[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s
> > > +** ...
> > > +*/
> > > +void f1 (float *restrict a, float *restrict b, float *res, int n) {
> > > +   for (int i = 0; i < (n & -4); i+=2)
> > > +    {
> > > +      res[i+0] = a[i+0] + b[i+0];
> > > +      res[i+1] = a[i+1] - b[i+1];
> > > +    }
> > > +}
> > > +
> > > +/*
> > > +** d1:
> > > +** ...
> > > +** fneg v[0-9]+.4s, v[0-9]+.4s
> > > +** fadd v[0-9]+.8h, v[0-9]+.8h, v[0-9]+.8h
> > > +** ...
> > > +*/
> > > +void d1 (_Float16 *restrict a, _Float16 *restrict b, _Float16 *res, int n) {
> > > +   for (int i = 0; i < (n & -8); i+=2)
> > > +    {
> > > +      res[i+0] = a[i+0] + b[i+0];
> > > +      res[i+1] = a[i+1] - b[i+1];
> > > +    }
> > > +}
> > > +
> > > +/*
> > > +** e1:
> > > +** ...
> > > +** fadd v[0-9]+.2d, v[0-9]+.2d, v[0-9]+.2d
> > > +** fsub v[0-9]+.2d, v[0-9]+.2d, v[0-9]+.2d
> > > +** ins v[0-9]+.d\[1\], v[0-9]+.d\[1\]
> > > +** ...
> > > +*/
> > > +void e1 (double *restrict a, double *restrict b, double *res, int n) {
> > > +   for (int i = 0; i < (n & -4); i+=2)
> > > +    {
> > > +      res[i+0] = a[i+0] + b[i+0];
> > > +      res[i+1] = a[i+1] - b[i+1];
> > > +    }
> > > +}
> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/addsub_1.c b/gcc/testsuite/gcc.target/aarch64/sve/addsub_1.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..ea7f9d9db2c8c9a3efe5c7951a314a29b7a7a922
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/addsub_1.c
> > > @@ -0,0 +1,52 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-Ofast" } */
> > > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > > +
> > > +/*
> > > +** f1:
> > > +** ...
> > > +** fneg z[0-9]+.d, p[0-9]+/m, z[0-9]+.d
> > > +** fadd z[0-9]+.s, z[0-9]+.s, z[0-9]+.s
> > > +** ...
> > > +*/
> > > +void f1 (float *restrict a, float *restrict b, float *res, int n) {
> > > +   for (int i = 0; i < (n & -4); i+=2)
> > > +    {
> > > +      res[i+0] = a[i+0] + b[i+0];
> > > +      res[i+1] = a[i+1] - b[i+1];
> > > +    }
> > > +}
> > > +
> > > +/*
> > > +** d1:
> > > +** ...
> > > +** fneg z[0-9]+.s, p[0-9]+/m, z[0-9]+.s
> > > +** fadd z[0-9]+.h, z[0-9]+.h, z[0-9]+.h
> > > +** ...
> > > +*/
> > > +void d1 (_Float16 *restrict a, _Float16 *restrict b, _Float16 *res, int n) {
> > > +   for (int i = 0; i < (n & -8); i+=2)
> > > +    {
> > > +      res[i+0] = a[i+0] + b[i+0];
> > > +      res[i+1] = a[i+1] - b[i+1];
> > > +    }
> > > +}
> > > +
> > > +/*
> > > +** e1:
> > > +** ...
> > > +** fsub z[0-9]+.d, z[0-9]+.d, z[0-9]+.d
> > > +** movprfx z[0-9]+.d, p[0-9]+/m, z[0-9]+.d
> > > +** fadd z[0-9]+.d, p[0-9]+/m, z[0-9]+.d, z[0-9]+.d
> > > +** ...
> > > +*/
> > > +void e1 (double *restrict a, double *restrict b, double *res, int n) {
> > > +   for (int i = 0; i < (n & -4); i+=2)
> > > +    {
> > > +      res[i+0] = a[i+0] + b[i+0];
> > > +      res[i+1] = a[i+1] - b[i+1];
> > > +    }
> > > +}
> > >
> > >
> >
> > --
> > Richard Biener
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > Boudien Moerman; HRB 36809 (AG Nuernberg)
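
For anyone following the thread who wants to see the lane arithmetic concretely, here is a rough scalar model of what the rule relies on. It is only a sketch, not the match.pd implementation: the helper names (host_is_little_endian, addsub_reference, addsub_wide_negate) are made up for illustration, it assumes float is a 32-bit IEEE 754 type, and it models the wider fneg as an XOR of bit 63 rather than a real double negate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the host stores the least
   significant byte first.  */
static int host_is_little_endian (void)
{
  const uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* Reference semantics of the loops in the tests: add in lane i,
   subtract in lane i+1.  */
static void addsub_reference (const float *a, const float *b,
                              float *res, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n; i += 2)
    {
      res[i] = a[i] + b[i];
      res[i + 1] = a[i + 1] - b[i + 1];
    }
}

/* Model of the rewrite: treat each pair of floats in B as one 64-bit
   lane, flip only that lane's top bit (standing in for the wider fneg),
   then do a plain add on every float lane.  Assumes 32-bit floats.  */
static void addsub_wide_negate (const float *a, const float *b,
                                float *res, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n; i += 2)
    {
      uint64_t lane;
      float neg[2];
      memcpy (&lane, &b[i], sizeof lane);
      lane ^= UINT64_C (1) << 63;       /* sign bit of the 64-bit lane */
      memcpy (neg, &lane, sizeof lane);
      res[i] = a[i] + neg[0];
      res[i + 1] = a[i + 1] + neg[1];
    }
}
```

On a little-endian host bit 63 of the 64-bit lane is the sign bit of the float at index i+1, so the two functions should agree there, which matches the res[i+1] = a[i+1] - b[i+1] shape in the tests. On a big-endian host the flipped bit belongs to the float at index i instead, which is presumably why the rule is guarded on endianness.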