From: Tamar Christina
To: Richard Biener
CC: Richard Sandiford, Tamar Christina via Gcc-patches, nd, "juzhe.zhong@rivai.ai"
Subject: RE: [PATCH]middle-end Add optimized float addsub without needing VEC_PERM_EXPR.
Date: Fri, 23 Sep 2022 13:13:53 +0000

> -----Original Message-----
> From: Richard Biener
> Sent: Friday, September 23, 2022 8:54 AM
> To: Tamar Christina
> Cc: Richard Sandiford; Tamar Christina via Gcc-patches; nd;
> juzhe.zhong@rivai.ai
> Subject: RE: [PATCH]middle-end Add optimized float addsub without
> needing VEC_PERM_EXPR.
>
> On Fri, 23 Sep 2022, Tamar Christina wrote:
>
> > Hi,
> >
> > Attached is the respun version of the patch,
> >
> > > >>
> > > >> Wouldn't a target need to re-check if lanes are NaN or denormal
> > > >> if after a SFmode lane operation a DFmode lane operation follows?
> > > >> IIRC that is what usually makes punning "integer" vectors as FP vectors
> > > >> costly.
> >
> > I don't believe this is a problem, due to NANs not being a single
> > value and according to the standard the sign bit doesn't change the
> > meaning of a NAN.
> >
> > That's why specifically for negates generally no check is performed
> > and it's Assumed that if a value is a NaN going in, it's a NaN coming
> > out, and this Optimization doesn't change that. Also under fast-math
> > we don't guarantee a stable representation for NaN (or zeros, etc) afaik.
> >
> > So if that is still a concern I could add && !HONORS_NAN () to the
> > constraints.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >     * match.pd: Add fneg/fadd rule.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     * gcc.target/aarch64/simd/addsub_1.c: New test.
> >     * gcc.target/aarch64/sve/addsub_1.c: New test.
> >
> > --- inline version of patch ---
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 1bb936fc4010f98f24bb97671350e8432c55b347..2617d56091dfbd41ae49f980ee0af3757f5ec1cf 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -7916,6 +7916,59 @@ and,
> >  (simplify (reduc (op @0 VECTOR_CST@1))
> >    (op (reduc:type @0) (reduc:type @1))))
> >
> > +/* Simplify vector floating point operations of alternating sub/add pairs
> > +   into using an fneg of a wider element type followed by a normal add.
> > +   under IEEE 754 the fneg of the wider type will negate every even entry
> > +   and when doing an add we get a sub of the even and add of every odd
> > +   elements.  */
> > +(simplify
> > + (vec_perm (plus:c @0 @1) (minus @0 @1) VECTOR_CST@2)
> > + (if (!VECTOR_INTEGER_TYPE_P (type) && !BYTES_BIG_ENDIAN)
>
> shouldn't this be FLOAT_WORDS_BIG_ENDIAN instead?
>
> I'm still concerned what
>
>  (neg:V2DF (subreg:V2DF (reg:V4SF) 0))
>
> means for architectures like RISC-V.  Can one "reformat" FP values in vector
> registers so that two floats overlap a double (and then back)?
>
> I suppose you rely on target_can_change_mode_class to tell you that.

Indeed, the documentation says:

"This hook returns true if it is possible to bitcast values held in registers of class rclass
from mode from to mode to and if doing so preserves the low-order bits that are common to both modes.
The result is only meaningful if rclass has registers that can hold both from and to."

This implies to me that if the bitcast shouldn't be possible the hook should reject it.
Of course there are always cases where something is possible but perhaps not cheap to do.

The specific implementation for RISC-V seems to imply that they disallow any FP
conversions, so this seems to be OK.

>
>
> > + (with
> > +  {
> > +   /* Build a vector of integers from the tree mask.  */
> > +   vec_perm_builder builder;
> > +   if (!tree_to_vec_perm_builder (&builder, @2))
> > +     return NULL_TREE;
> > +
> > +   /* Create a vec_perm_indices for the integer vector.  */
> > +   poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
> > +   vec_perm_indices sel (builder, 2, nelts);
> > +  }
> > +  (if (sel.series_p (0, 2, 0, 2))
> > +   (with
> > +    {
> > +      machine_mode vec_mode = TYPE_MODE (type);
> > +      auto elem_mode = GET_MODE_INNER (vec_mode);
> > +      auto nunits = exact_div (GET_MODE_NUNITS (vec_mode), 2);
> > +      tree stype;
> > +      switch (elem_mode)
> > +        {
> > +        case E_HFmode:
> > +          stype = float_type_node;
> > +          break;
> > +        case E_SFmode:
> > +          stype = double_type_node;
> > +          break;
> > +        default:
> > +          return NULL_TREE;
> > +        }
>
> Can't you use GET_MODE_WIDER_MODE and double-check the mode-size
> doubles?  I mean you obviously miss DFmode -> TFmode.

The problem is that I need the type, not the mode, and even build_pointer_type_for_mode
requires the new scalar type.  So I couldn't find anything to help here, given that there's
no inverse relationship between modes and types.

>
> > +      tree ntype = build_vector_type (stype, nunits);
> > +      if (!ntype)
>
> You want to check that the above results in a vector mode.

Does it? Technically you can cast a V2SF to either a V1DF or a DF, can't you?
Both seem equally valid here.

> > +        return NULL_TREE;
> > +
> > +      /* The format has to have a simple sign bit.  */
> > +      const struct real_format *fmt = FLOAT_MODE_FORMAT (vec_mode);
> > +      if (fmt == NULL)
> > +        return NULL_TREE;
> > +    }
> > +    (if (fmt->signbit_rw == GET_MODE_UNIT_BITSIZE (vec_mode) - 1
>
> shouldn't this be a check on the component mode?
> I think you'd want to
> check that the bigger format signbit_rw is equal to the smaller format mode
> size plus its signbit_rw or so?

Tbh, both are somewhat weak guarantees.  In a previous patch of mine I'd added a new field "is_ieee"
to the real formats to denote that they are an IEEE type.  Maybe I should revive that instead?

Regards,
Tamar

>
> > +         && fmt->signbit_rw == fmt->signbit_ro
> > +         && targetm.can_change_mode_class (TYPE_MODE (ntype), TYPE_MODE (type), ALL_REGS)
> > +         && (optimize_vectors_before_lowering_p ()
> > +             || target_supports_op_p (ntype, NEGATE_EXPR, optab_vector)))
> > +     (plus (view_convert:type (negate (view_convert:ntype @1))) @0)))))))
> > +
> >  (simplify
> >   (vec_perm @0 @1 VECTOR_CST@2)
> >   (with
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c b/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..1fb91a34c421bbd2894faa0dbbf1b47ad43310c4
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c
> > @@ -0,0 +1,56 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
> > +/* { dg-options "-Ofast" } */
> > +/* { dg-add-options arm_v8_2a_fp16_neon } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +#pragma GCC target "+nosve"
> > +
> > +/*
> > +** f1:
> > +** ...
> > +** fneg v[0-9]+.2d, v[0-9]+.2d
> > +** fadd v[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s
> > +** ...
> > +*/
> > +void f1 (float *restrict a, float *restrict b, float *res, int n)
> > +{
> > +   for (int i = 0; i < (n & -4); i+=2)
> > +    {
> > +      res[i+0] = a[i+0] + b[i+0];
> > +      res[i+1] = a[i+1] - b[i+1];
> > +    }
> > +}
> > +
> > +/*
> > +** d1:
> > +** ...
> > +** fneg v[0-9]+.4s, v[0-9]+.4s
> > +** fadd v[0-9]+.8h, v[0-9]+.8h, v[0-9]+.8h
> > +** ...
> > +*/
> > +void d1 (_Float16 *restrict a, _Float16 *restrict b, _Float16 *res, int n)
> > +{
> > +   for (int i = 0; i < (n & -8); i+=2)
> > +    {
> > +      res[i+0] = a[i+0] + b[i+0];
> > +      res[i+1] = a[i+1] - b[i+1];
> > +    }
> > +}
> > +
> > +/*
> > +** e1:
> > +** ...
> > +** fadd v[0-9]+.2d, v[0-9]+.2d, v[0-9]+.2d
> > +** fsub v[0-9]+.2d, v[0-9]+.2d, v[0-9]+.2d
> > +** ins v[0-9]+.d\[1\], v[0-9]+.d\[1\]
> > +** ...
> > +*/
> > +void e1 (double *restrict a, double *restrict b, double *res, int n)
> > +{
> > +   for (int i = 0; i < (n & -4); i+=2)
> > +    {
> > +      res[i+0] = a[i+0] + b[i+0];
> > +      res[i+1] = a[i+1] - b[i+1];
> > +    }
> > +}
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/addsub_1.c b/gcc/testsuite/gcc.target/aarch64/sve/addsub_1.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..ea7f9d9db2c8c9a3efe5c7951a314a29b7a7a922
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/addsub_1.c
> > @@ -0,0 +1,52 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-Ofast" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +/*
> > +** f1:
> > +** ...
> > +** fneg z[0-9]+.d, p[0-9]+/m, z[0-9]+.d
> > +** fadd z[0-9]+.s, z[0-9]+.s, z[0-9]+.s
> > +** ...
> > +*/
> > +void f1 (float *restrict a, float *restrict b, float *res, int n)
> > +{
> > +   for (int i = 0; i < (n & -4); i+=2)
> > +    {
> > +      res[i+0] = a[i+0] + b[i+0];
> > +      res[i+1] = a[i+1] - b[i+1];
> > +    }
> > +}
> > +
> > +/*
> > +** d1:
> > +** ...
> > +** fneg z[0-9]+.s, p[0-9]+/m, z[0-9]+.s
> > +** fadd z[0-9]+.h, z[0-9]+.h, z[0-9]+.h
> > +** ...
> > +*/
> > +void d1 (_Float16 *restrict a, _Float16 *restrict b, _Float16 *res, int n)
> > +{
> > +   for (int i = 0; i < (n & -8); i+=2)
> > +    {
> > +      res[i+0] = a[i+0] + b[i+0];
> > +      res[i+1] = a[i+1] - b[i+1];
> > +    }
> > +}
> > +
> > +/*
> > +** e1:
> > +** ...
> > +** fsub z[0-9]+.d, z[0-9]+.d, z[0-9]+.d
> > +** movprfx z[0-9]+.d, p[0-9]+/m, z[0-9]+.d
> > +** fadd z[0-9]+.d, p[0-9]+/m, z[0-9]+.d, z[0-9]+.d
> > +** ...
> > +*/
> > +void e1 (double *restrict a, double *restrict b, double *res, int n)
> > +{
> > +   for (int i = 0; i < (n & -4); i+=2)
> > +    {
> > +      res[i+0] = a[i+0] + b[i+0];
> > +      res[i+1] = a[i+1] - b[i+1];
> > +    }
> > +}
> >
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> Boudien Moerman; HRB 36809 (AG Nuernberg)
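
For anyone following the thread without the patch applied, here is a minimal scalar sketch in C of the bit-level idea behind the match.pd rule. It is not GCC code: the helper names negate_wide_lanes and addsub_via_wide_neg and the fixed length N are hypothetical, and it assumes IEEE 754 binary32 floats. Flipping the top bit of each 64-bit lane that overlays a pair of floats plays the role of the fneg on the wider element type, and a plain add then gives alternating add/sub.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { N = 4 };  /* illustrative vector length, must be even */

/* Hypothetical helper: treat each pair of floats as one 64-bit lane and
   flip that lane's top bit, mimicking an fneg on the wider element type.  */
static void negate_wide_lanes (float v[N])
{
  assert (sizeof (uint64_t) == 2 * sizeof (float));
  for (int i = 0; i < N; i += 2)
    {
      uint64_t lane;
      memcpy (&lane, &v[i], sizeof lane);   /* view two floats as one lane */
      lane ^= UINT64_C (1) << 63;           /* sign bit of the wide lane */
      memcpy (&v[i], &lane, sizeof lane);   /* view it as two floats again */
    }
}

/* Hypothetical helper: res = a + negate_wide_lanes (b).  On a little-endian
   host the flipped bit is the sign of the odd-indexed (0-based) float, so
   this computes a[i] + b[i] for even i and a[i] - b[i] for odd i.  */
static void addsub_via_wide_neg (const float a[N], const float b[N],
                                 float res[N])
{
  float nb[N];
  memcpy (nb, b, sizeof nb);
  negate_wide_lanes (nb);
  for (int i = 0; i < N; i++)
    res[i] = a[i] + nb[i];
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, a little-endian host should produce {11, -18, 33, -36}. On a big-endian host the flipped bit lands in the even-indexed float instead, which is presumably why the rule is guarded by an endianness check.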