From: Tamar Christina
To: Richard Sandiford
CC: "gcc-patches@gcc.gnu.org", nd, Richard Earnshaw, Marcus Shawcroft, Kyrylo Tkachov
Subject: RE: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable.
Date: Tue, 1 Nov 2022 15:11:57 +0000

> -----Original Message-----
> From: Richard Sandiford
> Sent: Tuesday, November 1, 2022 2:59 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft; Kyrylo Tkachov
> Subject: Re: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable.
>
> Tamar Christina writes:
> > Hi All,
> >
> > The backend has an existing V2HFmode that is used by pairwise operations.
> > This mode was however never made fully functional. Amongst other
> > things it was never declared as a vector type which made it unusable from
> > the mid-end.
> >
> > It's also lacking an implementation for load/stores so reload ICEs if
> > this mode is ever used. This finishes the implementation by providing the
> > above.
> >
> > Note that I have created a new iterator VHSDF_P instead of extending
> > VHSDF because the previous iterator is used in far more things than just
> > load/stores.
> >
> > It's also used for instance in intrinsics and extending this would
> > force me to provide support for mangling the type while we never
> > expose it through intrinsics.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): New.
> > (mov, movmisalign, aarch64_dup_lane,
> > aarch64_store_lane0, aarch64_simd_vec_set,
> > @aarch64_simd_vec_copy_lane, vec_set,
> > reduc__scal_, reduc__scal_,
> > aarch64_reduc__internal, aarch64_get_lane,
> > vec_init, vec_extract): Support V2HF.
> > * config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
> > Add E_V2HFmode.
> > * config/aarch64/iterators.md (VHSDF_P): New.
> > (V2F, VALL_F16_FULL, nunits, Vtype, Vmtype, Vetype, stype, VEL,
> > Vel, q, vp): Add V2HF.
> > * config/arm/types.md (neon_fp_reduc_add_h): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/sve/slp_1.c: Update testcase.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> > index 25aed74f8cf939562ed65a578fe32ca76605b58a..93a2888f567460ad10ec050ea7d4f701df4729d1 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -19,10 +19,10 @@
> > ;; .
> >
> > (define_expand "mov"
> > - [(set (match_operand:VALL_F16 0 "nonimmediate_operand")
> > - (match_operand:VALL_F16 1 "general_operand"))]
> > + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand")
> > + (match_operand:VALL_F16_FULL 1 "general_operand"))]
> > "TARGET_SIMD"
> > - "
> > +{
> > /* Force the operand into a register if it is not an
> > immediate whose use can be replaced with xzr.
> > If the mode is 16 bytes wide, then we will be doing
> > @@ -46,12 +46,11 @@ (define_expand "mov"
> > aarch64_expand_vector_init (operands[0], operands[1]);
> > DONE;
> > }
> > - "
> > -)
> > +})
> >
> > (define_expand "movmisalign"
> > - [(set (match_operand:VALL_F16 0 "nonimmediate_operand")
> > - (match_operand:VALL_F16 1 "general_operand"))]
> > + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand")
> > + (match_operand:VALL_F16_FULL 1 "general_operand"))]
> > "TARGET_SIMD && !STRICT_ALIGNMENT"
> > {
> > /* This pattern is not permitted to fail during expansion: if both arguments
> > @@ -85,10 +84,10 @@ (define_insn "aarch64_simd_dup"
> > )
> >
> > (define_insn "aarch64_dup_lane"
> > - [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> > - (vec_duplicate:VALL_F16
> > + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w")
> > + (vec_duplicate:VALL_F16_FULL
> > (vec_select:
> > - (match_operand:VALL_F16 1 "register_operand" "w")
> > + (match_operand:VALL_F16_FULL 1 "register_operand" "w")
> > (parallel [(match_operand:SI 2 "immediate_operand" "i")])
> > )))]
> > "TARGET_SIMD"
> > @@ -142,6 +141,29 @@ (define_insn "*aarch64_simd_mov"
> > mov_reg, neon_move")]
> > )
> >
> > +(define_insn "*aarch64_simd_movv2hf"
> > + [(set (match_operand:V2HF 0 "nonimmediate_operand"
> > + "=w, m, m, w, ?r, ?w, ?r, w, w")
> > + (match_operand:V2HF 1 "general_operand"
> > + "m, Dz, w, w, w, r, r, Dz, Dn"))]
> > + "TARGET_SIMD_F16INST
> > + && (register_operand (operands[0], V2HFmode)
> > + || aarch64_simd_reg_or_zero (operands[1], V2HFmode))"
> > + "@
> > + ldr\\t%s0, %1
> > + str\\twzr, %0
> > + str\\t%s1, %0
> > + mov\\t%0.2s[0], %1.2s[0]
> > + umov\\t%w0, %1.s[0]
> > + fmov\\t%s0, %1
> > + mov\\t%0, %1
> > + movi\\t%d0, 0
> > + * return aarch64_output_simd_mov_immediate (operands[1], 32);"
> > + [(set_attr "type" "neon_load1_1reg, store_8, neon_store1_1reg,\
> > + neon_logic, neon_to_gp, f_mcr,\
> > + mov_reg, neon_move, neon_move")]
> > +)
> > +
> > (define_insn "*aarch64_simd_mov"
> > [(set (match_operand:VQMOV 0 "nonimmediate_operand"
> > "=w, Umn, m, w, ?r, ?w, ?r, w")
> > @@ -182,7 +204,7 @@ (define_insn "*aarch64_simd_mov"
> >
> > (define_insn "aarch64_store_lane0"
> > [(set (match_operand: 0 "memory_operand" "=m")
> > - (vec_select: (match_operand:VALL_F16 1 "register_operand" "w")
> > + (vec_select: (match_operand:VALL_F16_FULL 1 "register_operand" "w")
> > (parallel [(match_operand 2 "const_int_operand" "n")])))]
> > "TARGET_SIMD
> > && ENDIAN_LANE_N (, INTVAL (operands[2])) == 0"
> > @@ -1035,11 +1057,11 @@ (define_insn "one_cmpl2"
> > )
> >
> > (define_insn "aarch64_simd_vec_set"
> > - [(set (match_operand:VALL_F16 0 "register_operand" "=w,w,w")
> > - (vec_merge:VALL_F16
> > - (vec_duplicate:VALL_F16
> > + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w,w,w")
> > + (vec_merge:VALL_F16_FULL
> > + (vec_duplicate:VALL_F16_FULL
> > (match_operand: 1 "aarch64_simd_nonimmediate_operand" "w,?r,Utv"))
> > - (match_operand:VALL_F16 3 "register_operand" "0,0,0")
> > + (match_operand:VALL_F16_FULL 3 "register_operand" "0,0,0")
> > (match_operand:SI 2 "immediate_operand" "i,i,i")))]
> > "TARGET_SIMD"
> > {
> > @@ -1061,14 +1083,14 @@ (define_insn "aarch64_simd_vec_set"
> > )
> >
> > (define_insn "@aarch64_simd_vec_copy_lane"
> > - [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> > - (vec_merge:VALL_F16
> > - (vec_duplicate:VALL_F16
> > + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w")
> > + (vec_merge:VALL_F16_FULL
> > + (vec_duplicate:VALL_F16_FULL
> > (vec_select:
> > - (match_operand:VALL_F16 3 "register_operand" "w")
> > + (match_operand:VALL_F16_FULL 3 "register_operand" "w")
> > (parallel
> > [(match_operand:SI 4 "immediate_operand" "i")])))
> > - (match_operand:VALL_F16 1 "register_operand" "0")
> > + (match_operand:VALL_F16_FULL 1 "register_operand" "0")
> > (match_operand:SI 2 "immediate_operand" "i")))]
> > "TARGET_SIMD"
> > {
> > @@ -1376,7 +1398,7 @@ (define_insn "vec_shr_"
> > )
> >
> > (define_expand "vec_set"
> > - [(match_operand:VALL_F16 0 "register_operand")
> > + [(match_operand:VALL_F16_FULL 0 "register_operand")
> > (match_operand: 1 "aarch64_simd_nonimmediate_operand")
> > (match_operand:SI 2 "immediate_operand")]
> > "TARGET_SIMD"
> > @@ -3503,7 +3525,7 @@ (define_insn "popcount2"
> > ;; gimple_fold'd to the IFN_REDUC_(MAX|MIN) function. (This is FP smax/smin).
> > (define_expand "reduc__scal_"
> > [(match_operand: 0 "register_operand")
> > - (unspec: [(match_operand:VHSDF 1 "register_operand")]
> > + (unspec: [(match_operand:VHSDF_P 1 "register_operand")]
> > FMAXMINV)]
> > "TARGET_SIMD"
> > {
> > @@ -3518,7 +3540,7 @@ (define_expand "reduc__scal_"
> >
> > (define_expand "reduc__scal_"
> > [(match_operand: 0 "register_operand")
> > - (unspec: [(match_operand:VHSDF 1 "register_operand")]
> > + (unspec: [(match_operand:VHSDF_P 1 "register_operand")]
> > FMAXMINNMV)]
> > "TARGET_SIMD"
> > {
> > @@ -3562,8 +3584,8 @@ (define_insn "aarch64_reduc__internalv2si"
> > )
> >
> > (define_insn "aarch64_reduc__internal"
> > - [(set (match_operand:VHSDF 0 "register_operand" "=w")
> > - (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")]
> > + [(set (match_operand:VHSDF_P 0 "register_operand" "=w")
> > + (unspec:VHSDF_P [(match_operand:VHSDF_P 1 "register_operand" "w")]
> > FMAXMINV))]
> > "TARGET_SIMD"
> > "\\t%0, %1."
> > @@ -4208,7 +4230,7 @@ (define_insn "*aarch64_get_lane_zero_extend"
> > (define_insn_and_split "aarch64_get_lane"
> > [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
> > (vec_select:
> > - (match_operand:VALL_F16 1 "register_operand" "w, w, w")
> > + (match_operand:VALL_F16_FULL 1 "register_operand" "w, w, w")
> > (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
> > "TARGET_SIMD"
> > {
> > @@ -7989,7 +8011,7 @@ (define_expand "aarch64_st1"
> > ;; Standard pattern name vec_init.
> >
> > (define_expand "vec_init"
> > - [(match_operand:VALL_F16 0 "register_operand")
> > + [(match_operand:VALL_F16_FULL 0 "register_operand")
> > (match_operand 1 "" "")]
> > "TARGET_SIMD"
> > {
> > @@ -8068,7 +8090,7 @@ (define_insn "aarch64_urecpe"
> >
> > (define_expand "vec_extract"
> > [(match_operand: 0 "aarch64_simd_nonimmediate_operand")
> > - (match_operand:VALL_F16 1 "register_operand")
> > + (match_operand:VALL_F16_FULL 1 "register_operand")
> > (match_operand:SI 2 "immediate_operand")]
> > "TARGET_SIMD"
> > {
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index f05bac713e88ea8c7feaa2367d55bd523ca66f57..1e08f8453688210afe1566092b19b59c9bdd0c97 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -3566,6 +3566,7 @@ aarch64_classify_vector_mode (machine_mode mode)
> > case E_V8BFmode:
> > case E_V4SFmode:
> > case E_V2DFmode:
> > + case E_V2HFmode:
> > return TARGET_SIMD ? VEC_ADVSIMD : 0;
> >
> > default:
> > diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> > index 37d8161a33b1c399d80be82afa67613a087389d4..1df09f7fe2eb35aed96113476541e0faa5393551 100644
> > --- a/gcc/config/aarch64/iterators.md
> > +++ b/gcc/config/aarch64/iterators.md
> > @@ -160,6 +160,10 @@ (define_mode_iterator VDQF [V2SF V4SF V2DF])
> > (define_mode_iterator VHSDF [(V4HF "TARGET_SIMD_F16INST")
> > (V8HF "TARGET_SIMD_F16INST")
> > V2SF V4SF V2DF])
> > +;; Advanced SIMD Float modes suitable for pairwise operations.
> > +(define_mode_iterator VHSDF_P [(V4HF "TARGET_SIMD_F16INST")
> > + (V8HF "TARGET_SIMD_F16INST")
> > + V2SF V4SF V2DF (V2HF "TARGET_SIMD_F16INST")])
> >
> > ;; Advanced SIMD Float modes, and DF.
> > (define_mode_iterator VDQF_DF [V2SF V4SF V2DF DF])
> > @@ -188,15 +192,23 @@ (define_mode_iterator VDQF_COND [V2SF V2SI V4SF V4SI V2DF V2DI])
> > (define_mode_iterator VALLF [V2SF V4SF V2DF SF DF])
> >
> > ;; Advanced SIMD Float modes with 2 elements.
> > -(define_mode_iterator V2F [V2SF V2DF])
> > +(define_mode_iterator V2F [V2SF V2DF V2HF])
> >
> > ;; All Advanced SIMD modes on which we support any arithmetic operations.
> > (define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF
> > V4SF V2DF])
> >
> > -;; All Advanced SIMD modes suitable for moving, loading, and storing.
> > +;; All Advanced SIMD modes suitable for moving, loading, and storing
> > +;; except V2HF.
> > (define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
> > V4HF V8HF V4BF V8BF V2SF V4SF V2DF])
> >
> > +;; All Advanced SIMD modes suitable for moving, loading, and storing
> > +;; including V2HF
> > +(define_mode_iterator VALL_F16_FULL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
> > + V4HF V8HF V4BF V8BF V2SF V4SF V2DF
> > + (V2HF "TARGET_SIMD_F16INST")])
>
> This name might cause confusion with the SVE iterators, where FULL means
> "every bit of the register is used". How about something like VMOVE
> instead?
>
> With this change, I guess VALL_F16 represents "The set of all modes for
> which the vld1 intrinsics are provided" and VMOVE or whatever is "All
> Advanced SIMD modes suitable for moving, loading, and storing".
> That is, VMOVE extends VALL_F16 with modes that are not manifested via
> intrinsics.
>
> > +
> > +
> > ;; The VALL_F16 modes except the 128-bit 2-element ones.
> > (define_mode_iterator VALL_F16_NO_V2Q [V8QI V16QI V4HI V8HI V2SI V4SI
> > V4HF V8HF V2SF V4SF])
> > @@ -1076,7 +1088,7 @@ (define_mode_attr nunits [(V8QI "8") (V16QI "16")
> > (V2SF "2") (V4SF "4")
> > (V1DF "1") (V2DF "2")
> > (DI "1") (DF "1")
> > - (V8DI "8")])
> > + (V8DI "8") (V2HF "2")])
> >
> > ;; Map a mode to the number of bits in it, if the size of the mode
> > ;; is constant.
> > @@ -1090,6 +1102,7 @@ (define_mode_attr s [(HF "h") (SF "s") (DF "d") (SI "s") (DI "d")])
> >
> > ;; Give the length suffix letter for a sign- or zero-extension.
> > (define_mode_attr size [(QI "b") (HI "h") (SI "w")])
> > +(define_mode_attr sizel [(QI "b") (HI "h") (SI "")])
> >
> > ;; Give the number of bits in the mode
> > (define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")])
> > @@ -1134,8 +1147,9 @@ (define_mode_attr Vtype [(V8QI "8b") (V16QI "16b")
> > (V2SI "2s") (V4SI "4s")
> > (DI "1d") (DF "1d")
> > (V2DI "2d") (V2SF "2s")
> > - (V4SF "4s") (V2DF "2d")
> > - (V4HF "4h") (V8HF "8h")
> > + (V2HF "2h") (V4SF "4s")
> > + (V2DF "2d") (V4HF "4h")
> > + (V8HF "8h")
> > (V2x8QI "8b") (V2x4HI "4h")
> > (V2x2SI "2s") (V2x1DI "1d")
> > (V2x4HF "4h") (V2x2SF "2s")
>
> Where is the 2h used, and is it valid syntax in that context?
>

The single instance in the ISA where 2h is valid syntax is faddp. I'll
double-check the usage contexts, but it should be the only place. I'll
get back to you when I respin the patch.

Thanks,
Tamar

> Same for later instances of 2h.
>
> Thanks,
> Richard
>
> > @@ -1175,9 +1189,10 @@ (define_mode_attr Vmtype [(V8QI ".8b") (V16QI ".16b")
> > (V4HI ".4h") (V8HI ".8h")
> > (V2SI ".2s") (V4SI ".4s")
> > (V2DI ".2d") (V4HF ".4h")
> > - (V8HF ".8h") (V4BF ".4h")
> > - (V8BF ".8h") (V2SF ".2s")
> > - (V4SF ".4s") (V2DF ".2d")
> > + (V8HF ".8h") (V2HF ".2h")
> > + (V4BF ".4h") (V8BF ".8h")
> > + (V2SF ".2s") (V4SF ".4s")
> > + (V2DF ".2d")
> > (DI "") (SI "")
> > (HI "") (QI "")
> > (TI "") (HF "")
> > @@ -1193,7 +1208,7 @@ (define_mode_attr Vmntype [(V8HI ".8b") (V4SI ".4h")
> > (define_mode_attr Vetype [(V8QI "b") (V16QI "b")
> > (V4HI "h") (V8HI "h")
> > (V2SI "s") (V4SI "s")
> > - (V2DI "d")
> > + (V2DI "d") (V2HF "h")
> > (V4HF "h") (V8HF "h")
> > (V2SF "s") (V4SF "s")
> > (V2DF "d")
> > @@ -1285,7 +1300,7 @@ (define_mode_attr Vcwtype [(VNx16QI "b") (VNx8QI "h") (VNx4QI "w") (VNx2QI "d")
> > ;; more accurately.
> > (define_mode_attr stype [(V8QI "b") (V16QI "b") (V4HI "s") (V8HI "s")
> > (V2SI "s") (V4SI "s") (V2DI "d") (V4HF "s")
> > - (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d")
> > + (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d") (V2HF "s")
> > (HF "s") (SF "s") (DF "d") (QI "b") (HI "s")
> > (SI "s") (DI "d")])
> >
> > @@ -1360,8 +1375,8 @@ (define_mode_attr VEL [(V8QI "QI") (V16QI "QI")
> > (V4HF "HF") (V8HF "HF")
> > (V2SF "SF") (V4SF "SF")
> > (DF "DF") (V2DF "DF")
> > - (SI "SI") (HI "HI")
> > - (QI "QI")
> > + (SI "SI") (V2HF "HF")
> > + (QI "QI") (HI "HI")
> > (V4BF "BF") (V8BF "BF")
> > (VNx16QI "QI") (VNx8QI "QI") (VNx4QI "QI") (VNx2QI "QI")
> > (VNx8HI "HI") (VNx4HI "HI") (VNx2HI "HI")
> > @@ -1381,7 +1396,7 @@ (define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
> > (V2SF "sf") (V4SF "sf")
> > (V2DF "df") (DF "df")
> > (SI "si") (HI "hi")
> > - (QI "qi")
> > + (QI "qi") (V2HF "hf")
> > (V4BF "bf") (V8BF "bf")
> > (VNx16QI "qi") (VNx8QI "qi") (VNx4QI "qi") (VNx2QI "qi")
> > (VNx8HI "hi") (VNx4HI "hi") (VNx2HI "hi")
> > @@ -1866,7 +1881,7 @@ (define_mode_attr q [(V8QI "") (V16QI "_q")
> > (V4HF "") (V8HF "_q")
> > (V4BF "") (V8BF "_q")
> > (V2SF "") (V4SF "_q")
> > - (V2DF "_q")
> > + (V2HF "") (V2DF "_q")
> > (QI "") (HI "") (SI "") (DI "") (HF "") (SF "") (DF "")
> > (V2x8QI "") (V2x16QI "_q")
> > (V2x4HI "") (V2x8HI "_q")
> > @@ -1905,6 +1920,7 @@ (define_mode_attr vp [(V8QI "v") (V16QI "v")
> > (V2SI "p") (V4SI "v")
> > (V2DI "p") (V2DF "p")
> > (V2SF "p") (V4SF "v")
> > + (V2HF "p")
> > (V4HF "v") (V8HF "v")])
> >
> > (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")
> > diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
> > index 7d0504bdd944e9c0d1b545b0b66a9a1adc808714..3cfbc7a93cca1bea4925853e51d0a147c5722247 100644
> > --- a/gcc/config/arm/types.md
> > +++ b/gcc/config/arm/types.md
> > @@ -483,6 +483,7 @@ (define_attr "autodetect_type"
> > ; neon_fp_minmax_s_q
> > ; neon_fp_minmax_d
> > ; neon_fp_minmax_d_q
> > +; neon_fp_reduc_add_h
> > ; neon_fp_reduc_add_s
> > ; neon_fp_reduc_add_s_q
> > ; neon_fp_reduc_add_d
> > @@ -1033,6 +1034,7 @@ (define_attr "type"
> > neon_fp_minmax_d,\
> > neon_fp_minmax_d_q,\
> > \
> > + neon_fp_reduc_add_h,\
> > neon_fp_reduc_add_s,\
> > neon_fp_reduc_add_s_q,\
> > neon_fp_reduc_add_d,\
> > @@ -1257,8 +1259,8 @@ (define_attr "is_neon_type" "yes,no"
> > neon_fp_compare_d, neon_fp_compare_d_q, neon_fp_minmax_s,\
> > neon_fp_minmax_s_q, neon_fp_minmax_d, neon_fp_minmax_d_q,\
> > neon_fp_neg_s, neon_fp_neg_s_q, neon_fp_neg_d, neon_fp_neg_d_q,\
> > - neon_fp_reduc_add_s, neon_fp_reduc_add_s_q, neon_fp_reduc_add_d,\
> > - neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,
> > + neon_fp_reduc_add_h, neon_fp_reduc_add_s, neon_fp_reduc_add_s_q,\
> > + neon_fp_reduc_add_d, neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,\
> > neon_fp_reduc_minmax_s_q, neon_fp_reduc_minmax_d,\
> > neon_fp_reduc_minmax_d_q,\
> > neon_fp_cvt_narrow_s_q, neon_fp_cvt_narrow_d_q,\
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
> > index 07d71a63414b1066ea431e287286ad048515711a..8e35e0b574d49913b43c7d8d4f4ba75f127f42e9 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
> > @@ -30,11 +30,9 @@ vec_slp_##TYPE (TYPE *restrict a, TYPE b, TYPE c, int n) \
> > TEST_ALL (VEC_PERM)
> >
> > /* We should use one DUP for each of the 8-, 16- and 32-bit types,
> > - although we currently use LD1RW for _Float16. We should use two
> > - DUPs for each of the three 64-bit types. */
> > + We should use two DUPs for each of the three 64-bit types. */
> > /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, [hw]} 2 } } */
> > -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 2 } } */
> > -/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 1 } } */
> > +/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 3 } } */
> > /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, [dx]} 9 } } */
> > /* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
> > /* { dg-final { scan-assembler-not {\tzip2\t} } } */
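
For reference, a rough sketch of what the renaming suggested in the review could look like in iterators.md. It takes the VALL_F16_FULL definition from the patch and applies the proposed name VMOVE. The comment wording here is an assumption based on the review text, not something taken from a committed patch, and the final name may well differ in the respin.

```
;; All Advanced SIMD modes suitable for moving, loading, and storing,
;; including V2HF, which is not exposed through intrinsics.
(define_mode_iterator VMOVE [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
			     V4HF V8HF V4BF V8BF V2SF V4SF V2DF
			     (V2HF "TARGET_SIMD_F16INST")])
```

Under this naming, the patterns the patch switches to VALL_F16_FULL (mov, movmisalign, vec_set, vec_init, vec_extract and so on) would use VMOVE instead, while VALL_F16 would keep covering only the modes for which intrinsics are provided.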