From: Tamar Christina
To: Tamar Christina, Richard Sandiford
CC: "gcc-patches@gcc.gnu.org", nd, Richard Earnshaw, Marcus Shawcroft, Kyrylo Tkachov
Subject: RE: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable.
Date: Tue, 22 Nov 2022 16:01:18 +0000
Ping

> -----Original Message-----
> From: Gcc-patches bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Tamar
> Christina via Gcc-patches
> Sent: Friday, November 11, 2022 2:40 PM
> To: Richard Sandiford
> Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft;
> Kyrylo Tkachov
> Subject: RE: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable.
>
> Hi,
>
>
> > This name might cause confusion with the SVE iterators, where FULL
> > means "every bit of the register is used". How about something like
> > VMOVE instead?
> >
> > With this change, I guess VALL_F16 represents "The set of all modes
> > for which the vld1 intrinsics are provided" and VMOVE or whatever is
> > "All Advanced SIMD modes suitable for moving, loading, and storing".
> > That is, VMOVE extends VALL_F16 with modes that are not manifested via
> > intrinsics.
> >
>
> Done.
>
> > Where is the 2h used, and is it valid syntax in that context?
> >
> > Same for later instances of 2h.
>
> They are, but they weren't meant to be in this patch. They belong in a
> separate FP16 series that I won't get to finish for GCC 13 due to not being
> able to finish writing all the tests. I have moved them to that patch series
> though.
>
> While the addp patch series has been killed, this patch is still good
> standalone and improves codegen as shown in the updated testcase.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): New.
> 	(mov, movmisalign, aarch64_dup_lane,
> 	aarch64_store_lane0, aarch64_simd_vec_set,
> 	@aarch64_simd_vec_copy_lane, vec_set,
> 	reduc__scal_, reduc__scal_,
> 	aarch64_reduc__internal,
> 	aarch64_get_lane,
> 	vec_init, vec_extract): Support V2HF.
> 	(aarch64_simd_dupv2hf): New.
> 	* config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
> 	Add E_V2HFmode.
> 	* config/aarch64/iterators.md (VHSDF_P): New.
> 	(V2F, VMOVE, nunits, Vtype, Vmtype, Vetype, stype, VEL,
> 	Vel, q, vp): Add V2HF.
> 	* config/arm/types.md (neon_fp_reduc_add_h): New.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/sve/slp_1.c: Update testcase.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index f4152160084d6b6f34bd69f0ba6386c1ab50f77e..487a31010245accec28e779661e6c2d578fca4b7 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -19,10 +19,10 @@
>  ;; .
>
>  (define_expand "mov"
> -  [(set (match_operand:VALL_F16 0 "nonimmediate_operand")
> -        (match_operand:VALL_F16 1 "general_operand"))]
> +  [(set (match_operand:VMOVE 0 "nonimmediate_operand")
> +        (match_operand:VMOVE 1 "general_operand"))]
>    "TARGET_SIMD"
> -  "
> +{
>    /* Force the operand into a register if it is not an
>       immediate whose use can be replaced with xzr.
>       If the mode is 16 bytes wide, then we will be doing
> @@ -46,12 +46,11 @@ (define_expand "mov"
>        aarch64_expand_vector_init (operands[0], operands[1]);
>        DONE;
>      }
> -  "
> -)
> +})
>
>  (define_expand "movmisalign"
> -  [(set (match_operand:VALL_F16 0 "nonimmediate_operand")
> -        (match_operand:VALL_F16 1 "general_operand"))]
> +  [(set (match_operand:VMOVE 0 "nonimmediate_operand")
> +        (match_operand:VMOVE 1 "general_operand"))]
>    "TARGET_SIMD && !STRICT_ALIGNMENT"
>  {
>    /* This pattern is not permitted to fail during expansion: if both arguments
> @@ -73,6 +72,16 @@ (define_insn "aarch64_simd_dup"
>    [(set_attr "type" "neon_dup, neon_from_gp")]
>  )
>
> +(define_insn "aarch64_simd_dupv2hf"
> +  [(set (match_operand:V2HF 0 "register_operand" "=w")
> +        (vec_duplicate:V2HF
> +          (match_operand:HF 1 "register_operand" "0")))]
> +  "TARGET_SIMD"
> +  "@
> +   sli\\t%d0, %d1, 16"
> +  [(set_attr "type" "neon_shift_imm")]
> +)
> +
>  (define_insn "aarch64_simd_dup"
>    [(set (match_operand:VDQF_F16 0 "register_operand" "=w,w")
>          (vec_duplicate:VDQF_F16
> @@ -85,10 +94,10 @@ (define_insn "aarch64_simd_dup"
>  )
>
>  (define_insn "aarch64_dup_lane"
> -  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> -        (vec_duplicate:VALL_F16
> +  [(set (match_operand:VMOVE 0 "register_operand" "=w")
> +        (vec_duplicate:VMOVE
>            (vec_select:
> -            (match_operand:VALL_F16 1 "register_operand" "w")
> +            (match_operand:VMOVE 1 "register_operand" "w")
>              (parallel [(match_operand:SI 2 "immediate_operand" "i")])
>            )))]
>    "TARGET_SIMD"
> @@ -142,6 +151,29 @@ (define_insn "*aarch64_simd_mov"
>                       mov_reg, neon_move")]
>  )
>
> +(define_insn "*aarch64_simd_movv2hf"
> +  [(set (match_operand:V2HF 0 "nonimmediate_operand"
> +                "=w, m, m, w, ?r, ?w, ?r, w, w")
> +        (match_operand:V2HF 1 "general_operand"
> +                "m, Dz, w, w, w, r, r, Dz, Dn"))]
> +  "TARGET_SIMD_F16INST
> +   && (register_operand (operands[0], V2HFmode)
> +       || aarch64_simd_reg_or_zero (operands[1], V2HFmode))"
> +  "@
> +   ldr\\t%s0, %1
> +   str\\twzr, %0
> +   str\\t%s1, %0
> +   mov\\t%0.2s[0], %1.2s[0]
> +   umov\\t%w0, %1.s[0]
> +   fmov\\t%s0, %1
> +   mov\\t%0, %1
> +   movi\\t%d0, 0
> +   * return aarch64_output_simd_mov_immediate (operands[1], 32);"
> +  [(set_attr "type" "neon_load1_1reg, store_8, neon_store1_1reg,\
> +                     neon_logic, neon_to_gp, f_mcr,\
> +                     mov_reg, neon_move, neon_move")]
> +)
> +
>  (define_insn "*aarch64_simd_mov"
>    [(set (match_operand:VQMOV 0 "nonimmediate_operand"
>                "=w, Umn, m, w, ?r, ?w, ?r, w")
> @@ -182,7 +214,7 @@ (define_insn "*aarch64_simd_mov"
>
>  (define_insn "aarch64_store_lane0"
>    [(set (match_operand: 0 "memory_operand" "=m")
> -        (vec_select: (match_operand:VALL_F16 1 "register_operand" "w")
> +        (vec_select: (match_operand:VMOVE 1 "register_operand" "w")
>                      (parallel [(match_operand 2 "const_int_operand" "n")])))]
>    "TARGET_SIMD
>     && ENDIAN_LANE_N (, INTVAL (operands[2])) == 0"
> @@ -1035,11 +1067,11 @@ (define_insn "one_cmpl2"
>  )
>
>  (define_insn "aarch64_simd_vec_set"
> -  [(set (match_operand:VALL_F16 0 "register_operand" "=w,w,w")
> -        (vec_merge:VALL_F16
> -            (vec_duplicate:VALL_F16
> +  [(set (match_operand:VMOVE 0 "register_operand" "=w,w,w")
> +        (vec_merge:VMOVE
> +            (vec_duplicate:VMOVE
>                  (match_operand: 1 "aarch64_simd_nonimmediate_operand" "w,?r,Utv"))
> -            (match_operand:VALL_F16 3 "register_operand" "0,0,0")
> +            (match_operand:VMOVE 3 "register_operand" "0,0,0")
>              (match_operand:SI 2 "immediate_operand" "i,i,i")))]
>    "TARGET_SIMD"
>    {
> @@ -1061,14 +1093,14 @@ (define_insn "aarch64_simd_vec_set"
>  )
>
>  (define_insn "@aarch64_simd_vec_copy_lane"
> -  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> -        (vec_merge:VALL_F16
> -            (vec_duplicate:VALL_F16
> +  [(set (match_operand:VMOVE 0 "register_operand" "=w")
> +        (vec_merge:VMOVE
> +            (vec_duplicate:VMOVE
>                (vec_select:
> -                (match_operand:VALL_F16 3 "register_operand" "w")
> +                (match_operand:VMOVE 3 "register_operand" "w")
>                  (parallel
>                    [(match_operand:SI 4 "immediate_operand" "i")])))
> -            (match_operand:VALL_F16 1 "register_operand" "0")
> +            (match_operand:VMOVE 1 "register_operand" "0")
>              (match_operand:SI 2 "immediate_operand" "i")))]
>    "TARGET_SIMD"
>    {
> @@ -1376,7 +1408,7 @@ (define_insn "vec_shr_"
>  )
>
>  (define_expand "vec_set"
> -  [(match_operand:VALL_F16 0 "register_operand")
> +  [(match_operand:VMOVE 0 "register_operand")
>     (match_operand: 1 "aarch64_simd_nonimmediate_operand")
>     (match_operand:SI 2 "immediate_operand")]
>    "TARGET_SIMD"
> @@ -3495,7 +3527,7 @@ (define_insn "popcount2"
>  ;; gimple_fold'd to the IFN_REDUC_(MAX|MIN) function. (This is FP smax/smin).
>  (define_expand "reduc__scal_"
>    [(match_operand: 0 "register_operand")
> -   (unspec: [(match_operand:VHSDF 1 "register_operand")]
> +   (unspec: [(match_operand:VHSDF_P 1 "register_operand")]
>                   FMAXMINV)]
>    "TARGET_SIMD"
>    {
> @@ -3510,7 +3542,7 @@ (define_expand "reduc__scal_"
>
>  (define_expand "reduc__scal_"
>    [(match_operand: 0 "register_operand")
> -   (unspec: [(match_operand:VHSDF 1 "register_operand")]
> +   (unspec: [(match_operand:VHSDF_P 1 "register_operand")]
>                   FMAXMINNMV)]
>    "TARGET_SIMD"
>    {
> @@ -3554,8 +3586,8 @@ (define_insn "aarch64_reduc__internalv2si"
>  )
>
>  (define_insn "aarch64_reduc__internal"
> - [(set (match_operand:VHSDF 0 "register_operand" "=w")
> -       (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")]
> + [(set (match_operand:VHSDF_P 0 "register_operand" "=w")
> +       (unspec:VHSDF_P [(match_operand:VHSDF_P 1 "register_operand" "w")]
>                     FMAXMINV))]
>   "TARGET_SIMD"
>   "\\t%0, %1."
> @@ -4200,7 +4232,7 @@ (define_insn "*aarch64_get_lane_zero_extend"
>  (define_insn_and_split "aarch64_get_lane"
>    [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
>          (vec_select:
> -          (match_operand:VALL_F16 1 "register_operand" "w, w, w")
> +          (match_operand:VMOVE 1 "register_operand" "w, w, w")
>            (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
>    "TARGET_SIMD"
>  {
> @@ -7981,7 +8013,7 @@ (define_expand "aarch64_st1"
>  ;; Standard pattern name vec_init.
>
>  (define_expand "vec_init"
> -  [(match_operand:VALL_F16 0 "register_operand")
> +  [(match_operand:VMOVE 0 "register_operand")
>     (match_operand 1 "" "")]
>    "TARGET_SIMD"
>  {
> @@ -8060,7 +8092,7 @@ (define_insn "aarch64_urecpe"
>
>  (define_expand "vec_extract"
>    [(match_operand: 0 "aarch64_simd_nonimmediate_operand")
> -   (match_operand:VALL_F16 1 "register_operand")
> +   (match_operand:VMOVE 1 "register_operand")
>     (match_operand:SI 2 "immediate_operand")]
>    "TARGET_SIMD"
>  {
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 84dbe2f4ea7d03b424602ed98a34e7824217dc91..35671cb86e374f9ded21d0e4944c63bc2cbc0901 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -3566,6 +3566,7 @@ aarch64_classify_vector_mode (machine_mode mode)
>      case E_V8BFmode:
>      case E_V4SFmode:
>      case E_V2DFmode:
> +    case E_V2HFmode:
>        return TARGET_SIMD ? VEC_ADVSIMD : 0;
>
>      default:
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 37d8161a33b1c399d80be82afa67613a087389d4..dfcf86a440e316c2abdbcc646363d39e458d1a91 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -160,6 +160,10 @@ (define_mode_iterator VDQF [V2SF V4SF V2DF])
>  (define_mode_iterator VHSDF [(V4HF "TARGET_SIMD_F16INST")
>                               (V8HF "TARGET_SIMD_F16INST")
>                               V2SF V4SF V2DF])
> +;; Advanced SIMD Float modes suitable for pairwise operations.
> +(define_mode_iterator VHSDF_P [(V4HF "TARGET_SIMD_F16INST")
> +                               (V8HF "TARGET_SIMD_F16INST")
> +                               V2SF V4SF V2DF (V2HF "TARGET_SIMD_F16INST")])
>
>  ;; Advanced SIMD Float modes, and DF.
>  (define_mode_iterator VDQF_DF [V2SF V4SF V2DF DF])
> @@ -188,15 +192,23 @@ (define_mode_iterator VDQF_COND [V2SF V2SI V4SF V4SI V2DF V2DI])
>  (define_mode_iterator VALLF [V2SF V4SF V2DF SF DF])
>
>  ;; Advanced SIMD Float modes with 2 elements.
> -(define_mode_iterator V2F [V2SF V2DF])
> +(define_mode_iterator V2F [V2SF V2DF V2HF])
>
>  ;; All Advanced SIMD modes on which we support any arithmetic operations.
>  (define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF])
>
> -;; All Advanced SIMD modes suitable for moving, loading, and storing.
> +;; All Advanced SIMD modes suitable for moving, loading, and storing
> +;; except V2HF.
>  (define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
>                                  V4HF V8HF V4BF V8BF V2SF V4SF V2DF])
>
> +;; All Advanced SIMD modes suitable for moving, loading, and storing
> +;; including V2HF
> +(define_mode_iterator VMOVE [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
> +                             V4HF V8HF V4BF V8BF V2SF V4SF V2DF
> +                             (V2HF "TARGET_SIMD_F16INST")])
> +
> +
>  ;; The VALL_F16 modes except the 128-bit 2-element ones.
>  (define_mode_iterator VALL_F16_NO_V2Q [V8QI V16QI V4HI V8HI V2SI V4SI
>                                  V4HF V8HF V2SF V4SF])
> @@ -1076,7 +1088,7 @@ (define_mode_attr nunits [(V8QI "8") (V16QI "16")
>                            (V2SF "2") (V4SF "4")
>                            (V1DF "1") (V2DF "2")
>                            (DI "1") (DF "1")
> -                          (V8DI "8")])
> +                          (V8DI "8") (V2HF "2")])
>
>  ;; Map a mode to the number of bits in it, if the size of the mode
>  ;; is constant.
> @@ -1090,6 +1102,7 @@ (define_mode_attr s [(HF "h") (SF "s") (DF "d") (SI "s") (DI "d")])
>
>  ;; Give the length suffix letter for a sign- or zero-extension.
>  (define_mode_attr size [(QI "b") (HI "h") (SI "w")])
> +(define_mode_attr sizel [(QI "b") (HI "h") (SI "")])
>
>  ;; Give the number of bits in the mode
>  (define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")])
> @@ -1193,7 +1206,7 @@ (define_mode_attr Vmntype [(V8HI ".8b") (V4SI ".4h")
>  (define_mode_attr Vetype [(V8QI "b") (V16QI "b")
>                            (V4HI "h") (V8HI "h")
>                            (V2SI "s") (V4SI "s")
> -                          (V2DI "d")
> +                          (V2DI "d") (V2HF "h")
>                            (V4HF "h") (V8HF "h")
>                            (V2SF "s") (V4SF "s")
>                            (V2DF "d")
> @@ -1285,7 +1298,7 @@ (define_mode_attr Vcwtype [(VNx16QI "b") (VNx8QI "h") (VNx4QI "w") (VNx2QI "d")
>  ;; more accurately.
>  (define_mode_attr stype [(V8QI "b") (V16QI "b") (V4HI "s") (V8HI "s")
>                           (V2SI "s") (V4SI "s") (V2DI "d") (V4HF "s")
> -                         (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d")
> +                         (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d") (V2HF "s")
>                           (HF "s") (SF "s") (DF "d") (QI "b") (HI "s")
>                           (SI "s") (DI "d")])
>
> @@ -1360,8 +1373,8 @@ (define_mode_attr VEL [(V8QI "QI") (V16QI "QI")
>                         (V4HF "HF") (V8HF "HF")
>                         (V2SF "SF") (V4SF "SF")
>                         (DF "DF") (V2DF "DF")
> -                       (SI "SI") (HI "HI")
> -                       (QI "QI")
> +                       (SI "SI") (V2HF "HF")
> +                       (QI "QI") (HI "HI")
>                         (V4BF "BF") (V8BF "BF")
>                         (VNx16QI "QI") (VNx8QI "QI") (VNx4QI "QI") (VNx2QI "QI")
>                         (VNx8HI "HI") (VNx4HI "HI") (VNx2HI "HI")
> @@ -1381,7 +1394,7 @@ (define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
>                         (V2SF "sf") (V4SF "sf")
>                         (V2DF "df") (DF "df")
>                         (SI "si") (HI "hi")
> -                       (QI "qi")
> +                       (QI "qi") (V2HF "hf")
>                         (V4BF "bf") (V8BF "bf")
>                         (VNx16QI "qi") (VNx8QI "qi") (VNx4QI "qi") (VNx2QI "qi")
>                         (VNx8HI "hi") (VNx4HI "hi") (VNx2HI "hi")
> @@ -1866,7 +1879,7 @@ (define_mode_attr q [(V8QI "") (V16QI "_q")
>                       (V4HF "") (V8HF "_q")
>                       (V4BF "") (V8BF "_q")
>                       (V2SF "") (V4SF "_q")
> -                     (V2DF "_q")
> +                     (V2HF "") (V2DF "_q")
>                       (QI "") (HI "") (SI "") (DI "") (HF "") (SF "") (DF "")
>                       (V2x8QI "") (V2x16QI "_q")
>                       (V2x4HI "") (V2x8HI "_q")
> @@ -1905,6 +1918,7 @@ (define_mode_attr vp [(V8QI "v") (V16QI "v")
>                        (V2SI "p") (V4SI "v")
>                        (V2DI "p") (V2DF "p")
>                        (V2SF "p") (V4SF "v")
> +                      (V2HF "p")
>                        (V4HF "v") (V8HF "v")])
>
>  (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")
> diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
> index 7d0504bdd944e9c0d1b545b0b66a9a1adc808714..3cfbc7a93cca1bea4925853e51d0a147c5722247 100644
> --- a/gcc/config/arm/types.md
> +++ b/gcc/config/arm/types.md
> @@ -483,6 +483,7 @@ (define_attr "autodetect_type"
>  ; neon_fp_minmax_s_q
>  ; neon_fp_minmax_d
>  ; neon_fp_minmax_d_q
> +; neon_fp_reduc_add_h
>  ; neon_fp_reduc_add_s
>  ; neon_fp_reduc_add_s_q
>  ; neon_fp_reduc_add_d
> @@ -1033,6 +1034,7 @@ (define_attr "type"
>    neon_fp_minmax_d,\
>    neon_fp_minmax_d_q,\
>  \
> +  neon_fp_reduc_add_h,\
>    neon_fp_reduc_add_s,\
>    neon_fp_reduc_add_s_q,\
>    neon_fp_reduc_add_d,\
> @@ -1257,8 +1259,8 @@ (define_attr "is_neon_type" "yes,no"
>            neon_fp_compare_d, neon_fp_compare_d_q, neon_fp_minmax_s,\
>            neon_fp_minmax_s_q, neon_fp_minmax_d, neon_fp_minmax_d_q,\
>            neon_fp_neg_s, neon_fp_neg_s_q, neon_fp_neg_d, neon_fp_neg_d_q,\
> -          neon_fp_reduc_add_s, neon_fp_reduc_add_s_q, neon_fp_reduc_add_d,\
> -          neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,
> +          neon_fp_reduc_add_h, neon_fp_reduc_add_s, neon_fp_reduc_add_s_q,\
> +          neon_fp_reduc_add_d, neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,\
>            neon_fp_reduc_minmax_s_q, neon_fp_reduc_minmax_d,\
>            neon_fp_reduc_minmax_d_q,\
>            neon_fp_cvt_narrow_s_q, neon_fp_cvt_narrow_d_q,\
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
> index 07d71a63414b1066ea431e287286ad048515711a..e6021c5a42748701e5326a5c387a39a0bbadc9e5 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c
> @@ -30,11 +30,9 @@ vec_slp_##TYPE (TYPE *restrict a, TYPE b, TYPE c, int n) \
>  TEST_ALL (VEC_PERM)
>
>  /* We should use one DUP for each of the 8-, 16- and 32-bit types,
> -   although we currently use LD1RW for _Float16. We should use two
> -   DUPs for each of the three 64-bit types. */
> +   We should use two DUPs for each of the three 64-bit types. */
>  /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, [hw]} 2 } } */
> -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 2 } } */
> -/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 1 } } */
> +/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 3 } } */
>  /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, [dx]} 9 } } */
>  /* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
>  /* { dg-final { scan-assembler-not {\tzip2\t} } } */
> @@ -53,7 +51,7 @@ TEST_ALL (VEC_PERM)
>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s} 6 } } */
>  /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d} 6 } } */
>  /* { dg-final { scan-assembler-not {\tldr} } } */
> -/* { dg-final { scan-assembler-times {\tstr} 2 } } */
> -/* { dg-final { scan-assembler-times {\tstr\th[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-not {\tstr} } } */
> +/* { dg-final { scan-assembler-not {\tstr\th[0-9]+} } } */
>
>  /* { dg-final { scan-assembler-not {\tuqdec} } } */
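
For anyone reading the new aarch64_simd_dupv2hf pattern in the patch: it emits "sli %d0, %d1, 16" with operand 1 tied to operand 0 (constraint "0"), so the same register is both source and destination. The following is only a bit-level sketch in plain C of what that does to the low 32 bits, assuming the usual SLI (shift left and insert) semantics. The names sli_d and dup_v2hf_via_sli are hypothetical helpers for illustration and are not part of GCC or of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the scalar form "SLI Dd, Dn, #shift": Dn is shifted left by
   `shift` and inserted into Dd, leaving the low `shift` bits of Dd as
   they were.  Illustrative only; not GCC code.  */
static uint64_t
sli_d (uint64_t dd, uint64_t dn, unsigned shift)
{
  assert (shift < 64);
  uint64_t keep = (UINT64_C (1) << shift) - 1; /* low bits of Dd that survive */
  return (dd & keep) | (dn << shift);
}

/* Hypothetical helper: take the 16-bit pattern of an HF value held in
   lane 0 and copy it into lane 1, as "sli %d0, %d1, 16" does when the
   input is tied to the output.  Returns the low 32 bits, which is the
   part of the register a V2HF value occupies.  */
static uint32_t
dup_v2hf_via_sli (uint16_t half_bits)
{
  uint64_t reg = half_bits;   /* HF value in bits [15:0] */
  reg = sli_d (reg, reg, 16); /* same register is source and destination */
  return (uint32_t) reg;
}
```

With this model, dup_v2hf_via_sli (0x3c00) yields 0x3c003c00, i.e. the half-precision bit pattern for 1.0 in both 16-bit lanes. Bits above the low 32 are ignored here since V2HF only occupies 32 bits.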