From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-db3eur04on2049.outbound.protection.outlook.com [40.107.6.49]) by sourceware.org (Postfix) with ESMTPS id 2D2A83858C5E for ; Tue, 10 Oct 2023 14:05:11 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2D2A83858C5E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=CfWHBXzaDlcOcvil9W62+iLGVaAPihnvtdjJo1Xs5mI=; b=scaWlqbVBgF1KTnwrgxTjnWY9B7kVkzFYV3c41SLn1X2RHPYY0msJMnkR9jOKbSewtntXhW83yo+CoQ+zLiN4Qe4vOIrHXFlXa+eapIY6Yo8zGkJQMroJODTRfv9ClbMlRIzZdrINQA0mHHU5glh49AIYmxoSiLMuw/sNvwtdwE= Received: from DU2PR04CA0352.eurprd04.prod.outlook.com (2603:10a6:10:2b4::23) by AM8PR08MB6372.eurprd08.prod.outlook.com (2603:10a6:20b:369::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6863.37; Tue, 10 Oct 2023 14:05:08 +0000 Received: from DBAEUR03FT056.eop-EUR03.prod.protection.outlook.com (2603:10a6:10:2b4:cafe::2) by DU2PR04CA0352.outlook.office365.com (2603:10a6:10:2b4::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6863.38 via Frontend Transport; Tue, 10 Oct 2023 14:05:08 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT056.mail.protection.outlook.com (100.127.142.88) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6886.23 via Frontend Transport; Tue, 10 Oct 2023 14:05:08 +0000 Received: ("Tessian outbound 6d14f3380669:v211"); Tue, 10 Oct 2023 14:05:08 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: d32462cc16a87ba9 X-CR-MTA-TID: 64aa7808 Received: from ae9a8ad54743.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 9E5FB32A-6536-4947-AE86-C26EB3A5DDF9.1; Tue, 10 Oct 2023 14:05:01 +0000 Received: from EUR01-VE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id ae9a8ad54743.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Tue, 10 Oct 2023 14:05:01 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ZH6yj7V00/Qw6jPTYULUdeJ5eyozyyNv6RaS5T1bb+MHFPi0a8liN1B80O3S/tEXyFU1e0OHzU14/LKNCD6XSmr+ijIPEOuMuEVxrStLfOKCJ7mpOYONZdoQuUxwUbg8r7tdB6vYSKBb6Tf1lNItZf6+N+hdB2J+BJnLXEvfK/KDKv7tgbuRc6Mtu+SW2tzWLpsIVMGRe/m85OUyTJPt8bQxy8KdNbErxbTIdSX9MKlXCKFAOc70tjT6WBY2zZnhbe478c2W5v1BunxsHDmgmKk5u5vzUxoovx91hH50g3cMb43gIsLkGPoiSdCbUm6hLBeZDcuklZugRnqY15hM/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=CfWHBXzaDlcOcvil9W62+iLGVaAPihnvtdjJo1Xs5mI=; b=gMKP0FqdQA+MnwbqsylaMDfbFvVoZrHs9hfsW1vzfn8wC3uQS81LXW66dGwSzMDFicg5fSI643A8JElxg0jqUkRX8I2NtIUuMiiNlacl9hAmNl2THnLnnBEbY+Kr4l1POGdu4a2Ck9CsJi9jU8c3wPIxiiP1bczRZOsbUIuXswUpH9WAMo2PBLbZSAVNqRU24D+UTrRPwf5rykz3qTr7H9QSAMpe26DGNJC7greICyXvxu7FOGsQsCGt3wmYnzV+Xp5XvXa5pOzL+R9gc0uCrV9xlYE7a7Xr4M0tCzOpE3JJ1XLXZfX39WniJThinDmFBhz7gxXCW6X0yEYW3R5g+A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=CfWHBXzaDlcOcvil9W62+iLGVaAPihnvtdjJo1Xs5mI=; b=scaWlqbVBgF1KTnwrgxTjnWY9B7kVkzFYV3c41SLn1X2RHPYY0msJMnkR9jOKbSewtntXhW83yo+CoQ+zLiN4Qe4vOIrHXFlXa+eapIY6Yo8zGkJQMroJODTRfv9ClbMlRIzZdrINQA0mHHU5glh49AIYmxoSiLMuw/sNvwtdwE= Received: from DB8PR06CA0045.eurprd06.prod.outlook.com (2603:10a6:10:120::19) by GV2PR08MB9399.eurprd08.prod.outlook.com (2603:10a6:150:df::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6863.38; Tue, 10 Oct 2023 14:04:58 +0000 Received: from DBAEUR03FT063.eop-EUR03.prod.protection.outlook.com (2603:10a6:10:120:cafe::32) by DB8PR06CA0045.outlook.office365.com (2603:10a6:10:120::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6863.31 via Frontend Transport; Tue, 10 Oct 2023 14:04:57 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by DBAEUR03FT063.mail.protection.outlook.com (100.127.142.255) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6886.23 via Frontend Transport; Tue, 10 Oct 2023 14:04:57 +0000 Received: from AZ-NEU-EX04.Arm.com (10.251.24.32) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.32; Tue, 10 Oct 2023 14:04:56 +0000 Received: from e127754.arm.com (10.57.5.240) by mail.arm.com (10.251.24.32) with Microsoft SMTP Server id 15.1.2507.32 via Frontend Transport; Tue, 10 Oct 2023 14:04:56 +0000 From: To: CC: , Subject: [PATCH 1/3] [GCC] arm: vst1q_types_x2 ACLE intrinsics Date: Tue, 10 Oct 2023 15:04:43 +0100 Message-ID: <20231010140445.2084-2-Ezra.Sitorus@arm.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20231010140445.2084-1-Ezra.Sitorus@arm.com> References: <20231010140445.2084-1-Ezra.Sitorus@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: DBAEUR03FT063:EE_|GV2PR08MB9399:EE_|DBAEUR03FT056:EE_|AM8PR08MB6372:EE_ X-MS-Office365-Filtering-Correlation-Id: 5ac90ca8-e1ff-43b8-c6b1-08dbc999e958 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: NFGQ6j6bgQ2CYIkGSVqJ4py9ruM4Bblw84BRSuK12zdKBsx2J2tjAfavF63YqFauYvGp+h2A6VnZBhnI8G9CobwuRGru9Z3M+PG3CEMm2Ndwzka++cUioRtUaOGLmkJLRqpt7l367dVeNUe6kj1KDxwtYB1OSHhltm6k87hRx4MwIb/JMYO6HAGw/HmrEEU5sD2yR95BwK5BwTOn9jQo5Y632JFDEv9FHD0pj0AjrPRkyh/IOGw7Wes2HWoq/RsWs1Xn4+CLShgy42k5MvG3xcC7Ibwy6jPqXa5XrZ5FwveAhGzpLbxdXzrNxGRFeFmSgjcU905WHuNVN4ECBN+6tIs+XkmbiYYgQPlBILOjGhuenLV8PcM+yZyYSVAp7KZv73zfAhd4NzSxYPME0SCOH5zKTAgvrP2B79/jb+X4EzTBJSEoyJ1QBOB3rJmmL9xwiyM7bW4oYml9X3awi4NoobG6hOyvBkPXS5SSvuPG/u8r6Ubb87NwR7op/wxC+quY1cu7OgAmvMB2kc5W1CUlpox6vAHcjIhF3tUjtxWUBtTAv6hNkYmr7c9PAIHBrlAtYQH0F0PtjsFq5QJFMTE3qnNF1H5dIOcxh/L9knWvS5zMiAQMdr9GxllVYpFbCF5vFj+3hEofmI+W+yMNr7YZFlvOCmIATO14j6KeliXYAHguX4bO9U3AcdTq8F70/A6dCCzZU5fddw+f7oyYDbLPJ3GfgMk3uSS15VtignVcbYcBba1LGoLcB7p56knxNg+BFdWdFsCGevPvOs1WwoDNTPkppVkGuWn/bS/vSNYeAqivMQwJ3O40hL71ZK28lGk+D3CasWI35ei4lHnw9m2d3A== X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234;CTRY:IE;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:nebula.arm.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230031)(4636009)(396003)(39860400002)(346002)(376002)(136003)(230922051799003)(64100799003)(451199024)(186009)(1800799009)(82310400011)(36840700001)(40470700004)(46966006)(84970400001)(1076003)(40460700003)(356005)(36756003)(40480700001)(86362001)(82740400003)(81166007)(36860700001)(336012)(26005)(47076005)(426003)(2906002)(83380400001)(2876002)(30864003)(966005)(2616005)(6666004)(7696005)(478600001)(8676002)(4326008)(6916009)(8936002)(41300700001)(5660300002)(70586007)(70206006)(54906003)(316002)(36900700001)(357404004);DIR:OUT;SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV2PR08MB9399 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT056.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: a86b876d-644f-407c-50cf-08dbc999e2e8 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: O4yA74jgB4chKm4tsfcZUU+vmMQUrf9ed9IyWCu9K7I5atc1lqRb2jFEQXVgcEL1nQqfMTde++AXhKh1LfkV6ANl1AEWsZEVW/ERoFpIJGv2u7ckS8+RMLk1+U/ts31m/n1Tws10p6WLdzBSkH41ccPnTsG6MRoXkbic1TjeavF+HUCu57Tt/YjoFiL1iDuv3vfMIqOCyjgX5amG2A+sgPlsS8W3msKQa0iQ7N/FtZkBdb0SfrDLSPZKAwYCGDs6xXVQvQpV06z1UXyADCuuYGN24ZlbvCRV/3YIeB9QsJgPyUDCQXPi/sl8jk4AiRwBDAGFIEmnN3VK7Pe3LsKsBxu5JJ/VMu0i+w+mT9jv006Fz8fLWzWdS3lOkLQsyD036mRdog9jHvUaOhpmh3X9O/yaaAmMCvxKba62pur7wgJcsfondeCBqphJWRdFl5Rsv3gWXRUivd3l+BVPWcHfDq3794QB53y3AYPhiGZhMoE5IgMyiiWzpHqt4rzofFyK+uEhJF8qRBYRz3nHSZYgOWu84kysASEz+hjm6IVOgO1Hs9VUsFog2np6m61iGd5/FvxhUfIt0J8TUpkTDfZlTvQ3zJDrR1U548RgAjFdBIs0TPSTpEBusqWaklJ3/W/rzDDw6acwnl7ESKuff655y5N3zvafED5dSn9XhlKTJuNn6AmvjC8b+OhJGyDJte7UIkzj7Mr5ZK0LcIVGV6vRyjpCLphi3C2NUOaDoL6z6d2s0HuKLXmsK2CZR9PBY7aBRruTIvjCdoBGKEHA+ZMBUjLL3bUKWSIwir5uLfgTu4I= X-Forefront-Antispam-Report: CIP:63.35.35.123;CTRY:IE;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:64aa7808-outbound-1.mta.getcheckrecipient.com;PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com;CAT:NONE;SFS:(13230031)(4636009)(376002)(346002)(136003)(39860400002)(396003)(230922051799003)(186009)(1800799009)(82310400011)(64100799003)(451199024)(36840700001)(46966006)(40470700004)(84970400001)(40460700003)(36756003)(30864003)(86362001)(2876002)(2906002)(5660300002)(4326008)(8676002)(8936002)(41300700001)(82740400003)(54906003)(81166007)(70206006)(6916009)(426003)(336012)(7696005)(36860700001)(40480700001)(316002)(70586007)(47076005)(6666004)(83380400001)(478600001)(26005)(2616005)(966005)(1076003)(357404004);DIR:OUT;SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 10 Oct 2023 14:05:08.4989 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 5ac90ca8-e1ff-43b8-c6b1-08dbc999e958 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d;Ip=[63.35.35.123];Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT056.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM8PR08MB6372 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,FORGED_SPF_HELO,GIT_PATCH_0,KAM_DMARC_NONE,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE,TXREP,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1q intrinsic for AArch32. This patch adds the _x2 variants of the vst1q intrinsic. Tests use xN so that the latter variants (_x3, _x4) could be added. ACLE documents are at https://developer.arm.com/documentation/ihi0053/latest/ ISA documents are at https://developer.arm.com/documentation/ddi0487/latest/ gcc/ChangeLog: * config/arm/arm_neon.h (vst1q_u8_x2, vst1q_u16_x2, vst1q_u32_x2, vst1q_u64_x32): New. (vst1q_s8_x2, vst1q_s16_x2, vst1q_s32_x2, vst1q_s64_x2): New. (vst1q_f16_x2, vst1q_f32_x2): New. (vst1q_p8_x2, vst1q_p16_x2, vst1q_p64_x2): New. (vst1q_bf16_x2): New. * config/arm/arm_neon_builtins.def (vst1<_x2): New entries. * config/arm/neon.md (neon_vst1_x2): Updated from neon_vst1_x2. * config/arm/iterators.md (VMEMX2): New mode iterator. (VMEMX2_q): New mode attribute. gcc/testsuite/ChangeLog: * gcc.target/arm/simd/vst1q_base_xN_1.c: Add new tests. * gcc.target/arm/simd/vst1q_bf16_xN_1.c: Add new tests. * gcc.target/arm/simd/vst1q_fp16_xN_1.c: Add new tests. * gcc.target/arm/simd/vst1q_p64_xN_1.c: Add new tests. --- gcc/config/arm/arm_neon.h | 114 ++++++++++++++++++ gcc/config/arm/arm_neon_builtins.def | 1 + gcc/config/arm/iterators.md | 6 + gcc/config/arm/neon.md | 6 +- .../gcc.target/arm/simd/vst1q_base_xN_1.c | 70 +++++++++++ .../gcc.target/arm/simd/vst1q_bf16_xN_1.c | 13 ++ .../gcc.target/arm/simd/vst1q_fp16_xN_1.c | 13 ++ .../gcc.target/arm/simd/vst1q_p64_xN_1.c | 13 ++ 8 files changed, 233 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_base_xN_1.c create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_bf16_xN_1.c create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_fp16_xN_1.c create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1q_p64_xN_1.c diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h index 41e645d8352..b8f3fca3060 100644 --- a/gcc/config/arm/arm_neon.h +++ b/gcc/config/arm/arm_neon.h @@ -11327,6 +11327,38 @@ vst1_s64_x2 (int64_t * __a, int64x1x2_t __b) __builtin_neon_vst1_x2di ((__builtin_neon_di *) __a, __bu.__o); } +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_s8_x2 (int8_t * __a, int8x16x2_t __b) +{ + union { int8x16x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v16qi ((__builtin_neon_qi *) __a, __bu.__o); +} + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_s16_x2 (int16_t * __a, int16x8x2_t __b) +{ + union { int16x8x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v8hi ((__builtin_neon_hi *) __a, __bu.__o); +} + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_s32_x2 (int32_t * __a, int32x4x2_t __b) +{ + union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v4si ((__builtin_neon_si *) __a, __bu.__o); +} + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_s64_x2 (int64_t * __a, int64x2x2_t __b) +{ + union { int64x2x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v2di ((__builtin_neon_di *) __a, __bu.__o); +} + __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vst1_s8_x3 (int8_t * __a, int8x8x3_t __b) @@ -11656,6 +11688,14 @@ vst1q_p64 (poly64_t * __a, poly64x2_t __b) __builtin_neon_vst1v2di ((__builtin_neon_di *) __a, (int64x2_t) __b); } +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_p64_x2 (poly64_t * __a, poly64x2x2_t __b) +{ + union { poly64x2x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v2di ((__builtin_neon_di *) __a, __bu.__o); +} + #pragma GCC pop_options __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) @@ -11701,6 +11741,24 @@ vst1q_f32 (float32_t * __a, float32x4_t __b) __builtin_neon_vst1v4sf ((__builtin_neon_sf *) __a, __b); } +#if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE) +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_f16_x2 (float16_t * __a, float16x8x2_t __b) +{ + union { float16x8x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v8hf (__a, __bu.__o); +} +#endif + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_f32_x2 (float32_t * __a, float32x4x2_t __b) +{ + union { float32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v4sf (__a, __bu.__o); +} + __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vst1q_u8 (uint8_t * __a, uint8x16_t __b) @@ -11729,6 +11787,38 @@ vst1q_u64 (uint64_t * __a, uint64x2_t __b) __builtin_neon_vst1v2di ((__builtin_neon_di *) __a, (int64x2_t) __b); } +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_u8_x2 (uint8_t * __a, uint8x16x2_t __b) +{ + union { uint8x16x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v16qi ((__builtin_neon_qi *) __a, __bu.__o); +} + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_u16_x2 (uint16_t * __a, uint16x8x2_t __b) +{ + union { uint16x8x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v8hi ((__builtin_neon_hi *) __a, __bu.__o); +} + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_u32_x2 (uint32_t * __a, uint32x4x2_t __b) +{ + union { uint32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v4si ((__builtin_neon_si *) __a, __bu.__o); +} + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_u64_x2 (uint64_t * __a, uint64x2x2_t __b) +{ + union { uint64x2x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v2di ((__builtin_neon_di *) __a, __bu.__o); +} + __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vst1q_p8 (poly8_t * __a, poly8x16_t __b) @@ -11743,6 +11833,22 @@ vst1q_p16 (poly16_t * __a, poly16x8_t __b) __builtin_neon_vst1v8hi ((__builtin_neon_hi *) __a, (int16x8_t) __b); } +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_p8_x2 (poly8_t * __a, poly8x16x2_t __b) +{ + union { poly8x16x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v16qi ((__builtin_neon_qi *) __a, __bu.__o); +} + +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_p16_x2 (poly16_t * __a, poly16x8x2_t __b) +{ + union { poly16x8x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v8hi ((__builtin_neon_hi *) __a, __bu.__o); +} + __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vst1_lane_s8 (int8_t * __a, int8x8_t __b, const int __c) @@ -20419,6 +20525,14 @@ vst1q_bf16 (bfloat16_t * __a, bfloat16x8_t __b) __builtin_neon_vst1v8bf (__a, __b); } +__extension__ extern __inline void +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vst1q_bf16_x2 (bfloat16_t * __a, bfloat16x8x2_t __b) +{ + union { bfloat16x8x2_t __i; __builtin_neon_oi __o; } __bu = { __b }; + __builtin_neon_vst1q_x2v8bf (__a, __bu.__o); +} + __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vst2_bf16 (bfloat16_t * __ptr, bfloat16x4x2_t __val) diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def index 95300cb0fe4..496d267fab8 100644 --- a/gcc/config/arm/arm_neon_builtins.def +++ b/gcc/config/arm/arm_neon_builtins.def @@ -309,6 +309,7 @@ VAR12 (LOAD1LANE, vld1_lane, VAR10 (LOAD1, vld1_dup, v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di) VAR7 (STORE1, vst1_x2, v8qi, v4hi, v2si, di, v4hf, v2sf, v4bf) +VAR7 (STORE1, vst1q_x2, v16qi, v8hi, v4si, v2di, v8hf, v4sf, v8bf) VAR7 (STORE1, vst1_x3, v8qi, v4hi, v2si, di, v4hf, v2sf, v4bf) VAR7 (STORE1, vst1_x4, v8qi, v4hi, v2si, di, v4hf, v2sf, v4bf) VAR14 (STORE1, vst1, diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index a9803538101..6c5a80d9348 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -141,6 +141,9 @@ ;; Opaque structure types used in table lookups (except vtbl1/vtbx1). (define_mode_iterator VTAB [TI EI OI]) +;; Opaque structure types for x2 variants of VSTR1/VSTR1Q or VLD1/VLD1Q. +(define_mode_iterator VMEMX2 [TI OI]) + ;; Widenable modes. (define_mode_iterator VW [V8QI V4HI V2SI]) @@ -1533,6 +1536,9 @@ ;; vtbl suffix for NEON vector modes. (define_mode_attr VTAB_n [(TI "2") (EI "3") (OI "4")]) +;; Suffix for x2 variants of vld1 and vst1. +(define_mode_attr VMEMX2_q [(TI "") (OI "q")]) + ;; fp16 or bf16 marker for 16-bit float modes. (define_mode_attr fporbf [(HF "fp16") (BF "bf16")]) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index f5d583129fa..088277ee6ed 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -5125,9 +5125,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST1))] "TARGET_NEON") -(define_insn "neon_vst1_x2" - [(set (match_operand:TI 0 "neon_struct_operand" "=Um") - (unspec:TI [(match_operand:TI 1 "s_register_operand" "w") +(define_insn "neon_vst1_x2" + [(set (match_operand:VMEMX2 0 "neon_struct_operand" "=Um") + (unspec:VMEMX2 [(match_operand:VMEMX2 1 "s_register_operand" "w") (unspec:VDQX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] UNSPEC_VST1))] "TARGET_NEON" diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1q_base_xN_1.c b/gcc/testsuite/gcc.target/arm/simd/vst1q_base_xN_1.c new file mode 100644 index 00000000000..232feafade0 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vst1q_base_xN_1.c @@ -0,0 +1,70 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-save-temps -O2" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" + + +void test_vst1q_u8_x2 (uint8_t * ptr, uint8x16x2_t val) +{ + vst1q_u8_x2 (ptr, val); +} + +void test_vst1q_u16_x2 (uint16_t * ptr, uint16x8x2_t val) +{ + vst1q_u16_x2 (ptr, val); +} + +void test_vst1q_u32_x2 (uint32_t * ptr, uint32x4x2_t val) +{ + vst1q_u32_x2 (ptr, val); +} + +void test_vst1q_u64_x2 (uint64_t * ptr, uint64x2x2_t val) +{ + vst1q_u64_x2 (ptr, val); +} + +void test_vst1q_s8_x2 (int8_t * ptr, int8x16x2_t val) +{ + vst1q_s8_x2 (ptr, val); +} + +void test_vst1q_s16_x2 (int16_t * ptr, int16x8x2_t val) +{ + vst1q_s16_x2 (ptr, val); +} + +void test_vst1q_s32_x2 (int32_t * ptr, int32x4x2_t val) +{ + vst1q_s32_x2 (ptr, val); +} + +void test_vst1q_s64_x2 (int64_t * ptr, int64x2x2_t val) +{ + vst1q_s64_x2 (ptr, val); +} + +void test_vst1q_f32_x2 (float32_t * ptr, float32x4x2_t val) +{ + vst1q_f32_x2 (ptr, val); +} + +void test_vst1q_p8_x2 (poly8_t * ptr, poly8x16x2_t val) +{ + vst1q_p8_x2 (ptr, val); +} + +void test_vst1q_p16_x2 (poly16_t * ptr, poly16x8x2_t val) +{ + vst1q_p16_x2 (ptr, val); +} + +/* { dg-final { scan-assembler-times {vst1.8\t\{d[0-9]+-d[0-9]+\}, \[r[0-9]+\]\n} 3 } } */ + +/* { dg-final { scan-assembler-times {vst1.16\t\{d[0-9]+-d[0-9]+\}, \[r[0-9]+\]\n} 3 } } */ + +/* { dg-final { scan-assembler-times {vst1.32\t\{d[0-9]+-d[0-9]+\}, \[r[0-9]+\]\n} 3 } } */ + +/* { dg-final { scan-assembler-times {vst1.64\t\{d[0-9]+-d[0-9]+\}, \[r[0-9]+:64\]\n} 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1q_bf16_xN_1.c b/gcc/testsuite/gcc.target/arm/simd/vst1q_bf16_xN_1.c new file mode 100644 index 00000000000..2a4579f0aae --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vst1q_bf16_xN_1.c @@ -0,0 +1,13 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */ +/* { dg-options "-save-temps -O2" } */ +/* { dg-add-options arm_v8_2a_bf16_neon } */ + +#include "arm_neon.h" + +void test_vst1q_bf16_x2 (bfloat16_t * ptr, bfloat16x8x2_t val) +{ + vst1q_bf16_x2 (ptr, val); +} + +/* { dg-final { scan-assembler-times {vst1.16\t\{d[0-9]+-d[0-9]+\}, \[r[0-9]+\]\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1q_fp16_xN_1.c b/gcc/testsuite/gcc.target/arm/simd/vst1q_fp16_xN_1.c new file mode 100644 index 00000000000..61a7e558c48 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vst1q_fp16_xN_1.c @@ -0,0 +1,13 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target arm_neon_fp16_ok } */ +/* { dg-options "-save-temps -O2" } */ +/* { dg-add-options arm_neon_fp16 } */ + +#include "arm_neon.h" + +void test_vst1q_f16_x2 (float16_t * ptr, float16x8x2_t val) +{ + vst1q_f16_x2 (ptr, val); +} + +/* { dg-final { scan-assembler-times {vst1.16\t\{d[0-9]+-d[0-9]+\}, \[r[0-9]+\]\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1q_p64_xN_1.c b/gcc/testsuite/gcc.target/arm/simd/vst1q_p64_xN_1.c new file mode 100644 index 00000000000..82f3dad293c --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vst1q_p64_xN_1.c @@ -0,0 +1,13 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target arm_crypto_ok } */ +/* { dg-options "-save-temps -O2" } */ +/* { dg-add-options arm_crypto } */ + +#include "arm_neon.h" + +void test_vst1q_p64_x2 (poly64_t * ptr, poly64x2x2_t val) +{ + vst1q_p64_x2 (ptr, val); +} + +/* { dg-final { scan-assembler-times {vst1.64\t\{d[0-9]+-d[0-9]+\}, \[r[0-9]+:64\]\n} 1 } } */ -- 2.25.1