Message-ID: <6565ff59-1a71-bdca-83f3-1f8b4f8f2e35@arm.com>
Date: Thu, 9 Feb 2023 12:43:17 +0000
Subject: Re: [PATCH] [RFC] Proposal for implementing AArch64 port of libmvec
To: Adhemerval Zanella Netto, libc-alpha@sourceware.org, Szabolcs Nagy
References: <20230207113555.66008-1-Joe.Ramsay@arm.com>
From: Joe Ramsay

Thanks for the comments. I will attempt a patch that addresses them; in
the meantime, just a few questions:

On 08/02/2023 13:11, Adhemerval Zanella Netto wrote:
>
>
> On 07/02/23 08:35, Joe Ramsay via Libc-alpha wrote:
>> Hi,
>>
>> The attached patch is an attempt to enable libmvec on AArch64. The
>> proposed change is mainly implementing build infrastructure to add the
>> new routines to ABI, tests and benchmarks. I have demonstrated how
>> this all fits together by adding implementations for vector cos, in
>> both single and double precision, targeting both Advanced SIMD and
>> SVE.
>>
>> The implementations of the routines themselves are just loops over the
>> scalar routine from libm for now, as we are more concerned with
>> getting the plumbing right at this point. We plan to contribute vector
>> routines from the Arm Optimized Routines repo that are compliant with
>> requirements described in the libmvec wiki.
>>
>> Any comments/thoughts much appreciated! In particular, the patch
>> raises the minimum GCC to 10, in order to be able to submit routines
>> written using ACLE instead of assembly.
>> This is clearly a big jump,
>> but we have options if this is not acceptable. One option would be to
>> submit compiler-generated assembly, similar to the equivalent routines
>> under sysdeps/x86_64. If GCC 9 is an acceptable compromise then this
>> would only have to be for SVE routines.
>
> Using a C implementation with intrinsics would be ideal; these are more easily
> maintained and can leverage compiler improvements. I would rather do that than
> the assembly dump Intel did.
>
> The minimum GCC 10 is not ideal, however I don't see it as a blocker either
> (it should be up to arch-maintainers). One option might be to check whether the
> compiler supports building libmvec and, if it does not, disable the build and
> related checks. It is not ideal either, since the resulting glibc won't have
> a complete ABI.
>
>

OK, let's see what arch-maintainers make of it.

>>
>> Also, are there plans to merge libmvec into libm, or will they be kept
>> separate?
>
> There is none afaik. The libpthread, librt, etc. merge was done to
> fix long-standing design and maintenance issues that are not really present
> with libm and libmvec. There is still the partial upgrade one, but
> it is still present with a disjoint ld, libc, libm anyway.
>
> However, it is feasible to merge if you're willing to work on it. We will
> need to keep the x86_64 lib with the sentinel compat symbol (similar to
> what we did for libpthread).
>
> What I would like to avoid is to have different architectures using different
> approaches, for instance aarch64 being merged while having x86_64 still
> using a different library. It adds slightly more complexity to the build
> process and extra arch-specific boilerplate code.
>

This sounds good - keeping them separate would be our choice too. I have
not come across the sentinel compat symbol - is this something we need
to do for AArch64 also?
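
On the suggestion above of disabling libmvec when the compiler cannot build it: a very rough sketch of what such a probe could test is below. This is only an illustration, not what the patch does. The macro HAVE_ARM_SVE_H and the function can_build_sve_acle are hypothetical names, and a real configure check would more likely try to compile a small file that includes arm_sve.h rather than rely on __has_include.

```c
/* Hypothetical probe, not part of the patch: report whether this compiler
   ships the SVE ACLE header that the proposed libmvec sources depend on.  */
#if defined __has_include
# if __has_include (<arm_sve.h>)
#  define HAVE_ARM_SVE_H 1
# endif
#endif
#ifndef HAVE_ARM_SVE_H
# define HAVE_ARM_SVE_H 0
#endif

/* Returns 1 when the header is available, so the ACLE routines could be
   built, and 0 when the libmvec build would have to be disabled.  */
static int
can_build_sve_acle (void)
{
  return HAVE_ARM_SVE_H;
}
```

On a compiler without arm_sve.h the function returns 0, which is the case where the build and the related checks would be switched off.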
>>
>> Note that at this point users have to manually call the vector math
>> functions, there is no declaration in math.h to assist auto
>> vectorization of scalar math calls. This seems to be acceptable to
>> some downstream users.
>
> I think that's the current approach for x86_64 anyway, since most usages
> are done through compiler autovectorization code.
>
> Some comments below.
>
>>
>> Thanks,
>> Joe
>> ---
>>  INSTALL | 3 +
>>  manual/install.texi | 3 +
>>  sysdeps/aarch64/configure | 28 ++++++
>>  sysdeps/aarch64/configure.ac | 20 ++++
>>  sysdeps/aarch64/fpu/Makefile | 66 +++++++++++++
>>  sysdeps/aarch64/fpu/Versions | 8 ++
>>  sysdeps/aarch64/fpu/advsimd_utils.h | 39 ++++++++
>>  sysdeps/aarch64/fpu/bench-libmvec-skeleton.c | 83 +++++++++++++++++
>>  sysdeps/aarch64/fpu/bits/math-vector.h | 65 +++++++++++++
>>  sysdeps/aarch64/fpu/cos_advsimd.c | 28 ++++++
>>  sysdeps/aarch64/fpu/cos_sve.c | 27 ++++++
>>  sysdeps/aarch64/fpu/cosf_advsimd.c | 28 ++++++
>>  sysdeps/aarch64/fpu/cosf_sve.c | 27 ++++++
>>  sysdeps/aarch64/fpu/libm-test-ulps | 7 ++
>>  sysdeps/aarch64/fpu/libm-test-ulps-name | 1 +
>>  sysdeps/aarch64/fpu/math-tests-arch.h | 34 +++++++
>>  .../fpu/scripts/bench_libmvec_advsimd.py | 91 ++++++++++++++++++
>>  .../aarch64/fpu/scripts/bench_libmvec_sve.py | 93 +++++++++++++++++++
>>  sysdeps/aarch64/fpu/sve_utils.h | 55 +++++++++++
>>  .../fpu/test-double-advsimd-wrappers.c | 26 ++++++
>>  sysdeps/aarch64/fpu/test-double-advsimd.h | 25 +++++
>>  .../aarch64/fpu/test-double-sve-wrappers.c | 34 +++++++
>>  sysdeps/aarch64/fpu/test-double-sve.h | 26 ++++++
>>  .../aarch64/fpu/test-float-advsimd-wrappers.c | 26 ++++++
>>  sysdeps/aarch64/fpu/test-float-advsimd.h | 25 +++++
>>  sysdeps/aarch64/fpu/test-float-sve-wrappers.c | 34 +++++++
>>  sysdeps/aarch64/fpu/test-float-sve.h | 26 ++++++
>>  .../aarch64/fpu/test-vpcs-vector-wrapper.h | 30 ++++++
>>  .../unix/sysv/linux/aarch64/libmvec.abilist | 4 +
>>  29 files changed, 962 insertions(+)
>>  create mode 100644
sysdeps/aarch64/fpu/Makefile
>>  create mode 100644 sysdeps/aarch64/fpu/Versions
>>  create mode 100644 sysdeps/aarch64/fpu/advsimd_utils.h
>>  create mode 100644 sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>>  create mode 100644 sysdeps/aarch64/fpu/bits/math-vector.h
>>  create mode 100644 sysdeps/aarch64/fpu/cos_advsimd.c
>>  create mode 100644 sysdeps/aarch64/fpu/cos_sve.c
>>  create mode 100644 sysdeps/aarch64/fpu/cosf_advsimd.c
>>  create mode 100644 sysdeps/aarch64/fpu/cosf_sve.c
>>  create mode 100644 sysdeps/aarch64/fpu/libm-test-ulps
>>  create mode 100644 sysdeps/aarch64/fpu/libm-test-ulps-name
>>  create mode 100644 sysdeps/aarch64/fpu/math-tests-arch.h
>>  create mode 100644 sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>>  create mode 100755 sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>>  create mode 100644 sysdeps/aarch64/fpu/sve_utils.h
>>  create mode 100644 sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>>  create mode 100644 sysdeps/aarch64/fpu/test-double-advsimd.h
>>  create mode 100644 sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>>  create mode 100644 sysdeps/aarch64/fpu/test-double-sve.h
>>  create mode 100644 sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>>  create mode 100644 sysdeps/aarch64/fpu/test-float-advsimd.h
>>  create mode 100644 sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>>  create mode 100644 sysdeps/aarch64/fpu/test-float-sve.h
>>  create mode 100644 sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>>  create mode 100644 sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>>
>> diff --git a/INSTALL b/INSTALL
>> index 970d6627e2..ba800e41d6 100644
>> --- a/INSTALL
>> +++ b/INSTALL
>> @@ -524,6 +524,9 @@ build the GNU C Library:
>>       For s390x architecture builds, GCC 7.1 or higher is needed (See gcc
>>       Bug 98269).
>>
>> +     For AArch64 architecture builds with mathvec enabled, GCC 10 or
>> +     higher is needed due to dependency on arm_sve.h.
>> +
>>       For multi-arch support it is recommended to use a GCC which has
>>       been built with support for GNU indirect functions. This ensures
>>       that correct debugging information is generated for functions
>> diff --git a/manual/install.texi b/manual/install.texi
>> index 260f8a5c82..e9c62b51ae 100644
>> --- a/manual/install.texi
>> +++ b/manual/install.texi
>> @@ -567,6 +567,9 @@ For ARC architecture builds, GCC 8.3 or higher is needed.
>>
>>  For s390x architecture builds, GCC 7.1 or higher is needed (See gcc Bug 98269).
>>
>> +For AArch64 architecture builds with mathvec enabled, GCC 10 or higher is needed
>> +due to dependency on arm_sve.h.
>> +
>>  For multi-arch support it is recommended to use a GCC which has been built with
>>  support for GNU indirect functions. This ensures that correct debugging
>>  information is generated for functions selected by IFUNC resolvers. This
>> diff --git a/sysdeps/aarch64/configure b/sysdeps/aarch64/configure
>> index 2130f6b8f8..a71c32d70f 100644
>> --- a/sysdeps/aarch64/configure
>> +++ b/sysdeps/aarch64/configure
>> @@ -327,3 +327,31 @@ if test $libc_cv_aarch64_sve_asm = yes; then
>>    $as_echo "#define HAVE_AARCH64_SVE_ASM 1" >>confdefs.h
>>
>>  fi
>> +
>> +# Check if the local system can run SVE binary
>> +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for local SVE hardware" >&5
>> +$as_echo_n "checking for local SVE hardware... " >&6; }
>> +if ${libc_cv_can_run_sve+:} false; then :
>> +  $as_echo_n "(cached) " >&6
>> +else
>> +  cat > conftest.c <<EOF
>> +#include
>> +#include
>> +int main(void) {
>> +  if (!
(getauxval (AT_HWCAP) & HWCAP_SVE))
>> +    return 1;
>> +  return 0;
>> +}
>> +EOF
>> +  libc_cv_can_run_sve=yes
>> +  ${CC-cc} conftest.c -o conftest
>> +  ./conftest || libc_cv_can_run_sve=no
>> +  rm -f conftest*
>> +fi
>> +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_can_run_sve" >&5
>> +$as_echo "$libc_cv_can_run_sve" >&6; }
>> +config_vars="$config_vars
>> +aarch64-can-run-sve = $libc_cv_can_run_sve"
>> +
>> +if test x"$build_mathvec" = xnotset; then
>> +  build_mathvec=yes
>> +fi
>> diff --git a/sysdeps/aarch64/configure.ac b/sysdeps/aarch64/configure.ac
>> index 85c6f76508..688f8772a6 100644
>> --- a/sysdeps/aarch64/configure.ac
>> +++ b/sysdeps/aarch64/configure.ac
>> @@ -101,3 +101,23 @@ rm -f conftest*])
>>  if test $libc_cv_aarch64_sve_asm = yes; then
>>    AC_DEFINE(HAVE_AARCH64_SVE_ASM)
>>  fi
>> +
>> +# Check if the local system can run SVE binary
>> +AC_CACHE_CHECK(for local SVE hardware, libc_cv_can_run_sve, [dnl
>> +  cat > conftest.c <<EOF
>> +#include
>> +#include
>> +int main(void) {
>> +  if (!
(getauxval (AT_HWCAP) & HWCAP_SVE))
>> +    return 1;
>> +  return 0;
>> +}
>> +EOF
>> +  libc_cv_can_run_sve=yes
>> +  ${CC-cc} conftest.c -o conftest
>> +  ./conftest || libc_cv_can_run_sve=no
>> +  rm -f conftest*])
>> +LIBC_CONFIG_VAR([aarch64-can-run-sve], [$libc_cv_can_run_sve])
>> +
>> +if test x"$build_mathvec" = xnotset; then
>> +  build_mathvec=yes
>> +fi
>> diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
>> new file mode 100644
>> index 0000000000..caf5d60669
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/Makefile
>> @@ -0,0 +1,66 @@
>> +float-advsimd-funcs = cos
>> +
>> +double-advsimd-funcs = cos
>> +
>> +float-sve-funcs = cos
>> +
>> +double-sve-funcs = cos
>> +
>> +ifeq ($(subdir),mathvec)
>> +libmvec-support = $(addsuffix f_advsimd,$(float-advsimd-funcs)) \
>> +                  $(addsuffix _advsimd,$(double-advsimd-funcs)) \
>> +                  $(addsuffix f_sve,$(float-sve-funcs)) \
>> +                  $(addsuffix _sve,$(double-sve-funcs))
>> +endif
>> +
>> +sve-cflags = -march=armv8-a+sve
>> +
>> +
>> +ifeq ($(build-mathvec),yes)
>> +bench-libmvec = $(addprefix float-advsimd-,$(float-advsimd-funcs)) \
>> +                $(addprefix double-advsimd-,$(double-advsimd-funcs))
>> +
>> +# If not on an SVE-enabled machine, do not add SVE routines to benchmarks.
>> +# The routines are still built.
>> +ifeq ($(aarch64-can-run-sve),yes)
>> +  bench-libmvec += $(addprefix float-sve-,$(float-sve-funcs)) \
>> +                   $(addprefix double-sve-,$(double-sve-funcs))
>> +endif
>> +endif
>> +
>> +$(objpfx)bench-float-advsimd-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py $(basename $(@F)) > $@
>> +$(objpfx)bench-double-advsimd-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py $(basename $(@F)) > $@
>> +$(objpfx)bench-float-sve-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py $(basename $(@F)) > $@
>> +$(objpfx)bench-double-sve-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py $(basename $(@F)) > $@
>> +
>> +ifeq (${STATIC-BENCHTESTS},yes)
>> +libmvec-benchtests = $(common-objpfx)mathvec/libmvec.a $(common-objpfx)math/libm.a
>> +else
>> +libmvec-benchtests = $(libmvec) $(libm)
>> +endif
>> +
>> +$(addprefix $(objpfx)bench-,$(bench-libmvec)): $(libmvec-benchtests)
>> +
>> +ifeq ($(build-mathvec),yes)
>> +libmvec-tests += float-advsimd double-advsimd float-sve double-sve
>> +endif
>> +
>> +define sve-float-cflags-template
>> +CFLAGS-$(1)f_sve.c += $(sve-cflags)
>> +CFLAGS-bench-float-sve-$(1).c += $(sve-cflags)
>> +endef
>> +
>> +define sve-double-cflags-template
>> +CFLAGS-$(1)_sve.c += $(sve-cflags)
>> +CFLAGS-bench-double-sve-$(1).c += $(sve-cflags)
>> +endef
>> +
>> +$(foreach f,$(float-sve-funcs), $(eval $(call sve-float-cflags-template,$(f))))
>> +$(foreach f,$(double-sve-funcs), $(eval $(call sve-double-cflags-template,$(f))))
>> +
>> +CFLAGS-test-float-sve-wrappers.c = $(sve-cflags)
>> +CFLAGS-test-double-sve-wrappers.c = $(sve-cflags)
>> diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
>> new file mode 100644
>> index 0000000000..5222a6f180
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/Versions
>> @@ -0,0 +1,8 @@
>> +libmvec {
>> +  GLIBC_2.38 {
>> +    _ZGVnN2v_cos;
>> +    _ZGVnN4v_cosf;
>> +    _ZGVsMxv_cos;
>> +    _ZGVsMxv_cosf;
>> +  }
>> +}
>> diff
--git a/sysdeps/aarch64/fpu/advsimd_utils.h b/sysdeps/aarch64/fpu/advsimd_utils.h
>> new file mode 100644
>> index 0000000000..b597a18b8f
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/advsimd_utils.h
>> @@ -0,0 +1,39 @@
>> +/* Helpers for Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>. */
>> +
>> +#include
>> +
>> +#define VPCS_ATTR __attribute__ ((aarch64_vector_pcs))
>> +
>> +#define V_NAME_F1(fun) _ZGVnN4v_##fun##f
>> +#define V_NAME_D1(fun) _ZGVnN2v_##fun
>> +#define V_NAME_F2(fun) _ZGVnN4vv_##fun##f
>> +#define V_NAME_D2(fun) _ZGVnN2vv_##fun
>> +
>> +static inline float32x4_t
>
> You might consider using __always_inline here if the idea is to use these functions
> as macros.
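
To check I have understood the suggestion, here is a rough sketch of what it would look like, written so that it builds on any GCC or Clang target rather than only AArch64. Everything here is an assumption for illustration: f32x4_t is a generic vector type standing in for float32x4_t, SKETCH_ALWAYS_INLINE is a hypothetical spelling of what glibc's __always_inline provides internally, and square_f32 is a made-up scalar routine standing in for cosf.

```c
/* Generic 4 x float vector standing in for float32x4_t from arm_neon.h;
   an assumption so the sketch is not tied to AArch64.  */
typedef float f32x4_t __attribute__ ((vector_size (16)));

/* Hypothetical stand-in for glibc's internal __always_inline macro.  */
#define SKETCH_ALWAYS_INLINE static inline __attribute__ ((always_inline))

/* Apply the scalar routine F to each lane of X.  Forcing inlining lets
   the compiler see F at every call site, so the helper behaves like a
   macro instead of an out-of-line call through a function pointer.  */
SKETCH_ALWAYS_INLINE f32x4_t
v_call_f32 (float (*f) (float), f32x4_t x)
{
  return (f32x4_t){ f (x[0]), f (x[1]), f (x[2]), f (x[3]) };
}

/* Made-up scalar routine used in place of cosf, to avoid needing libm.  */
static float
square_f32 (float v)
{
  return v * v;
}
```

A caller would then write v_call_f32 (square_f32, x) exactly as the patch writes v_call_f32 (cosf, x); only the attribute on the helper changes.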
>
>> +v_call_f32 (float (*f) (float), float32x4_t x)
>> +{
>> +  return (float32x4_t){f (x[0]), f (x[1]), f (x[2]), f (x[3])};
>> +}
>> +
>> +static inline float64x2_t
>> +v_call_f64 (double (*f) (double), float64x2_t x)
>> +{
>> +  return (float64x2_t){f (x[0]), f (x[1])};
>> +}
>> diff --git a/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c b/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>> new file mode 100644
>> index 0000000000..ca6a10d1fe
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>> @@ -0,0 +1,83 @@
>> +/* Skeleton for libmvec benchmark programs.
>> +   Copyright (C) 2021-2023 Free Software Foundation, Inc.
>
> I think Copyright year is only 2023 here.
>

Carlos flagged this up as well. This file was copied, with some
modifications, from sysdeps/x86_64/fpu/bench-libmvec-skeleton.c - is it
OK for it to get a brand new copyright header despite not being brand
new code?

>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.
*/
>> +
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +
>> +#include
>> +#define D_ITERS 10000
>> +
>> +int
>> +main (int argc, char **argv)
>> +{
>> +  unsigned long i, k;
>> +  timing_t start, end;
>> +  json_ctx_t json_ctx;
>> +
>> +  bench_start ();
>> +
>> +#ifdef BENCH_INIT
>> +  BENCH_INIT ();
>> +#endif
>> +
>> +  json_init (&json_ctx, 2, stdout);
>> +
>> +  /* Begin function. */
>> +  json_attr_object_begin (&json_ctx, FUNCNAME);
>> +
>> +  for (int v = 0; v < NUM_VARIANTS; v++)
>> +    {
>> +      double d_total_time = 0;
>> +      timing_t cur;
>> +      for (k = 0; k < D_ITERS; k++)
>> +        {
>> +          TIMING_NOW (start);
>> +          for (i = 0; i < NUM_SAMPLES (v); i++)
>> +            BENCH_FUNC (v, i);
>> +          TIMING_NOW (end);
>> +
>> +          TIMING_DIFF (cur, start, end);
>> +
>> +          TIMING_ACCUM (d_total_time, cur);
>> +        }
>> +      double d_total_data_set = D_ITERS * NUM_SAMPLES (v) * STRIDE;
>> +
>> +      /* Begin variant. */
>> +      json_attr_object_begin (&json_ctx, VARIANT (v));
>> +
>> +      json_attr_double (&json_ctx, "duration", d_total_time);
>> +      json_attr_double (&json_ctx, "iterations", d_total_data_set);
>> +      json_attr_double (&json_ctx, "mean", d_total_time / d_total_data_set);
>> +
>> +      /* End variant. */
>> +      json_attr_object_end (&json_ctx);
>> +    }
>> +
>> +  /* End function. */
>> +  json_attr_object_end (&json_ctx);
>> +
>> +  return 0;
>> +}
>
> This file is quite similar to the x86_64 one, modulo the extra CPU_FEATURE_ACTIVE checks.
> Maybe try to refactor to use a common definition and parametrize the x86 code
> on arch-specific code?
>
>> diff --git a/sysdeps/aarch64/fpu/bits/math-vector.h b/sysdeps/aarch64/fpu/bits/math-vector.h
>> new file mode 100644
>> index 0000000000..a25845bff8
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/bits/math-vector.h
>> @@ -0,0 +1,65 @@
>> +/* Platform-specific SIMD declarations of math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>. */
>> +
>> +#ifndef _MATH_H
>> +# error "Never include <bits/math-vector.h> directly;\
>> + include <math.h> instead."
>> +#endif
>> +
>> +/* Get default empty definitions for simd declarations. */
>> +#include
>> +
>> +#if __GNUC_PREREQ (9, 0)
>
> I think these tests should move to configure tests instead; that tells the user
> up front that the compiler needs updating, rather than through a compiler error.
>
> The configure check will then check for both advsimd and SVE support, so there is
> no need for __ADVSIMD_VEC_MATH_SUPPORTED or __SVE_VEC_MATH_SUPPORTED.
>

Apologies, I don't quite understand what you mean by this. I put these
tests in so that users could compile against the new symbols with math.h
as long as they had a sufficiently new compiler, but wouldn't get
undefined types in math.h if they were using an old compiler that didn't
have e.g. __Float32x4_t. (I think I remarked in the original message
that new symbols hadn't been added to math.h, but this was not correct.)
I don't see how this relates to configure, since this isn't for the
benefit of library builders, but maybe I have misunderstood? We could
put a separate check in at configure time for a compiler which is
sufficient for vector types, but that would not IMO make these tests
redundant.
Let me know what you think.

>> +# define __ADVSIMD_VEC_MATH_SUPPORTED
>> +typedef __Float32x4_t __f32x4_t;
>> +typedef __Float64x2_t __f64x2_t;
>> +#elif __clang_major__ >= 8
>> +# define __ADVSIMD_VEC_MATH_SUPPORTED
>> +typedef __attribute__((__neon_vector_type__(4))) float __f32x4_t;
>> +typedef __attribute__((__neon_vector_type__(2))) double __f64x2_t;
>> +#endif
>> +
>> +#if __GNUC_PREREQ (10, 0) || __clang_major__ >= 11
>> +# define __SVE_VEC_MATH_SUPPORTED
>> +typedef __SVFloat32_t __sv_f32_t;
>> +typedef __SVFloat64_t __sv_f64_t;
>> +typedef __SVBool_t __sv_bool_t;
>> +#endif
>> +
>> +/* If vector types and vector PCS are unsupported in the working
>> +   compiler, no choice but to omit vector math declarations. */
>> +
>> +#ifdef __ADVSIMD_VEC_MATH_SUPPORTED
>> +
>> +# define __vpcs __attribute__((__aarch64_vector_pcs__))
>> +
>> +__vpcs __f32x4_t _ZGVnN4v_cosf (__f32x4_t);
>> +__vpcs __f64x2_t _ZGVnN2v_cos (__f64x2_t);
>> +
>> +#undef __ADVSIMD_VEC_MATH_SUPPORTED
>> +#endif /* __ADVSIMD_VEC_MATH_SUPPORTED */
>> +
>> +#ifdef __SVE_VEC_MATH_SUPPORTED
>> +
>> +__sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
>> +__sv_f64_t _ZGVsMxv_cos (__sv_f64_t, __sv_bool_t);
>> +
>> +#undef __SVE_VEC_MATH_SUPPORTED
>> +#endif /* __SVE_VEC_MATH_SUPPORTED */
>> +
>> diff --git a/sysdeps/aarch64/fpu/cos_advsimd.c b/sysdeps/aarch64/fpu/cos_advsimd.c
>> new file mode 100644
>> index 0000000000..5a42fbb182
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cos_advsimd.c
>> @@ -0,0 +1,28 @@
>> +/* Double-precision vector (Advanced SIMD) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "advsimd_utils.h"
>> +
>> +VPCS_ATTR
>> +float64x2_t V_NAME_D1 (cos) (float64x2_t x)
>> +{
>> +  return v_call_f64 (cos, x);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cos_sve.c b/sysdeps/aarch64/fpu/cos_sve.c
>> new file mode 100644
>> index 0000000000..62bd2ece0e
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cos_sve.c
>> @@ -0,0 +1,27 @@
>> +/* Double-precision vector (SVE) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "sve_utils.h"
>> +
>> +svfloat64_t SV_NAME_D1 (cos) (svfloat64_t x, svbool_t pg)
>> +{
>> +  return sv_call_f64 (cos, x, svdup_n_f64 (0), pg);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cosf_advsimd.c b/sysdeps/aarch64/fpu/cosf_advsimd.c
>> new file mode 100644
>> index 0000000000..23f54bd905
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cosf_advsimd.c
>> @@ -0,0 +1,28 @@
>> +/* Single-precision vector (Advanced SIMD) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "advsimd_utils.h"
>> +
>> +VPCS_ATTR
>> +float32x4_t V_NAME_F1 (cos) (float32x4_t x)
>> +{
>> +  return v_call_f32 (cosf, x);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cosf_sve.c b/sysdeps/aarch64/fpu/cosf_sve.c
>> new file mode 100644
>> index 0000000000..0c4e365e1e
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cosf_sve.c
>> @@ -0,0 +1,27 @@
>> +/* Single-precision vector (SVE) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "sve_utils.h"
>> +
>> +svfloat32_t SV_NAME_F1 (cos) (svfloat32_t x, svbool_t pg)
>> +{
>> +  return sv_call_f32 (cosf, x, svdup_n_f32 (0), pg);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/libm-test-ulps b/sysdeps/aarch64/fpu/libm-test-ulps
>> new file mode 100644
>> index 0000000000..b199d7ddab
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/libm-test-ulps
>> @@ -0,0 +1,7 @@
>> +Function: "cos_advsimd":
>> +double: 2
>> +float: 2
>> +
>> +Function: "cos_sve":
>> +double: 2
>> +float: 2
>> \ No newline at end of file
>
> Bogus line feed here.
>
>> diff --git a/sysdeps/aarch64/fpu/libm-test-ulps-name b/sysdeps/aarch64/fpu/libm-test-ulps-name
>> new file mode 100644
>> index 0000000000..1f66c5cda0
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/libm-test-ulps-name
>> @@ -0,0 +1 @@
>> +AArch64
>> diff --git a/sysdeps/aarch64/fpu/math-tests-arch.h b/sysdeps/aarch64/fpu/math-tests-arch.h
>> new file mode 100644
>> index 0000000000..263d4cabf1
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/math-tests-arch.h
>> @@ -0,0 +1,34 @@
>> +/* Runtime architecture check for math tests. AArch64 version.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#ifdef REQUIRE_SVE
>> +# include
>> +
>> +# define INIT_ARCH_EXT
>> +# define CHECK_ARCH_EXT \
>> +  do \
>> +    { \
>> +      if (!(getauxval (AT_HWCAP) & HWCAP_SVE)) return; \
>> +    } \
>> +  while (0)
>> +
>> +#else
>> +# include
>> +#endif
>> +
>
> Spurious new line here.
>
>> diff --git a/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py b/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>> new file mode 100644
>> index 0000000000..9c092670d7
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>> @@ -0,0 +1,91 @@
>> +#!/usr/bin/python3
>> +# Copyright (C) 2023 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +#
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# .
>> +
>> +import sys
>> +
>> +TEMPLATE = """
>> +#include
>> +#include
>> +
>> +#define STRIDE {stride}
>> +
>> +#define CALL_BENCH_FUNC(v, i) (__extension__ ({{ \\
>> +  {rtype} mx0 = {fname}(vld1q_f{prec_short} (variants[v].in[i].arg0)); \\
>> +  mx0; }}))
>> +
>> +struct args
>> +{{
>> +  {stype} arg0[STRIDE];
>> +  double timing;
>> +}};
>> +
>> +struct _variants
>> +{{
>> +  const char *name;
>> +  int count;
>> +  struct args *in;
>> +}};
>> +
>> +struct args in0[{rowcount}] = {{
>> +{in_data}
>> +}};
>> +
>> +struct _variants variants[1] = {{
>> +  {{"", {rowcount}, in0}},
>> +}};
>
> Maybe define them as static const?
>
>> +
>> +#define NUM_VARIANTS 1
>> +#define NUM_SAMPLES(i) (variants[i].count)
>> +#define VARIANT(i) (variants[i].name)
>> +
>> +// Cannot pass volatile pointer to svst1. This still does not appear to get optimised out.
>> +static {rtype} volatile ret;
>> +
>> +#define BENCH_FUNC(i, j) ({{ ret = CALL_BENCH_FUNC(i, j); }})
>> +#define FUNCNAME "{fname}"
>> +#include
>> +"""
>> +
>> +def main(name):
>> +    _, prec, _, func = name.split("-")
>> +    scalar_to_advsimd_type = {"double": "float64x2_t", "float": "float32x4_t"}
>> +
>> +    stride = {"double": 2, "float": 4}[prec]
>> +    rtype = scalar_to_advsimd_type[prec]
>> +    atype = scalar_to_advsimd_type[prec]
>> +    fname = f"_ZGVnN{stride}v_{func}{'f' if prec == 'float' else ''}"
>> +    prec_short = {"double": 64, "float": 32}[prec]
>> +
>> +    with open(f"../benchtests/{func}-inputs") as f:
>> +        in_vals = [l.strip() for l in f.readlines() if l and not l.startswith("#")]
>> +    in_vals = [in_vals[i:i+stride] for i in range(0, len(in_vals), stride)]
>> +    rowcount= len(in_vals)
>> +    in_data = ",\n".join("{{" + ", ".join(row) + "}, 0}" for row in in_vals)
>> +
>> +    print(TEMPLATE.format(stride=stride,
>> +                          rtype=rtype,
>> +                          atype=atype,
>> +                          fname=fname,
>> +                          prec_short=prec_short,
>> +                          in_data=in_data,
>> +                          rowcount=rowcount,
>> +                          stype=prec))
>> +
>> +
>> +if __name__ == "__main__":
>> +    main(sys.argv[1])
>> diff --git a/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py b/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>> new file mode 100755
>> index 0000000000..0ea21c4c69
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>> @@ -0,0 +1,93 @@
>> +#!/usr/bin/python3
>> +# Copyright (C) 2023 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +#
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# .
>> +
>> +import sys
>> +
>> +TEMPLATE = """
>> +#include
>> +#include
>> +
>> +#define STRIDE {stride}
>> +
>> +#define CALL_BENCH_FUNC(v, i) (__extension__ ({{ \\
>> +  {rtype} mx0 = {fname}(svld1rq_f{prec_short} (svptrue_b{prec_short}(), \\
>> +                        variants[v].in[i].arg0), \\
>> +                        svptrue_b{prec_short}()); \\
>> +  mx0; }}))
>> +
>> +struct args
>> +{{
>> +  {stype} arg0[STRIDE];
>> +  double timing;
>> +}};
>> +
>> +struct _variants
>> +{{
>> +  const char *name;
>> +  int count;
>> +  struct args *in;
>> +}};
>> +
>> +struct args in0[{rowcount}] = {{
>> +{in_data}
>> +}};
>> +
>> +struct _variants variants[1] = {{
>> +  {{"", {rowcount}, in0}},
>> +}};
>> +
>> +#define NUM_VARIANTS 1
>> +#define NUM_SAMPLES(i) (variants[i].count)
>> +#define VARIANT(i) (variants[i].name)
>> +
>> +// Cannot pass volatile pointer to svst1. This still does not appear to get optimised out.
>> +static {stype} /*volatile*/ ret[STRIDE];
>> +
>> +#define BENCH_FUNC(i, j) ({{ svst1_f{prec_short}(svwhilelt_b{prec_short}(0, 4), ret, CALL_BENCH_FUNC(i, j)); }})
>> +#define FUNCNAME "{fname}"
>> +#include
>> +"""
>> +
>> +def main(name):
>> +    _, prec, _, func = name.split("-")
>> +    scalar_to_sve_type = {"double": "svfloat64_t", "float": "svfloat32_t"}
>> +
>> +    stride = {"double": 2, "float": 4}[prec]
>> +    rtype = scalar_to_sve_type[prec]
>> +    atype = scalar_to_sve_type[prec]
>> +    fname = f"_ZGVsMxv_{func}{'f' if prec == 'float' else ''}"
>> +    prec_short = {"double": 64, "float": 32}[prec]
>> +
>> +    with open(f"../benchtests/{func}-inputs") as f:
>> +        in_vals = [l.strip() for l in f.readlines() if l and not l.startswith("#")]
>> +    in_vals = [in_vals[i:i+stride] for i in range(0, len(in_vals), stride)]
>> +    rowcount= len(in_vals)
>> +    in_data = ",\n".join("{{" + ", ".join(row) + "}, 0}" for row in in_vals)
>> +
>> +    print(TEMPLATE.format(stride=stride,
>> +                          rtype=rtype,
>> +                          atype=atype,
>> +                          fname=fname,
>> +                          prec_short=prec_short,
>> +                          in_data=in_data,
>> +                          rowcount=rowcount,
>> +                          stype=prec))
>> +
>> +
>> +if __name__ == "__main__":
>> +    main(sys.argv[1])
>> diff --git a/sysdeps/aarch64/fpu/sve_utils.h b/sysdeps/aarch64/fpu/sve_utils.h
>> new file mode 100644
>> index 0000000000..dbdc03387c
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/sve_utils.h
>> @@ -0,0 +1,55 @@
>> +/* Helpers for SVE vector math funtions.
>
> s/funtions/functions
>
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#define SV_NAME_F1(fun) _ZGVsMxv_##fun##f
>> +#define SV_NAME_D1(fun) _ZGVsMxv_##fun
>> +#define SV_NAME_F2(fun) _ZGVsMxvv_##fun##f
>> +#define SV_NAME_D2(fun) _ZGVsMxvv_##fun
>> +
>> +static inline svfloat32_t
>> +sv_call_f32 (float (*f) (float), svfloat32_t x, svfloat32_t y, svbool_t cmp)
>> +{
>> +  svbool_t p = svpfirst (cmp, svpfalse ());
>> +  while (svptest_any (cmp, p))
>> +    {
>> +      float elem = svclastb_n_f32 (p, 0, x);
>> +      elem = (*f) (elem);
>> +      svfloat32_t y2 = svdup_n_f32 (elem);
>> +      y = svsel_f32 (p, y2, y);
>> +      p = svpnext_b32 (cmp, p);
>> +    }
>> +  return y;
>> +}
>> +
>> +static inline svfloat64_t
>> +sv_call_f64 (double (*f) (double), svfloat64_t x, svfloat64_t y, svbool_t cmp)
>> +{
>> +  svbool_t p = svpfirst (cmp, svpfalse ());
>> +  while (svptest_any (cmp, p))
>> +    {
>> +      double elem = svclastb_n_f64 (p, 0, x);
>> +      elem = (*f) (elem);
>> +      svfloat64_t y2 = svdup_n_f64 (elem);
>> +      y = svsel_f64 (p, y2, y);
>> +      p = svpnext_b64 (cmp, p);
>> +    }
>> +  return y;
>> +}
>> diff --git a/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>> new file mode 100644
>> index 0000000000..52e330f469
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>> @@ -0,0 +1,26 @@
>> +/* Scalar wrappers for double-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "test-double-advsimd.h"
>> +
>> +#define VEC_TYPE float64x2_t
>> +
>> +VPCS_VECTOR_WRAPPER(cos_advsimd, _ZGVnN2v_cos)
>> diff --git a/sysdeps/aarch64/fpu/test-double-advsimd.h b/sysdeps/aarch64/fpu/test-double-advsimd.h
>> new file mode 100644
>> index 0000000000..8bd32b97fa
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-advsimd.h
>> @@ -0,0 +1,25 @@
>> +/* Test declarations for double-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include "test-double.h"
>> +#include "test-math-vector.h"
>> +#include "test-vpcs-vector-wrapper.h"
>> +
>> +#define VEC_SUFF _advsimd
>> +#define VEC_LEN 2
>> diff --git a/sysdeps/aarch64/fpu/test-double-sve-wrappers.c b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>> new file mode 100644
>> index 0000000000..8edc5ed5ab
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>> @@ -0,0 +1,34 @@
>> +/* Scalar wrappers for double-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "test-double-sve.h"
>> +
>> +/* Wrapper from scalar to SVE function. Cannot just use VECTOR_WRAPPER due to predication. */
>> +#define SVE_VECTOR_WRAPPER(scalar_func, vector_func) \
>> +  extern VEC_TYPE vector_func (VEC_TYPE, svbool_t); \
>> +FLOAT scalar_func (FLOAT x) \
>> +{ \
>> +  VEC_TYPE mx = svdup_n_f64 (x); \
>> +  VEC_TYPE mr = vector_func (mx, svptrue_b64 ()); \
>> +  return svlastb_f64 (svptrue_b64 (), mr); \
>> +}
>> +
>> +SVE_VECTOR_WRAPPER(cos_sve, _ZGVsMxv_cos)
>> diff --git a/sysdeps/aarch64/fpu/test-double-sve.h b/sysdeps/aarch64/fpu/test-double-sve.h
>> new file mode 100644
>> index 0000000000..857a40861d
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-sve.h
>> @@ -0,0 +1,26 @@
>> +/* Test declarations for double-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include "test-double.h"
>> +#include "test-math-vector.h"
>> +
>> +#define REQUIRE_SVE
>> +#define VEC_SUFF _sve
>> +#define VEC_LEN svcntd()
>> +#define VEC_TYPE svfloat64_t
>> diff --git a/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>> new file mode 100644
>> index 0000000000..3577ca93b8
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>> @@ -0,0 +1,26 @@
>> +/* Scalar wrappers for single-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "test-float-advsimd.h"
>> +
>> +#define VEC_TYPE float32x4_t
>> +
>> +VPCS_VECTOR_WRAPPER(cosf_advsimd, _ZGVnN4v_cosf)
>> diff --git a/sysdeps/aarch64/fpu/test-float-advsimd.h b/sysdeps/aarch64/fpu/test-float-advsimd.h
>> new file mode 100644
>> index 0000000000..86fce613cd
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-advsimd.h
>> @@ -0,0 +1,25 @@
>> +/* Test declarations for singlex-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include "test-float.h"
>> +#include "test-math-vector.h"
>> +#include "test-vpcs-vector-wrapper.h"
>> +
>> +#define VEC_SUFF _advsimd
>> +#define VEC_LEN 4
>> diff --git a/sysdeps/aarch64/fpu/test-float-sve-wrappers.c b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>> new file mode 100644
>> index 0000000000..b6a944d502
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>> @@ -0,0 +1,34 @@
>> +/* Scalar wrappers for single-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include
>> +
>> +#include "test-float-sve.h"
>> +
>> +/* Wrapper from scalar to SVE function. Cannot just use VECTOR_WRAPPER due to predication. */
>> +#define SVE_VECTOR_WRAPPER(scalar_func, vector_func) \
>> +  extern VEC_TYPE vector_func (VEC_TYPE, svbool_t); \
>> +FLOAT scalar_func (FLOAT x) \
>> +{ \
>> +  VEC_TYPE mx = svdup_n_f32 (x); \
>> +  VEC_TYPE mr = vector_func (mx, svptrue_b32 ()); \
>> +  return svlastb_f32 (svptrue_b32 (), mr); \
>> +}
>> +
>> +SVE_VECTOR_WRAPPER(cosf_sve, _ZGVsMxv_cosf)
>> diff --git a/sysdeps/aarch64/fpu/test-float-sve.h b/sysdeps/aarch64/fpu/test-float-sve.h
>> new file mode 100644
>> index 0000000000..d6e122cf67
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-sve.h
>> @@ -0,0 +1,26 @@
>> +/* Test declarations for single-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#include "test-float.h"
>> +#include "test-math-vector.h"
>> +
>> +#define REQUIRE_SVE
>> +#define VEC_SUFF _sve
>> +#define VEC_LEN svcntw()
>> +#define VEC_TYPE svfloat32_t
>> diff --git a/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h b/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>> new file mode 100644
>> index 0000000000..eb0f0db838
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>> @@ -0,0 +1,30 @@
>> +/* Scalar wrapper for vpcs-enabled Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   . */
>> +
>> +#define VPCS_VECTOR_WRAPPER(scalar_func, vector_func) \
>> +extern __attribute__ ((aarch64_vector_pcs)) VEC_TYPE vector_func (VEC_TYPE); \
>> +FLOAT scalar_func (FLOAT x) \
>> +{ \
>> +  int i; \
>> +  VEC_TYPE mx; \
>> +  INIT_VEC_LOOP (mx, x, VEC_LEN); \
>> +  VEC_TYPE mr = vector_func (mx); \
>> +  TEST_VEC_LOOP (mr, VEC_LEN); \
>> +  return ((FLOAT) mr[0]); \
>> +}
>> diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>> new file mode 100644
>> index 0000000000..13af421af2
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>> @@ -0,0 +1,4 @@
>> +GLIBC_2.38 _ZGVnN2v_cos F
>> +GLIBC_2.38 _ZGVnN4v_cosf F
>> +GLIBC_2.38 _ZGVsMxv_cos F
>> +GLIBC_2.38 _ZGVsMxv_cosf F
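
For reference, the four abilist entries follow the same naming that the two
benchmark generator scripts build in their fname lines. Below is a minimal
sketch of that composition, not part of the patch; the helper name
vector_symbol and its isa argument are hypothetical, introduced only for
illustration. My reading of the letters (n for Advanced SIMD, s for SVE,
N for unmasked, M for masked, a lane count or x for scalable length, one v
per vector argument) is taken from the AArch64 vector function ABI and
should be checked against it.

```python
# Sketch only: mirrors the fname f-strings in bench_libmvec_advsimd.py and
# bench_libmvec_sve.py. vector_symbol is a hypothetical helper, not glibc API.

def vector_symbol(isa: str, prec: str, func: str) -> str:
    """Build the vector-variant symbol name for a one-argument math function."""
    suffix = "f" if prec == "float" else ""
    if isa == "advsimd":
        # Same lane counts as the stride table in the Advanced SIMD script.
        lanes = {"double": 2, "float": 4}[prec]
        return f"_ZGVnN{lanes}v_{func}{suffix}"
    if isa == "sve":
        # The SVE script hard-codes the masked, scalable-length form.
        return f"_ZGVsMxv_{func}{suffix}"
    raise ValueError(f"unknown ISA: {isa}")


symbols = sorted(
    vector_symbol(isa, prec, "cos")
    for isa in ("advsimd", "sve")
    for prec in ("double", "float")
)

for name in symbols:
    print(f"GLIBC_2.38 {name} F")
```

Sorting the generated names reproduces the order used in libmvec.abilist,
which makes it easy to diff against the file when more functions are added.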