From: Joe Ramsay <joe.ramsay@arm.com>
To: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
libc-alpha@sourceware.org, Szabolcs Nagy <szabolcs.nagy@arm.com>
Subject: Re: [PATCH] [RFC] Proposal for implementing AArch64 port of libmvec
Date: Thu, 9 Feb 2023 12:43:17 +0000
Message-ID: <6565ff59-1a71-bdca-83f3-1f8b4f8f2e35@arm.com>
In-Reply-To: <aad589a7-f25f-4710-c03f-91da0030244c@linaro.org>
Thanks for the comments. I will attempt a patch that addresses them; in
the meantime, just a few questions:
On 08/02/2023 13:11, Adhemerval Zanella Netto wrote:
>
>
> On 07/02/23 08:35, Joe Ramsay via Libc-alpha wrote:
>> Hi,
>>
>> The attached patch is an attempt to enable libmvec on AArch64. The
>> proposed change is mainly implementing build infrastructure to add the
>> new routines to ABI, tests and benchmarks. I have demonstrated how
>> this all fits together by adding implementations for vector cos, in
>> both single and double precision, targeting both Advanced SIMD and
>> SVE.
>>
>> The implementations of the routines themselves are just loops over the
>> scalar routine from libm for now, as we are more concerned with
>> getting the plumbing right at this point. We plan to contribute vector
>> routines from the Arm Optimized Routines repo that are compliant with
>> requirements described in the libmvec wiki.
>>
>> Any comments/thoughts much appreciated! In particular, the patch
>> raises the minimum GCC to 10, in order to be able to submit routines
>> written using ACLE instead of assembly. This is clearly a big jump,
>> but we have options if this is not acceptable. One option would be to
>> submit compiler-generated assembly, similar to the equivalent routines
>> under sysdeps/x86_64. If GCC 9 is an acceptable compromise then this
>> would only have to be for SVE routines.
>
> Using a C implementation with intrinsics would be ideal; such code is more
> easily maintained and can leverage compiler improvements. I would rather do
> that than the assembly dump Intel did.
>
> The minimum GCC 10 is not ideal; however, I don't see it as a blocker either
> (it should be up to the arch-maintainers). One option might be to check
> whether the compiler supports building libmvec and, if it does not, disable
> the build and related checks. That is not ideal either, since the resulting
> glibc won't have a complete ABI.
>
>
OK, let's see what arch-maintainers make of it.
>>
>> Also, are there plans to merge libmvec into libm, or will they be kept
>> separate?
>
> There are none afaik. The libpthread, librt, etc. merge was done to
> fix long-standing design and maintenance issues that are not really present
> with libm and libmvec. There is still the partial upgrade issue, but
> it is still present with a disjoint ld, libc, libm anyway.
>
> However, it is feasible to merge if you're willing to work on it. We will
> need to keep the x86_64 lib with the sentinel compat symbol (similar to
> what we did for libpthread).
>
> What I would like to avoid is having different architectures use different
> approaches, for instance aarch64 being merged while x86_64 is still
> using a different library. It adds slightly more complexity to the build
> process and extra arch-specific boilerplate code.
>
This sounds good - keeping them separate would be our choice too. I have
not come across the sentinel compat symbol - is this something we need
to do for AArch64 also?
>>
>> Note that at this point users have to manually call the vector math
>> functions, there is no declaration in math.h to assist auto
>> vectorization of scalar math calls. This seems to be acceptable to
>> some downstream users.
>
> I think that's the current approach for x86_64 anyway, since most usages
> are done through compiler autovectorization code.
>
> Some comments below.
>
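For anyone calling the routines by hand, the exported names follow the pattern visible in the Versions file further down. A minimal sketch of that pattern in Python, for illustration only: `vector_symbol` is a hypothetical helper, not part of the patch, and it covers just the four one-argument shapes the patch exports.

```python
def vector_symbol(func, precision, isa):
    """Build a libmvec-style symbol name for a one-argument math function.

    Only models the shapes listed in the patch's Versions file:
    unmasked Advanced SIMD ("n", "N", fixed lane count) and masked,
    length-agnostic SVE ("s", "M", "x").
    """
    if precision not in ("float", "double"):
        raise ValueError(f"unsupported precision: {precision}")
    # Single-precision routines carry the usual "f" suffix (cos -> cosf).
    suffix = "f" if precision == "float" else ""
    if isa == "advsimd":
        # 128-bit vectors hold 2 doubles or 4 floats.
        lanes = {"double": 2, "float": 4}[precision]
        return f"_ZGVnN{lanes}v_{func}{suffix}"
    if isa == "sve":
        return f"_ZGVsMxv_{func}{suffix}"
    raise ValueError(f"unsupported ISA: {isa}")
```

Calling `vector_symbol("cos", "float", "advsimd")` yields the same string that the benchmark generator builds for `fname`.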
>>
>> Thanks,
>> Joe
>> ---
>> INSTALL | 3 +
>> manual/install.texi | 3 +
>> sysdeps/aarch64/configure | 28 ++++++
>> sysdeps/aarch64/configure.ac | 20 ++++
>> sysdeps/aarch64/fpu/Makefile | 66 +++++++++++++
>> sysdeps/aarch64/fpu/Versions | 8 ++
>> sysdeps/aarch64/fpu/advsimd_utils.h | 39 ++++++++
>> sysdeps/aarch64/fpu/bench-libmvec-skeleton.c | 83 +++++++++++++++++
>> sysdeps/aarch64/fpu/bits/math-vector.h | 65 +++++++++++++
>> sysdeps/aarch64/fpu/cos_advsimd.c | 28 ++++++
>> sysdeps/aarch64/fpu/cos_sve.c | 27 ++++++
>> sysdeps/aarch64/fpu/cosf_advsimd.c | 28 ++++++
>> sysdeps/aarch64/fpu/cosf_sve.c | 27 ++++++
>> sysdeps/aarch64/fpu/libm-test-ulps | 7 ++
>> sysdeps/aarch64/fpu/libm-test-ulps-name | 1 +
>> sysdeps/aarch64/fpu/math-tests-arch.h | 34 +++++++
>> .../fpu/scripts/bench_libmvec_advsimd.py | 91 ++++++++++++++++++
>> .../aarch64/fpu/scripts/bench_libmvec_sve.py | 93 +++++++++++++++++++
>> sysdeps/aarch64/fpu/sve_utils.h | 55 +++++++++++
>> .../fpu/test-double-advsimd-wrappers.c | 26 ++++++
>> sysdeps/aarch64/fpu/test-double-advsimd.h | 25 +++++
>> .../aarch64/fpu/test-double-sve-wrappers.c | 34 +++++++
>> sysdeps/aarch64/fpu/test-double-sve.h | 26 ++++++
>> .../aarch64/fpu/test-float-advsimd-wrappers.c | 26 ++++++
>> sysdeps/aarch64/fpu/test-float-advsimd.h | 25 +++++
>> sysdeps/aarch64/fpu/test-float-sve-wrappers.c | 34 +++++++
>> sysdeps/aarch64/fpu/test-float-sve.h | 26 ++++++
>> .../aarch64/fpu/test-vpcs-vector-wrapper.h | 30 ++++++
>> .../unix/sysv/linux/aarch64/libmvec.abilist | 4 +
>> 29 files changed, 962 insertions(+)
>> create mode 100644 sysdeps/aarch64/fpu/Makefile
>> create mode 100644 sysdeps/aarch64/fpu/Versions
>> create mode 100644 sysdeps/aarch64/fpu/advsimd_utils.h
>> create mode 100644 sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>> create mode 100644 sysdeps/aarch64/fpu/bits/math-vector.h
>> create mode 100644 sysdeps/aarch64/fpu/cos_advsimd.c
>> create mode 100644 sysdeps/aarch64/fpu/cos_sve.c
>> create mode 100644 sysdeps/aarch64/fpu/cosf_advsimd.c
>> create mode 100644 sysdeps/aarch64/fpu/cosf_sve.c
>> create mode 100644 sysdeps/aarch64/fpu/libm-test-ulps
>> create mode 100644 sysdeps/aarch64/fpu/libm-test-ulps-name
>> create mode 100644 sysdeps/aarch64/fpu/math-tests-arch.h
>> create mode 100644 sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>> create mode 100755 sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>> create mode 100644 sysdeps/aarch64/fpu/sve_utils.h
>> create mode 100644 sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>> create mode 100644 sysdeps/aarch64/fpu/test-double-advsimd.h
>> create mode 100644 sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>> create mode 100644 sysdeps/aarch64/fpu/test-double-sve.h
>> create mode 100644 sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>> create mode 100644 sysdeps/aarch64/fpu/test-float-advsimd.h
>> create mode 100644 sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>> create mode 100644 sysdeps/aarch64/fpu/test-float-sve.h
>> create mode 100644 sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>> create mode 100644 sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>>
>> diff --git a/INSTALL b/INSTALL
>> index 970d6627e2..ba800e41d6 100644
>> --- a/INSTALL
>> +++ b/INSTALL
>> @@ -524,6 +524,9 @@ build the GNU C Library:
>> For s390x architecture builds, GCC 7.1 or higher is needed (See gcc
>> Bug 98269).
>>
>> + For AArch64 architecture builds with mathvec enabled, GCC 10 or
>> + higher is needed due to dependency on arm_sve.h.
>> +
>> For multi-arch support it is recommended to use a GCC which has
>> been built with support for GNU indirect functions. This ensures
>> that correct debugging information is generated for functions
>> diff --git a/manual/install.texi b/manual/install.texi
>> index 260f8a5c82..e9c62b51ae 100644
>> --- a/manual/install.texi
>> +++ b/manual/install.texi
>> @@ -567,6 +567,9 @@ For ARC architecture builds, GCC 8.3 or higher is needed.
>>
>> For s390x architecture builds, GCC 7.1 or higher is needed (See gcc Bug 98269).
>>
>> +For AArch64 architecture builds with mathvec enabled, GCC 10 or higher is needed
>> +due to dependency on arm_sve.h.
>> +
>> For multi-arch support it is recommended to use a GCC which has been built with
>> support for GNU indirect functions. This ensures that correct debugging
>> information is generated for functions selected by IFUNC resolvers. This
>> diff --git a/sysdeps/aarch64/configure b/sysdeps/aarch64/configure
>> index 2130f6b8f8..a71c32d70f 100644
>> --- a/sysdeps/aarch64/configure
>> +++ b/sysdeps/aarch64/configure
>> @@ -327,3 +327,31 @@ if test $libc_cv_aarch64_sve_asm = yes; then
>> $as_echo "#define HAVE_AARCH64_SVE_ASM 1" >>confdefs.h
>>
>> fi
>> +
>> +# Check if the local system can run SVE binary
>> +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for local SVE hardware" >&5
>> +$as_echo_n "checking for local SVE hardware... " >&6; }
>> +if ${libc_cv_can_run_sve+:} false; then :
>> + $as_echo_n "(cached) " >&6
>> +else
>> + cat > conftest.c <<EOF
>> +#include <sys/auxv.h>
>> +int main(void) {
>> + if (! (getauxval (AT_HWCAP) & HWCAP_SVE))
>> + return 1;
>> + return 0;
>> +}
>> +EOF
>> + libc_cv_can_run_sve=yes
>> + ${CC-cc} conftest.c -o conftest
>> + ./conftest || libc_cv_can_run_sve=no
>> + rm -f conftest*
>> +fi
>> +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_can_run_sve" >&5
>> +$as_echo "$libc_cv_can_run_sve" >&6; }
>> +config_vars="$config_vars
>> +aarch64-can-run-sve = $libc_cv_can_run_sve"
>> +
>> +if test x"$build_mathvec" = xnotset; then
>> + build_mathvec=yes
>> +fi
>> diff --git a/sysdeps/aarch64/configure.ac b/sysdeps/aarch64/configure.ac
>> index 85c6f76508..688f8772a6 100644
>> --- a/sysdeps/aarch64/configure.ac
>> +++ b/sysdeps/aarch64/configure.ac
>> @@ -101,3 +101,23 @@ rm -f conftest*])
>> if test $libc_cv_aarch64_sve_asm = yes; then
>> AC_DEFINE(HAVE_AARCH64_SVE_ASM)
>> fi
>> +
>> +# Check if the local system can run SVE binary
>> +AC_CACHE_CHECK(for local SVE hardware, libc_cv_can_run_sve, [dnl
>> + cat > conftest.c <<EOF
>> +#include <sys/auxv.h>
>> +int main(void) {
>> + if (! (getauxval (AT_HWCAP) & HWCAP_SVE))
>> + return 1;
>> + return 0;
>> +}
>> +EOF
>> + libc_cv_can_run_sve=yes
>> + ${CC-cc} conftest.c -o conftest
>> + ./conftest || libc_cv_can_run_sve=no
>> + rm -f conftest*])
>> +LIBC_CONFIG_VAR([aarch64-can-run-sve], [$libc_cv_can_run_sve])
>> +
>> +if test x"$build_mathvec" = xnotset; then
>> + build_mathvec=yes
>> +fi
>> diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
>> new file mode 100644
>> index 0000000000..caf5d60669
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/Makefile
>> @@ -0,0 +1,66 @@
>> +float-advsimd-funcs = cos
>> +
>> +double-advsimd-funcs = cos
>> +
>> +float-sve-funcs = cos
>> +
>> +double-sve-funcs = cos
>> +
>> +ifeq ($(subdir),mathvec)
>> +libmvec-support = $(addsuffix f_advsimd,$(float-advsimd-funcs)) \
>> + $(addsuffix _advsimd,$(double-advsimd-funcs)) \
>> + $(addsuffix f_sve,$(float-sve-funcs)) \
>> + $(addsuffix _sve,$(double-sve-funcs))
>> +endif
>> +
>> +sve-cflags = -march=armv8-a+sve
>> +
>> +
>> +ifeq ($(build-mathvec),yes)
>> +bench-libmvec = $(addprefix float-advsimd-,$(float-advsimd-funcs)) \
>> + $(addprefix double-advsimd-,$(double-advsimd-funcs))
>> +
>> +# If not on an SVE-enabled machine, do not add SVE routines to benchmarks.
>> +# The routines are still built.
>> +ifeq ($(aarch64-can-run-sve),yes)
>> + bench-libmvec += $(addprefix float-sve-,$(float-sve-funcs)) \
>> + $(addprefix double-sve-,$(double-sve-funcs))
>> +endif
>> +endif
>> +
>> +$(objpfx)bench-float-advsimd-%.c:
>> + $(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py $(basename $(@F)) > $@
>> +$(objpfx)bench-double-advsimd-%.c:
>> + $(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py $(basename $(@F)) > $@
>> +$(objpfx)bench-float-sve-%.c:
>> + $(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py $(basename $(@F)) > $@
>> +$(objpfx)bench-double-sve-%.c:
>> + $(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py $(basename $(@F)) > $@
>> +
>> +ifeq (${STATIC-BENCHTESTS},yes)
>> +libmvec-benchtests = $(common-objpfx)mathvec/libmvec.a $(common-objpfx)math/libm.a
>> +else
>> +libmvec-benchtests = $(libmvec) $(libm)
>> +endif
>> +
>> +$(addprefix $(objpfx)bench-,$(bench-libmvec)): $(libmvec-benchtests)
>> +
>> +ifeq ($(build-mathvec),yes)
>> +libmvec-tests += float-advsimd double-advsimd float-sve double-sve
>> +endif
>> +
>> +define sve-float-cflags-template
>> +CFLAGS-$(1)f_sve.c += $(sve-cflags)
>> +CFLAGS-bench-float-sve-$(1).c += $(sve-cflags)
>> +endef
>> +
>> +define sve-double-cflags-template
>> +CFLAGS-$(1)_sve.c += $(sve-cflags)
>> +CFLAGS-bench-double-sve-$(1).c += $(sve-cflags)
>> +endef
>> +
>> +$(foreach f,$(float-sve-funcs), $(eval $(call sve-float-cflags-template,$(f))))
>> +$(foreach f,$(double-sve-funcs), $(eval $(call sve-double-cflags-template,$(f))))
>> +
>> +CFLAGS-test-float-sve-wrappers.c = $(sve-cflags)
>> +CFLAGS-test-double-sve-wrappers.c = $(sve-cflags)
>> diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
>> new file mode 100644
>> index 0000000000..5222a6f180
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/Versions
>> @@ -0,0 +1,8 @@
>> +libmvec {
>> + GLIBC_2.38 {
>> + _ZGVnN2v_cos;
>> + _ZGVnN4v_cosf;
>> + _ZGVsMxv_cos;
>> + _ZGVsMxv_cosf;
>> + }
>> +}
>> diff --git a/sysdeps/aarch64/fpu/advsimd_utils.h b/sysdeps/aarch64/fpu/advsimd_utils.h
>> new file mode 100644
>> index 0000000000..b597a18b8f
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/advsimd_utils.h
>> @@ -0,0 +1,39 @@
>> +/* Helpers for Advanced SIMD vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <arm_neon.h>
>> +
>> +#define VPCS_ATTR __attribute__ ((aarch64_vector_pcs))
>> +
>> +#define V_NAME_F1(fun) _ZGVnN4v_##fun##f
>> +#define V_NAME_D1(fun) _ZGVnN2v_##fun
>> +#define V_NAME_F2(fun) _ZGVnN4vv_##fun##f
>> +#define V_NAME_D2(fun) _ZGVnN2vv_##fun
>> +
>> +static inline float32x4_t
>
> You might consider using __always_inline here if the idea is to use these functions
> as macros.
>
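For context, these helpers simply apply the scalar routine lane by lane. A rough Python model of that behaviour, not the C in the patch: `v_call` and `sv_call` are hypothetical stand-ins, and the treatment of inactive lanes in the predicated version (they take the fallback value) is an assumption, since the real behaviour is defined in sve_utils.h.

```python
import math


def v_call(scalar_fn, lanes):
    """Apply a scalar routine to every lane (models v_call_f32/v_call_f64)."""
    return [scalar_fn(x) for x in lanes]


def sv_call(scalar_fn, lanes, fallback, predicate):
    """Predicated variant: active lanes get scalar_fn(x).

    Assumption: inactive lanes take the matching fallback value rather
    than being computed.
    """
    return [scalar_fn(x) if active else fb
            for x, fb, active in zip(lanes, fallback, predicate)]


# Two double lanes, as in float64x2_t.
print(v_call(math.cos, [0.0, math.pi]))
# Second lane masked off, so the scalar routine is not called for it.
print(sv_call(math.cos, [0.0, 5.0], [0.0, 0.0], [True, False]))
```

The model ignores calling-convention details such as the vector PCS attribute; it is only meant to show the data flow.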
>> +v_call_f32 (float (*f) (float), float32x4_t x)
>> +{
>> + return (float32x4_t){f (x[0]), f (x[1]), f (x[2]), f (x[3])};
>> +}
>> +
>> +static inline float64x2_t
>> +v_call_f64 (double (*f) (double), float64x2_t x)
>> +{
>> + return (float64x2_t){f (x[0]), f (x[1])};
>> +}
>> diff --git a/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c b/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>> new file mode 100644
>> index 0000000000..ca6a10d1fe
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>> @@ -0,0 +1,83 @@
>> +/* Skeleton for libmvec benchmark programs.
>> + Copyright (C) 2021-2023 Free Software Foundation, Inc.
>
> I think Copyright year is only 2023 here.
>
Carlos flagged this up as well. This file was copied, with some
modifications, from sysdeps/x86_64/fpu/bench-libmvec-skeleton.c - is it
OK for it to get a brand new copyright header despite not being brand
new code?
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <string.h>
>> +#include <stdint.h>
>> +#include <stdbool.h>
>> +#include <stdio.h>
>> +#include <time.h>
>> +#include <inttypes.h>
>> +#include <bench-timing.h>
>> +#include <json-lib.h>
>> +#include <bench-util.h>
>> +#include <math-tests-arch.h>
>> +
>> +#include <bench-util.c>
>> +#define D_ITERS 10000
>> +
>> +int
>> +main (int argc, char **argv)
>> +{
>> + unsigned long i, k;
>> + timing_t start, end;
>> + json_ctx_t json_ctx;
>> +
>> + bench_start ();
>> +
>> +#ifdef BENCH_INIT
>> + BENCH_INIT ();
>> +#endif
>> +
>> + json_init (&json_ctx, 2, stdout);
>> +
>> + /* Begin function. */
>> + json_attr_object_begin (&json_ctx, FUNCNAME);
>> +
>> + for (int v = 0; v < NUM_VARIANTS; v++)
>> + {
>> + double d_total_time = 0;
>> + timing_t cur;
>> + for (k = 0; k < D_ITERS; k++)
>> + {
>> + TIMING_NOW (start);
>> + for (i = 0; i < NUM_SAMPLES (v); i++)
>> + BENCH_FUNC (v, i);
>> + TIMING_NOW (end);
>> +
>> + TIMING_DIFF (cur, start, end);
>> +
>> + TIMING_ACCUM (d_total_time, cur);
>> + }
>> + double d_total_data_set = D_ITERS * NUM_SAMPLES (v) * STRIDE;
>> +
>> + /* Begin variant. */
>> + json_attr_object_begin (&json_ctx, VARIANT (v));
>> +
>> + json_attr_double (&json_ctx, "duration", d_total_time);
>> + json_attr_double (&json_ctx, "iterations", d_total_data_set);
>> + json_attr_double (&json_ctx, "mean", d_total_time / d_total_data_set);
>> +
>> + /* End variant. */
>> + json_attr_object_end (&json_ctx);
>> + }
>> +
>> + /* End function. */
>> + json_attr_object_end (&json_ctx);
>> +
>> + return 0;
>> +}
>
> This file is quite similar to the x86_64 one, modulo the extra CPU_FEATURE_ACTIVE checks.
> Maybe try to refactor to use a common definition and parametrize the x86 code
> on arch-specific code?
>
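As a reference point for any refactoring, the only arithmetic the skeleton does is to divide the accumulated time by D_ITERS * NUM_SAMPLES * STRIDE. A small Python sketch of that calculation: `mean_time_per_element` is a hypothetical name used only for illustration.

```python
def mean_time_per_element(total_time, iterations, samples, stride):
    """Mirror the skeleton's "mean" field.

    total_time  -- accumulated timing over all iterations
    iterations  -- outer repeat count (D_ITERS in the skeleton)
    samples     -- rows of input per variant (NUM_SAMPLES)
    stride      -- elements processed per call (STRIDE)
    """
    elements = iterations * samples * stride
    if elements <= 0:
        raise ValueError("need at least one processed element")
    return total_time / elements
```

Because the divisor includes STRIDE, the reported mean is per element rather than per vector call.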
>> diff --git a/sysdeps/aarch64/fpu/bits/math-vector.h b/sysdeps/aarch64/fpu/bits/math-vector.h
>> new file mode 100644
>> index 0000000000..a25845bff8
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/bits/math-vector.h
>> @@ -0,0 +1,65 @@
>> +/* Platform-specific SIMD declarations of math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#ifndef _MATH_H
>> +# error "Never include <bits/math-vector.h> directly;\
>> + include <math.h> instead."
>> +#endif
>> +
>> +/* Get default empty definitions for simd declarations. */
>> +#include <bits/libm-simd-decl-stubs.h>
>> +
>> +#if __GNUC_PREREQ (9, 0)
>
> I think these tests should move to configure tests instead; that tells the user
> beforehand that the compiler needs updating, instead of through a compiler error.
>
> The configure check will then check for both advsimd and SVE support, so there is
> no need for __ADVSIMD_VEC_MATH_SUPPORTED or __SVE_VEC_MATH_SUPPORTED.
>
Apologies, I don't quite understand what you mean by this. I put these
tests in so that users could compile against the new symbols with math.h
as long as they had a sufficiently new compiler, but wouldn't get
undefined types in math.h if they were using an old compiler that didn't
have e.g. __Float32x4_t. (I think I remarked in the original message
that new symbols hadn't been added to math.h, but this was not correct).
I don't see how this relates to configure, since this isn't for the
benefit of library builders, but maybe I have misunderstood? We could
put a separate check in at configure time for a compiler which is
sufficient for vector types, but that would not IMO make these tests
redundant. Let me know what you think.
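To make the intent concrete, here is a small Python model of which declarations the guards in this header expose for a given compiler. It is only a sketch: `vector_math_support` is a hypothetical helper, and the thresholds are read off the `#if` lines in this file (GCC 9 / clang 8 for Advanced SIMD, GCC 10 / clang 11 for SVE).

```python
def vector_math_support(compiler, major):
    """Return the declaration groups math.h would expose to user code.

    compiler -- "gcc" or "clang"; anything else gets no declarations
    major    -- the compiler's major version number
    """
    # Minimum major versions, taken from the guards in bits/math-vector.h.
    thresholds = {
        "gcc": {"advsimd": 9, "sve": 10},
        "clang": {"advsimd": 8, "sve": 11},
    }
    if compiler not in thresholds:
        return set()
    return {isa for isa, minimum in thresholds[compiler].items()
            if major >= minimum}
```

An older compiler therefore sees no vector declarations at all, rather than an error about unknown types.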
>> +# define __ADVSIMD_VEC_MATH_SUPPORTED
>> +typedef __Float32x4_t __f32x4_t;
>> +typedef __Float64x2_t __f64x2_t;
>> +#elif __clang_major__ >= 8
>> +# define __ADVSIMD_VEC_MATH_SUPPORTED
>> +typedef __attribute__((__neon_vector_type__(4))) float __f32x4_t;
>> +typedef __attribute__((__neon_vector_type__(2))) double __f64x2_t;
>> +#endif
>> +
>> +#if __GNUC_PREREQ (10, 0) || __clang_major__ >= 11
>> +# define __SVE_VEC_MATH_SUPPORTED
>> +typedef __SVFloat32_t __sv_f32_t;
>> +typedef __SVFloat64_t __sv_f64_t;
>> +typedef __SVBool_t __sv_bool_t;
>> +#endif
>> +
>> +/* If vector types and vector PCS are unsupported in the working
>> + compiler, no choice but to omit vector math declarations. */
>> +
>> +#ifdef __ADVSIMD_VEC_MATH_SUPPORTED
>> +
>> +# define __vpcs __attribute__((__aarch64_vector_pcs__))
>> +
>> +__vpcs __f32x4_t _ZGVnN4v_cosf (__f32x4_t);
>> +__vpcs __f64x2_t _ZGVnN2v_cos (__f64x2_t);
>> +
>> +#undef __ADVSIMD_VEC_MATH_SUPPORTED
>> +#endif /* __ADVSIMD_VEC_MATH_SUPPORTED */
>> +
>> +#ifdef __SVE_VEC_MATH_SUPPORTED
>> +
>> +__sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
>> +__sv_f64_t _ZGVsMxv_cos (__sv_f64_t, __sv_bool_t);
>> +
>> +#undef __SVE_VEC_MATH_SUPPORTED
>> +#endif /* __SVE_VEC_MATH_SUPPORTED */
>> +
>> diff --git a/sysdeps/aarch64/fpu/cos_advsimd.c b/sysdeps/aarch64/fpu/cos_advsimd.c
>> new file mode 100644
>> index 0000000000..5a42fbb182
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cos_advsimd.c
>> @@ -0,0 +1,28 @@
>> +/* Double-precision vector (Advanced SIMD) cos function.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <math.h>
>> +
>> +#include "advsimd_utils.h"
>> +
>> +VPCS_ATTR
>> +float64x2_t V_NAME_D1 (cos) (float64x2_t x)
>> +{
>> + return v_call_f64 (cos, x);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cos_sve.c b/sysdeps/aarch64/fpu/cos_sve.c
>> new file mode 100644
>> index 0000000000..62bd2ece0e
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cos_sve.c
>> @@ -0,0 +1,27 @@
>> +/* Double-precision vector (SVE) cos function.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <math.h>
>> +
>> +#include "sve_utils.h"
>> +
>> +svfloat64_t SV_NAME_D1 (cos) (svfloat64_t x, svbool_t pg)
>> +{
>> + return sv_call_f64 (cos, x, svdup_n_f64 (0), pg);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cosf_advsimd.c b/sysdeps/aarch64/fpu/cosf_advsimd.c
>> new file mode 100644
>> index 0000000000..23f54bd905
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cosf_advsimd.c
>> @@ -0,0 +1,28 @@
>> +/* Single-precision vector (Advanced SIMD) cos function.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <math.h>
>> +
>> +#include "advsimd_utils.h"
>> +
>> +VPCS_ATTR
>> +float32x4_t V_NAME_F1 (cos) (float32x4_t x)
>> +{
>> + return v_call_f32 (cosf, x);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cosf_sve.c b/sysdeps/aarch64/fpu/cosf_sve.c
>> new file mode 100644
>> index 0000000000..0c4e365e1e
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cosf_sve.c
>> @@ -0,0 +1,27 @@
>> +/* Single-precision vector (SVE) cos function.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <math.h>
>> +
>> +#include "sve_utils.h"
>> +
>> +svfloat32_t SV_NAME_F1 (cos) (svfloat32_t x, svbool_t pg)
>> +{
>> + return sv_call_f32 (cosf, x, svdup_n_f32 (0), pg);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/libm-test-ulps b/sysdeps/aarch64/fpu/libm-test-ulps
>> new file mode 100644
>> index 0000000000..b199d7ddab
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/libm-test-ulps
>> @@ -0,0 +1,7 @@
>> +Function: "cos_advsimd":
>> +double: 2
>> +float: 2
>> +
>> +Function: "cos_sve":
>> +double: 2
>> +float: 2
>> \ No newline at end of file
>
> Bogus line feed here.
>
>> diff --git a/sysdeps/aarch64/fpu/libm-test-ulps-name b/sysdeps/aarch64/fpu/libm-test-ulps-name
>> new file mode 100644
>> index 0000000000..1f66c5cda0
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/libm-test-ulps-name
>> @@ -0,0 +1 @@
>> +AArch64
>> diff --git a/sysdeps/aarch64/fpu/math-tests-arch.h b/sysdeps/aarch64/fpu/math-tests-arch.h
>> new file mode 100644
>> index 0000000000..263d4cabf1
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/math-tests-arch.h
>> @@ -0,0 +1,34 @@
>> +/* Runtime architecture check for math tests. AArch64 version.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#ifdef REQUIRE_SVE
>> +# include <sys/auxv.h>
>> +
>> +# define INIT_ARCH_EXT
>> +# define CHECK_ARCH_EXT \
>> + do \
>> + { \
>> + if (!(getauxval (AT_HWCAP) & HWCAP_SVE)) return; \
>> + } \
>> + while (0)
>> +
>> +#else
>> +# include <sysdeps/generic/math-tests-arch.h>
>> +#endif
>> +
>
> Spurious new line here.
>
>> diff --git a/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py b/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>> new file mode 100644
>> index 0000000000..9c092670d7
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>> @@ -0,0 +1,91 @@
>> +#!/usr/bin/python3
>> +# Copyright (C) 2023 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +#
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# <https://www.gnu.org/licenses/>.
>> +
>> +import sys
>> +
>> +TEMPLATE = """
>> +#include <math.h>
>> +#include <arm_neon.h>
>> +
>> +#define STRIDE {stride}
>> +
>> +#define CALL_BENCH_FUNC(v, i) (__extension__ ({{ \\
>> + {rtype} mx0 = {fname}(vld1q_f{prec_short} (variants[v].in[i].arg0)); \\
>> + mx0; }}))
>> +
>> +struct args
>> +{{
>> + {stype} arg0[STRIDE];
>> + double timing;
>> +}};
>> +
>> +struct _variants
>> +{{
>> + const char *name;
>> + int count;
>> + struct args *in;
>> +}};
>> +
>> +struct args in0[{rowcount}] = {{
>> +{in_data}
>> +}};
>> +
>> +struct _variants variants[1] = {{
>> + {{"", {rowcount}, in0}},
>> +}};
>
> Maybe define them as static const?
>
>> +
>> +#define NUM_VARIANTS 1
>> +#define NUM_SAMPLES(i) (variants[i].count)
>> +#define VARIANT(i) (variants[i].name)
>> +
>> +// Cannot pass volatile pointer to svst1. This still does not appear to get optimised out.
>> +static {rtype} volatile ret;
>> +
>> +#define BENCH_FUNC(i, j) ({{ ret = CALL_BENCH_FUNC(i, j); }})
>> +#define FUNCNAME "{fname}"
>> +#include <bench-libmvec-skeleton.c>
>> +"""
>> +
>> +def main(name):
>> + _, prec, _, func = name.split("-")
>> + scalar_to_advsimd_type = {"double": "float64x2_t", "float": "float32x4_t"}
>> +
>> + stride = {"double": 2, "float": 4}[prec]
>> + rtype = scalar_to_advsimd_type[prec]
>> + atype = scalar_to_advsimd_type[prec]
>> + fname = f"_ZGVnN{stride}v_{func}{'f' if prec == 'float' else ''}"
>> + prec_short = {"double": 64, "float": 32}[prec]
>> +
>> + with open(f"../benchtests/{func}-inputs") as f:
>> + in_vals = [l.strip() for l in f.readlines() if l and not l.startswith("#")]
>> + in_vals = [in_vals[i:i+stride] for i in range(0, len(in_vals), stride)]
>> + rowcount= len(in_vals)
>> + in_data = ",\n".join("{{" + ", ".join(row) + "}, 0}" for row in in_vals)
>> +
>> + print(TEMPLATE.format(stride=stride,
>> + rtype=rtype,
>> + atype=atype,
>> + fname=fname,
>> + prec_short=prec_short,
>> + in_data=in_data,
>> + rowcount=rowcount,
>> + stype=prec))
>> +
>> +
>> +if __name__ == "__main__":
>> + main(sys.argv[1])
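
For readers following the generator above, here is a minimal standalone sketch of its input handling: read the inputs file, drop comment lines, group the remaining values into rows of `stride`, and render each row as a C initialiser for `struct args`. The helper names `group_inputs` and `format_rows` are hypothetical and not part of the patch, and the sketch also skips blank lines, which is an assumption about how the inputs file should be treated.

```python
def group_inputs(lines, stride):
    """Group non-comment, non-blank input lines into rows of `stride` values."""
    vals = [l.strip() for l in lines if l.strip() and not l.startswith("#")]
    return [vals[i:i + stride] for i in range(0, len(vals), stride)]


def format_rows(rows):
    """Render each row as a C initialiser for struct args: {{arg0...}, timing}."""
    # "{{" is a plain string here (not a format template), so it emits two
    # literal braces: one for the struct and one for the arg0 array.
    return ",\n".join("{{" + ", ".join(row) + "}, 0}" for row in rows)


# Hypothetical sample standing in for the contents of an inputs file.
sample = ["## name: workload\n", "0x1p-1\n", "\n", "0x1p0\n", "0x1p1\n"]
rows = group_inputs(sample, 2)
print(format_rows(rows))
```

When the number of values is not a multiple of the stride, the last row is short. In C the missing array elements are then zero-initialised, so the final vector call would be benchmarked with padded zeros.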
>> diff --git a/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py b/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>> new file mode 100755
>> index 0000000000..0ea21c4c69
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>> @@ -0,0 +1,93 @@
>> +#!/usr/bin/python3
>> +# Copyright (C) 2023 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +#
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# <https://www.gnu.org/licenses/>.
>> +
>> +import sys
>> +
>> +TEMPLATE = """
>> +#include <math.h>
>> +#include <arm_sve.h>
>> +
>> +#define STRIDE {stride}
>> +
>> +#define CALL_BENCH_FUNC(v, i) (__extension__ ({{ \\
>> + {rtype} mx0 = {fname}(svld1rq_f{prec_short} (svptrue_b{prec_short}(), \\
>> + variants[v].in[i].arg0), \\
>> + svptrue_b{prec_short}()); \\
>> + mx0; }}))
>> +
>> +struct args
>> +{{
>> + {stype} arg0[STRIDE];
>> + double timing;
>> +}};
>> +
>> +struct _variants
>> +{{
>> + const char *name;
>> + int count;
>> + struct args *in;
>> +}};
>> +
>> +struct args in0[{rowcount}] = {{
>> +{in_data}
>> +}};
>> +
>> +struct _variants variants[1] = {{
>> + {{"", {rowcount}, in0}},
>> +}};
>> +
>> +#define NUM_VARIANTS 1
>> +#define NUM_SAMPLES(i) (variants[i].count)
>> +#define VARIANT(i) (variants[i].name)
>> +
>> +// Cannot pass volatile pointer to svst1. This still does not appear to get optimised out.
>> +static {stype} /*volatile*/ ret[STRIDE];
>> +
>> +#define BENCH_FUNC(i, j) ({{ svst1_f{prec_short}(svwhilelt_b{prec_short}(0, STRIDE), ret, CALL_BENCH_FUNC(i, j)); }})
>> +#define FUNCNAME "{fname}"
>> +#include <bench-libmvec-skeleton.c>
>> +"""
>> +
>> +def main(name):
>> + _, prec, _, func = name.split("-")
>> + scalar_to_sve_type = {"double": "svfloat64_t", "float": "svfloat32_t"}
>> +
>> + stride = {"double": 2, "float": 4}[prec]
>> + rtype = scalar_to_sve_type[prec]
>> + atype = scalar_to_sve_type[prec]
>> + fname = f"_ZGVsMxv_{func}{'f' if prec == 'float' else ''}"
>> + prec_short = {"double": 64, "float": 32}[prec]
>> +
>> + with open(f"../benchtests/{func}-inputs") as f:
>> + in_vals = [l.strip() for l in f.readlines() if l and not l.startswith("#")]
>> + in_vals = [in_vals[i:i+stride] for i in range(0, len(in_vals), stride)]
>> + rowcount= len(in_vals)
>> + in_data = ",\n".join("{{" + ", ".join(row) + "}, 0}" for row in in_vals)
>> +
>> + print(TEMPLATE.format(stride=stride,
>> + rtype=rtype,
>> + atype=atype,
>> + fname=fname,
>> + prec_short=prec_short,
>> + in_data=in_data,
>> + rowcount=rowcount,
>> + stype=prec))
>> +
>> +
>> +if __name__ == "__main__":
>> + main(sys.argv[1])
>> diff --git a/sysdeps/aarch64/fpu/sve_utils.h b/sysdeps/aarch64/fpu/sve_utils.h
>> new file mode 100644
>> index 0000000000..dbdc03387c
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/sve_utils.h
>> @@ -0,0 +1,55 @@
>> +/* Helpers for SVE vector math funtions.
>
> s/funtions/functions
>
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <arm_sve.h>
>> +
>> +#define SV_NAME_F1(fun) _ZGVsMxv_##fun##f
>> +#define SV_NAME_D1(fun) _ZGVsMxv_##fun
>> +#define SV_NAME_F2(fun) _ZGVsMxvv_##fun##f
>> +#define SV_NAME_D2(fun) _ZGVsMxvv_##fun
>> +
>> +static inline svfloat32_t
>> +sv_call_f32 (float (*f) (float), svfloat32_t x, svfloat32_t y, svbool_t cmp)
>> +{
>> + svbool_t p = svpfirst (cmp, svpfalse ());
>> + while (svptest_any (cmp, p))
>> + {
>> + float elem = svclastb_n_f32 (p, 0, x);
>> + elem = (*f) (elem);
>> + svfloat32_t y2 = svdup_n_f32 (elem);
>> + y = svsel_f32 (p, y2, y);
>> + p = svpnext_b32 (cmp, p);
>> + }
>> + return y;
>> +}
>> +
>> +static inline svfloat64_t
>> +sv_call_f64 (double (*f) (double), svfloat64_t x, svfloat64_t y, svbool_t cmp)
>> +{
>> + svbool_t p = svpfirst (cmp, svpfalse ());
>> + while (svptest_any (cmp, p))
>> + {
>> + double elem = svclastb_n_f64 (p, 0, x);
>> + elem = (*f) (elem);
>> + svfloat64_t y2 = svdup_n_f64 (elem);
>> + y = svsel_f64 (p, y2, y);
>> + p = svpnext_b64 (cmp, p);
>> + }
>> + return y;
>> +}
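
As I read the two helpers above, they walk the active lanes of `cmp` one at a time, call the scalar routine on that lane of `x`, and merge the result into `y`, leaving inactive lanes of `y` untouched. A scalar model of that behaviour, using Python lists in place of SVE vectors and predicates, might look like the sketch below. The name `call_scalar_fallback` is hypothetical, and this models only the lane semantics, not the SVE intrinsics themselves.

```python
import math


def call_scalar_fallback(f, x, y, cmp):
    """Return a copy of y in which every lane flagged in cmp is f(x[lane])."""
    out = list(y)
    for lane, active in enumerate(cmp):
        if active:
            # Mirrors: extract the lane from x, call the scalar routine,
            # then select the new value into y for that lane only.
            out[lane] = f(x[lane])
    return out


x = [0.0, 1.0, 2.0, 3.0]
y = [9.0, 9.0, 9.0, 9.0]
cmp = [False, True, False, True]
result = call_scalar_fallback(math.cos, x, y, cmp)
print(result)
```

Lanes 0 and 2 keep their original values from `y`, while lanes 1 and 3 hold the scalar result for the corresponding element of `x`.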
>> diff --git a/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>> new file mode 100644
>> index 0000000000..52e330f469
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>> @@ -0,0 +1,26 @@
>> +/* Scalar wrappers for double-precision Advanced SIMD vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <arm_neon.h>
>> +
>> +#include "test-double-advsimd.h"
>> +
>> +#define VEC_TYPE float64x2_t
>> +
>> +VPCS_VECTOR_WRAPPER(cos_advsimd, _ZGVnN2v_cos)
>> diff --git a/sysdeps/aarch64/fpu/test-double-advsimd.h b/sysdeps/aarch64/fpu/test-double-advsimd.h
>> new file mode 100644
>> index 0000000000..8bd32b97fa
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-advsimd.h
>> @@ -0,0 +1,25 @@
>> +/* Test declarations for double-precision Advanced SIMD vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include "test-double.h"
>> +#include "test-math-vector.h"
>> +#include "test-vpcs-vector-wrapper.h"
>> +
>> +#define VEC_SUFF _advsimd
>> +#define VEC_LEN 2
>> diff --git a/sysdeps/aarch64/fpu/test-double-sve-wrappers.c b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>> new file mode 100644
>> index 0000000000..8edc5ed5ab
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>> @@ -0,0 +1,34 @@
>> +/* Scalar wrappers for double-precision SVE vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <arm_sve.h>
>> +
>> +#include "test-double-sve.h"
>> +
>> +/* Wrapper from scalar to SVE function. Cannot just use VECTOR_WRAPPER due to predication. */
>> +#define SVE_VECTOR_WRAPPER(scalar_func, vector_func) \
>> + extern VEC_TYPE vector_func (VEC_TYPE, svbool_t); \
>> +FLOAT scalar_func (FLOAT x) \
>> +{ \
>> + VEC_TYPE mx = svdup_n_f64 (x); \
>> + VEC_TYPE mr = vector_func (mx, svptrue_b64 ()); \
>> + return svlastb_f64 (svptrue_b64 (), mr); \
>> +}
>> +
>> +SVE_VECTOR_WRAPPER(cos_sve, _ZGVsMxv_cos)
>> diff --git a/sysdeps/aarch64/fpu/test-double-sve.h b/sysdeps/aarch64/fpu/test-double-sve.h
>> new file mode 100644
>> index 0000000000..857a40861d
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-sve.h
>> @@ -0,0 +1,26 @@
>> +/* Test declarations for double-precision SVE vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include "test-double.h"
>> +#include "test-math-vector.h"
>> +
>> +#define REQUIRE_SVE
>> +#define VEC_SUFF _sve
>> +#define VEC_LEN svcntd()
>> +#define VEC_TYPE svfloat64_t
>> diff --git a/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>> new file mode 100644
>> index 0000000000..3577ca93b8
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>> @@ -0,0 +1,26 @@
>> +/* Scalar wrappers for single-precision Advanced SIMD vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <arm_neon.h>
>> +
>> +#include "test-float-advsimd.h"
>> +
>> +#define VEC_TYPE float32x4_t
>> +
>> +VPCS_VECTOR_WRAPPER(cosf_advsimd, _ZGVnN4v_cosf)
>> diff --git a/sysdeps/aarch64/fpu/test-float-advsimd.h b/sysdeps/aarch64/fpu/test-float-advsimd.h
>> new file mode 100644
>> index 0000000000..86fce613cd
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-advsimd.h
>> @@ -0,0 +1,25 @@
>> +/* Test declarations for single-precision Advanced SIMD vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include "test-float.h"
>> +#include "test-math-vector.h"
>> +#include "test-vpcs-vector-wrapper.h"
>> +
>> +#define VEC_SUFF _advsimd
>> +#define VEC_LEN 4
>> diff --git a/sysdeps/aarch64/fpu/test-float-sve-wrappers.c b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>> new file mode 100644
>> index 0000000000..b6a944d502
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>> @@ -0,0 +1,34 @@
>> +/* Scalar wrappers for single-precision SVE vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include <arm_sve.h>
>> +
>> +#include "test-float-sve.h"
>> +
>> +/* Wrapper from scalar to SVE function. Cannot just use VECTOR_WRAPPER due to predication. */
>> +#define SVE_VECTOR_WRAPPER(scalar_func, vector_func) \
>> + extern VEC_TYPE vector_func (VEC_TYPE, svbool_t); \
>> +FLOAT scalar_func (FLOAT x) \
>> +{ \
>> + VEC_TYPE mx = svdup_n_f32 (x); \
>> + VEC_TYPE mr = vector_func (mx, svptrue_b32 ()); \
>> + return svlastb_f32 (svptrue_b32 (), mr); \
>> +}
>> +
>> +SVE_VECTOR_WRAPPER(cosf_sve, _ZGVsMxv_cosf)
>> diff --git a/sysdeps/aarch64/fpu/test-float-sve.h b/sysdeps/aarch64/fpu/test-float-sve.h
>> new file mode 100644
>> index 0000000000..d6e122cf67
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-sve.h
>> @@ -0,0 +1,26 @@
>> +/* Test declarations for single-precision SVE vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#include "test-float.h"
>> +#include "test-math-vector.h"
>> +
>> +#define REQUIRE_SVE
>> +#define VEC_SUFF _sve
>> +#define VEC_LEN svcntw()
>> +#define VEC_TYPE svfloat32_t
>> diff --git a/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h b/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>> new file mode 100644
>> index 0000000000..eb0f0db838
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>> @@ -0,0 +1,30 @@
>> +/* Scalar wrapper for vpcs-enabled Advanced SIMD vector math functions.
>> +
>> + Copyright (C) 2023 Free Software Foundation, Inc.
>> + This file is part of the GNU C Library.
>> +
>> + The GNU C Library is free software; you can redistribute it and/or
>> + modify it under the terms of the GNU Lesser General Public
>> + License as published by the Free Software Foundation; either
>> + version 2.1 of the License, or (at your option) any later version.
>> +
>> + The GNU C Library is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + Lesser General Public License for more details.
>> +
>> + You should have received a copy of the GNU Lesser General Public
>> + License along with the GNU C Library; if not, see
>> + <https://www.gnu.org/licenses/>. */
>> +
>> +#define VPCS_VECTOR_WRAPPER(scalar_func, vector_func) \
>> +extern __attribute__ ((aarch64_vector_pcs)) VEC_TYPE vector_func (VEC_TYPE); \
>> +FLOAT scalar_func (FLOAT x) \
>> +{ \
>> + int i; \
>> + VEC_TYPE mx; \
>> + INIT_VEC_LOOP (mx, x, VEC_LEN); \
>> + VEC_TYPE mr = vector_func (mx); \
>> + TEST_VEC_LOOP (mr, VEC_LEN); \
>> + return ((FLOAT) mr[0]); \
>> +}
>> diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>> new file mode 100644
>> index 0000000000..13af421af2
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>> @@ -0,0 +1,4 @@
>> +GLIBC_2.38 _ZGVnN2v_cos F
>> +GLIBC_2.38 _ZGVnN4v_cosf F
>> +GLIBC_2.38 _ZGVsMxv_cos F
>> +GLIBC_2.38 _ZGVsMxv_cosf F
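
The four symbols in the abilist follow the same naming pattern that the two benchmark scripts in this patch build with f-strings. A small sketch that reproduces those names from the function, precision and ISA is shown below. The helper `vector_name` and its `isa` argument are hypothetical, and it covers only the single-argument variants that appear in this patch.

```python
def vector_name(func, prec, isa):
    """Build the vector symbol name for a one-argument math function."""
    suffix = "f" if prec == "float" else ""
    if isa == "advsimd":
        # Unmasked Advanced SIMD variant: lane count depends on precision.
        lanes = {"double": 2, "float": 4}[prec]
        return f"_ZGVnN{lanes}v_{func}{suffix}"
    if isa == "sve":
        # Masked, length-agnostic SVE variant.
        return f"_ZGVsMxv_{func}{suffix}"
    raise ValueError(f"unknown isa: {isa}")


names = sorted(vector_name("cos", prec, isa)
               for prec in ("double", "float")
               for isa in ("advsimd", "sve"))
print("\n".join(names))
```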
Thread overview: 5+ messages
2023-02-07 11:35 Joe Ramsay
2023-02-07 13:11 ` Carlos Seo
2023-02-08 13:11 ` Adhemerval Zanella Netto
2023-02-09 12:43 ` Joe Ramsay [this message]
2023-02-09 12:55 ` Adhemerval Zanella Netto