Re: [PATCH] [RFC] Proposal for implementing AArch64 port of libmvec

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

From: Joe Ramsay <joe.ramsay@arm.com>
To: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
	libc-alpha@sourceware.org, Szabolcs Nagy <szabolcs.nagy@arm.com>
Subject: Re: [PATCH] [RFC] Proposal for implementing AArch64 port of libmvec
Date: Thu, 9 Feb 2023 12:43:17 +0000	[thread overview]
Message-ID: <6565ff59-1a71-bdca-83f3-1f8b4f8f2e35@arm.com> (raw)
In-Reply-To: <aad589a7-f25f-4710-c03f-91da0030244c@linaro.org>

Thanks for the comments. I will attempt a patch that addresses them, in 
the meantime just a few questions:

On 08/02/2023 13:11, Adhemerval Zanella Netto wrote:
> 
> 
> On 07/02/23 08:35, Joe Ramsay via Libc-alpha wrote:
>> Hi,
>>
>> The attached patch is an attempt to enable libmvec on AArch64. The
>> proposed change is mainly implementing build infrastructure to add the
>> new routines to ABI, tests and benchmarks. I have demonstrated how
>> this all fits together by adding implementations for vector cos, in
>> both single and double precision, targeting both Advanced SIMD and
>> SVE.
>>
>> The implementations of the routines themselves are just loops over the
>> scalar routine from libm for now, as we are more concerned with
>> getting the plumbing right at this point. We plan to contribute vector
>> routines from the Arm Optimized Routines repo that are compliant with
>> requirements described in the libmvec wiki.
>>
>> Any comments/thoughts much appreciated! In particular, the patch
>> raises the minimum GCC to 10, in order to be able to submit routines
>> written using ACLE instead of assembly. This is clearly a big jump,
>> but we have options if this is not acceptable. One option would be to
>> submit compiler-generated assembly, similar to the equivalent routines
>> under sysdeps/x86_64. If GCC 9 is an acceptable compromise then this
>> would only have to be for SVE routines.
> 
> Using C implementation with intrinsics would be idea, there are more easily
> maintained and can leverage compiler improvements.  I rather do it instead
> of the assembly dump Intel did.
> 
> The minimum GCC 10 is not ideal, however I don't see it as blocker either
> (it should be up to arch-maintainers).  One option might be check if
> compiler does not support building libmvec, disable the build and related
> checks.  It is not ideal either, since the resulting glibc won't have
> a complete ABI.
> 
> 
OK, let's see what arch-maintainers make of it.
>>
>> Also, are there plans to merge libmvec into libm, or will they be kept
>> separate?
> 
> There is none afaik.  The libpthread, librt, etc. merge was done to
> fix long standing design and maintanance issues that is not really presented
> with libm and libmvec.  There is still the partial upgrade one, but
> it is still present with a disjoint ld, libc, libm anyway.
> 
> However, it is feasible to merge if your willing to work on it.  We will
> need to keep the x86_64 lib with the sentinel compat symbol (similar to
> what we did for libpthread).
> 
> What I would like to avoid is to have different arquitectures using different
> approaches, for instance aarch64 begin merged while having x86_64 still
> using a different library.  It add a slight more complexity to the build
> process and extra arch specific boilerplate code.
> 
This sounds good - keeping them separate would be our choice too. I have 
not come across the sentinel compat symbol - is this something we need 
to do for AArch64 also?
>>
>> Note that at this point users have to manually call the vector math
>> functions, there is no declaration in math.h to assist auto
>> vectorization of scalar math calls. This seems to be acceptable to
>> some downstream users.
> 
> I think that's the current approach for x86_64 anyway, since most usages
> are done through compiler autovectorization code.
> 
> Some comments below.
> 
>>
>> Thanks,
>> Joe
>> ---
>>   INSTALL                                       |  3 +
>>   manual/install.texi                           |  3 +
>>   sysdeps/aarch64/configure                     | 28 ++++++
>>   sysdeps/aarch64/configure.ac                  | 20 ++++
>>   sysdeps/aarch64/fpu/Makefile                  | 66 +++++++++++++
>>   sysdeps/aarch64/fpu/Versions                  |  8 ++
>>   sysdeps/aarch64/fpu/advsimd_utils.h           | 39 ++++++++
>>   sysdeps/aarch64/fpu/bench-libmvec-skeleton.c  | 83 +++++++++++++++++
>>   sysdeps/aarch64/fpu/bits/math-vector.h        | 65 +++++++++++++
>>   sysdeps/aarch64/fpu/cos_advsimd.c             | 28 ++++++
>>   sysdeps/aarch64/fpu/cos_sve.c                 | 27 ++++++
>>   sysdeps/aarch64/fpu/cosf_advsimd.c            | 28 ++++++
>>   sysdeps/aarch64/fpu/cosf_sve.c                | 27 ++++++
>>   sysdeps/aarch64/fpu/libm-test-ulps            |  7 ++
>>   sysdeps/aarch64/fpu/libm-test-ulps-name       |  1 +
>>   sysdeps/aarch64/fpu/math-tests-arch.h         | 34 +++++++
>>   .../fpu/scripts/bench_libmvec_advsimd.py      | 91 ++++++++++++++++++
>>   .../aarch64/fpu/scripts/bench_libmvec_sve.py  | 93 +++++++++++++++++++
>>   sysdeps/aarch64/fpu/sve_utils.h               | 55 +++++++++++
>>   .../fpu/test-double-advsimd-wrappers.c        | 26 ++++++
>>   sysdeps/aarch64/fpu/test-double-advsimd.h     | 25 +++++
>>   .../aarch64/fpu/test-double-sve-wrappers.c    | 34 +++++++
>>   sysdeps/aarch64/fpu/test-double-sve.h         | 26 ++++++
>>   .../aarch64/fpu/test-float-advsimd-wrappers.c | 26 ++++++
>>   sysdeps/aarch64/fpu/test-float-advsimd.h      | 25 +++++
>>   sysdeps/aarch64/fpu/test-float-sve-wrappers.c | 34 +++++++
>>   sysdeps/aarch64/fpu/test-float-sve.h          | 26 ++++++
>>   .../aarch64/fpu/test-vpcs-vector-wrapper.h    | 30 ++++++
>>   .../unix/sysv/linux/aarch64/libmvec.abilist   |  4 +
>>   29 files changed, 962 insertions(+)
>>   create mode 100644 sysdeps/aarch64/fpu/Makefile
>>   create mode 100644 sysdeps/aarch64/fpu/Versions
>>   create mode 100644 sysdeps/aarch64/fpu/advsimd_utils.h
>>   create mode 100644 sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>>   create mode 100644 sysdeps/aarch64/fpu/bits/math-vector.h
>>   create mode 100644 sysdeps/aarch64/fpu/cos_advsimd.c
>>   create mode 100644 sysdeps/aarch64/fpu/cos_sve.c
>>   create mode 100644 sysdeps/aarch64/fpu/cosf_advsimd.c
>>   create mode 100644 sysdeps/aarch64/fpu/cosf_sve.c
>>   create mode 100644 sysdeps/aarch64/fpu/libm-test-ulps
>>   create mode 100644 sysdeps/aarch64/fpu/libm-test-ulps-name
>>   create mode 100644 sysdeps/aarch64/fpu/math-tests-arch.h
>>   create mode 100644 sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>>   create mode 100755 sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>>   create mode 100644 sysdeps/aarch64/fpu/sve_utils.h
>>   create mode 100644 sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>>   create mode 100644 sysdeps/aarch64/fpu/test-double-advsimd.h
>>   create mode 100644 sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>>   create mode 100644 sysdeps/aarch64/fpu/test-double-sve.h
>>   create mode 100644 sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>>   create mode 100644 sysdeps/aarch64/fpu/test-float-advsimd.h
>>   create mode 100644 sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>>   create mode 100644 sysdeps/aarch64/fpu/test-float-sve.h
>>   create mode 100644 sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>>   create mode 100644 sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>>
>> diff --git a/INSTALL b/INSTALL
>> index 970d6627e2..ba800e41d6 100644
>> --- a/INSTALL
>> +++ b/INSTALL
>> @@ -524,6 +524,9 @@ build the GNU C Library:
>>        For s390x architecture builds, GCC 7.1 or higher is needed (See gcc
>>        Bug 98269).
>>   
>> +     For AArch64 architecture builds with mathvec enabled, GCC 10 or
>> +     higher is needed due to dependency on arm_sve.h.
>> +
>>        For multi-arch support it is recommended to use a GCC which has
>>        been built with support for GNU indirect functions.  This ensures
>>        that correct debugging information is generated for functions
>> diff --git a/manual/install.texi b/manual/install.texi
>> index 260f8a5c82..e9c62b51ae 100644
>> --- a/manual/install.texi
>> +++ b/manual/install.texi
>> @@ -567,6 +567,9 @@ For ARC architecture builds, GCC 8.3 or higher is needed.
>>   
>>   For s390x architecture builds, GCC 7.1 or higher is needed (See gcc Bug 98269).
>>   
>> +For AArch64 architecture builds with mathvec enabled, GCC 10 or higher is needed
>> +due to dependency on arm_sve.h.
>> +
>>   For multi-arch support it is recommended to use a GCC which has been built with
>>   support for GNU indirect functions.  This ensures that correct debugging
>>   information is generated for functions selected by IFUNC resolvers.  This
>> diff --git a/sysdeps/aarch64/configure b/sysdeps/aarch64/configure
>> index 2130f6b8f8..a71c32d70f 100644
>> --- a/sysdeps/aarch64/configure
>> +++ b/sysdeps/aarch64/configure
>> @@ -327,3 +327,31 @@ if test $libc_cv_aarch64_sve_asm = yes; then
>>     $as_echo "#define HAVE_AARCH64_SVE_ASM 1" >>confdefs.h
>>   
>>   fi
>> +
>> +# Check if the local system can run SVE binary
>> +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for local SVE hardware" >&5
>> +$as_echo_n "checking for local SVE hardware... " >&6; }
>> +if ${libc_cv_can_run_sve+:} false; then :
>> +  $as_echo_n "(cached) " >&6
>> +else
>> +    cat > conftest.c <<EOF
>> +#include <sys/auxv.h>
>> +int main(void) {
>> +  if (! (getauxval (AT_HWCAP) & HWCAP_SVE))
>> +    return 1;
>> +  return 0;
>> +}
>> +EOF
>> +  libc_cv_can_run_sve=yes
>> +  ${CC-cc} conftest.c -o conftest
>> +  ./conftest || libc_cv_can_run_sve=no
>> +  rm -f conftest*
>> +fi
>> +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_can_run_sve" >&5
>> +$as_echo "$libc_cv_can_run_sve" >&6; }
>> +config_vars="$config_vars
>> +aarch64-can-run-sve = $libc_cv_can_run_sve"
>> +
>> +if test x"$build_mathvec" = xnotset; then
>> +  build_mathvec=yes
>> +fi
>> diff --git a/sysdeps/aarch64/configure.ac b/sysdeps/aarch64/configure.ac
>> index 85c6f76508..688f8772a6 100644
>> --- a/sysdeps/aarch64/configure.ac
>> +++ b/sysdeps/aarch64/configure.ac
>> @@ -101,3 +101,23 @@ rm -f conftest*])
>>   if test $libc_cv_aarch64_sve_asm = yes; then
>>     AC_DEFINE(HAVE_AARCH64_SVE_ASM)
>>   fi
>> +
>> +# Check if the local system can run SVE binary
>> +AC_CACHE_CHECK(for local SVE hardware, libc_cv_can_run_sve, [dnl
>> +  cat > conftest.c <<EOF
>> +#include <sys/auxv.h>
>> +int main(void) {
>> +  if (! (getauxval (AT_HWCAP) & HWCAP_SVE))
>> +    return 1;
>> +  return 0;
>> +}
>> +EOF
>> +  libc_cv_can_run_sve=yes
>> +  ${CC-cc} conftest.c -o conftest
>> +  ./conftest || libc_cv_can_run_sve=no
>> +  rm -f conftest*])
>> +LIBC_CONFIG_VAR([aarch64-can-run-sve], [$libc_cv_can_run_sve])
>> +
>> +if test x"$build_mathvec" = xnotset; then
>> +  build_mathvec=yes
>> +fi
>> diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
>> new file mode 100644
>> index 0000000000..caf5d60669
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/Makefile
>> @@ -0,0 +1,66 @@
>> +float-advsimd-funcs = cos
>> +
>> +double-advsimd-funcs = cos
>> +
>> +float-sve-funcs = cos
>> +
>> +double-sve-funcs = cos
>> +
>> +ifeq ($(subdir),mathvec)
>> +libmvec-support = $(addsuffix f_advsimd,$(float-advsimd-funcs)) \
>> +                  $(addsuffix _advsimd,$(double-advsimd-funcs)) \
>> +                  $(addsuffix f_sve,$(float-sve-funcs)) \
>> +                  $(addsuffix _sve,$(double-sve-funcs))
>> +endif
>> +
>> +sve-cflags = -march=armv8-a+sve
>> +
>> +
>> +ifeq ($(build-mathvec),yes)
>> +bench-libmvec = $(addprefix float-advsimd-,$(float-advsimd-funcs)) \
>> +                $(addprefix double-advsimd-,$(double-advsimd-funcs))
>> +
>> +# If not on an SVE-enabled machine, do not add SVE routines to benchmarks.
>> +# The routines are still built.
>> +ifeq ($(aarch64-can-run-sve),yes)
>> +  bench-libmvec += $(addprefix float-sve-,$(float-sve-funcs)) \
>> +                   $(addprefix double-sve-,$(double-sve-funcs))
>> +endif
>> +endif
>> +
>> +$(objpfx)bench-float-advsimd-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py $(basename $(@F)) > $@
>> +$(objpfx)bench-double-advsimd-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py $(basename $(@F)) > $@
>> +$(objpfx)bench-float-sve-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py $(basename $(@F)) > $@
>> +$(objpfx)bench-double-sve-%.c:
>> +	$(PYTHON) $(..)sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py $(basename $(@F)) > $@
>> +
>> +ifeq (${STATIC-BENCHTESTS},yes)
>> +libmvec-benchtests = $(common-objpfx)mathvec/libmvec.a $(common-objpfx)math/libm.a
>> +else
>> +libmvec-benchtests = $(libmvec) $(libm)
>> +endif
>> +
>> +$(addprefix $(objpfx)bench-,$(bench-libmvec)): $(libmvec-benchtests)
>> +
>> +ifeq ($(build-mathvec),yes)
>> +libmvec-tests += float-advsimd double-advsimd float-sve double-sve
>> +endif
>> +
>> +define sve-float-cflags-template
>> +CFLAGS-$(1)f_sve.c += $(sve-cflags)
>> +CFLAGS-bench-float-sve-$(1).c += $(sve-cflags)
>> +endef
>> +
>> +define sve-double-cflags-template
>> +CFLAGS-$(1)_sve.c += $(sve-cflags)
>> +CFLAGS-bench-double-sve-$(1).c += $(sve-cflags)
>> +endef
>> +
>> +$(foreach f,$(float-sve-funcs), $(eval $(call sve-float-cflags-template,$(f))))
>> +$(foreach f,$(double-sve-funcs), $(eval $(call sve-double-cflags-template,$(f))))
>> +
>> +CFLAGS-test-float-sve-wrappers.c = $(sve-cflags)
>> +CFLAGS-test-double-sve-wrappers.c = $(sve-cflags)
>> diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
>> new file mode 100644
>> index 0000000000..5222a6f180
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/Versions
>> @@ -0,0 +1,8 @@
>> +libmvec {
>> +  GLIBC_2.38 {
>> +    _ZGVnN2v_cos;
>> +    _ZGVnN4v_cosf;
>> +    _ZGVsMxv_cos;
>> +    _ZGVsMxv_cosf;
>> +  }
>> +}
>> diff --git a/sysdeps/aarch64/fpu/advsimd_utils.h b/sysdeps/aarch64/fpu/advsimd_utils.h
>> new file mode 100644
>> index 0000000000..b597a18b8f
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/advsimd_utils.h
>> @@ -0,0 +1,39 @@
>> +/* Helpers for Advanced SIMD vector math funtions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <arm_neon.h>
>> +
>> +#define VPCS_ATTR __attribute__ ((aarch64_vector_pcs))
>> +
>> +#define V_NAME_F1(fun) _ZGVnN4v_##fun##f
>> +#define V_NAME_D1(fun) _ZGVnN2v_##fun
>> +#define V_NAME_F2(fun) _ZGVnN4vv_##fun##f
>> +#define V_NAME_D2(fun) _ZGVnN2vv_##fun
>> +
>> +static inline float32x4_t
> 
> You might considere using __always_inline here if the idea is to use this functions
> as macros.
> 
>> +v_call_f32 (float (*f) (float), float32x4_t x)
>> +{
>> +  return (float32x4_t){f (x[0]), f (x[1]), f (x[2]), f (x[3])};
>> +}
>> +
>> +static inline float64x2_t
>> +v_call_f64 (double (*f) (double), float64x2_t x)
>> +{
>> +  return (float64x2_t){f (x[0]), f (x[1])};
>> +}
>> diff --git a/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c b/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>> new file mode 100644
>> index 0000000000..ca6a10d1fe
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/bench-libmvec-skeleton.c
>> @@ -0,0 +1,83 @@
>> +/* Skeleton for libmvec benchmark programs.
>> +   Copyright (C) 2021-2023 Free Software Foundation, Inc.
> 
> I think Copyright year is only 2023 here.
> 
Carlos flagged this up as well. This file was copied, with some 
modifications, from sysdeps/x86_64/fpu/bench-libmvec-skeleton.c - is it 
OK for it to get a brand new copyright header despite not being brand 
new code?
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <string.h>
>> +#include <stdint.h>
>> +#include <stdbool.h>
>> +#include <stdio.h>
>> +#include <time.h>
>> +#include <inttypes.h>
>> +#include <bench-timing.h>
>> +#include <json-lib.h>
>> +#include <bench-util.h>
>> +#include <math-tests-arch.h>
>> +
>> +#include <bench-util.c>
>> +#define D_ITERS 10000
>> +
>> +int
>> +main (int argc, char **argv)
>> +{
>> +  unsigned long i, k;
>> +  timing_t start, end;
>> +  json_ctx_t json_ctx;> +
>> +  bench_start ();
>> +
>> +#ifdef BENCH_INIT
>> +  BENCH_INIT ();
>> +#endif
>> +
>> +  json_init (&json_ctx, 2, stdout);
>> +
>> +  /* Begin function.  */
>> +  json_attr_object_begin (&json_ctx, FUNCNAME);
>> +
>> +  for (int v = 0; v < NUM_VARIANTS; v++)
>> +    {
>> +      double d_total_time = 0;
>> +      timing_t cur;
>> +      for (k = 0; k < D_ITERS; k++)
>> +	{
>> +	  TIMING_NOW (start);
>> +	  for (i = 0; i < NUM_SAMPLES (v); i++)
>> +	    BENCH_FUNC (v, i);
>> +	  TIMING_NOW (end);
>> +
>> +	  TIMING_DIFF (cur, start, end);
>> +
>> +	  TIMING_ACCUM (d_total_time, cur);
>> +	}
>> +      double d_total_data_set = D_ITERS * NUM_SAMPLES (v) * STRIDE;
>> +
>> +      /* Begin variant.  */
>> +      json_attr_object_begin (&json_ctx, VARIANT (v));
>> +
>> +      json_attr_double (&json_ctx, "duration", d_total_time);
>> +      json_attr_double (&json_ctx, "iterations", d_total_data_set);
>> +      json_attr_double (&json_ctx, "mean", d_total_time / d_total_data_set);
>> +
>> +      /* End variant.  */
>> +      json_attr_object_end (&json_ctx);
>> +    }
>> +
>> +  /* End function.  */
>> +  json_attr_object_end (&json_ctx);
>> +
>> +  return 0;
>> +}
> 
> This file is quite similar to x86_64 modulo the extra CPU_FEATURE_ACTIVE checks.
> Maybe try to refactor to use a common definition and parametrize the x86 code
> on a arch-specific code?
> 
>> diff --git a/sysdeps/aarch64/fpu/bits/math-vector.h b/sysdeps/aarch64/fpu/bits/math-vector.h
>> new file mode 100644
>> index 0000000000..a25845bff8
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/bits/math-vector.h
>> @@ -0,0 +1,65 @@
>> +/* Platform-specific SIMD declarations of math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#ifndef _MATH_H
>> +# error "Never include <bits/math-vector.h> directly;\
>> + include <math.h> instead."
>> +#endif
>> +
>> +/* Get default empty definitions for simd declarations.  */
>> +#include <bits/libm-simd-decl-stubs.h>
>> +
>> +#if __GNUC_PREREQ (9, 0)
> 
> I think these tests should move to configure tests instead, it advertises beforehand
> the user that it needs to update the compiler instead through a compiler error.
> 
> The configure check will then check for both advsimd and SVE support, so there is
> no need for __ADVSIMD_VEC_MATH_SUPPORTED or __SVE_VEC_MATH_SUPPORTED.
> 
Apologies, I don't quite understand what you mean by this. I put these 
tests in so that users could compile against the new symbols with math.h 
as long as they had a sufficiently new compiler, but wouldn't get 
undefined types in math.h if they were using an old compiler that didn't 
have e.g. __Float32x4_t. (I think I remarked in the original message 
that new symbols hadn't been added to math.h, but this was not correct).
I don't see how this relates to configure, since this isn't for the 
benefit of library builders, but maybe I have misunderstood? We could 
put a separate check in at configure time for a compiler which is 
sufficient for vector types, but that would not IMO make these tests 
redundant. Let me know what you think.
>> +# define __ADVSIMD_VEC_MATH_SUPPORTED
>> +typedef __Float32x4_t __f32x4_t;
>> +typedef __Float64x2_t __f64x2_t;
>> +#elif __clang_major__ >= 8
>> +# define __ADVSIMD_VEC_MATH_SUPPORTED
>> +typedef __attribute__((__neon_vector_type__(4))) float __f32x4_t;
>> +typedef __attribute__((__neon_vector_type__(2))) double __f64x2_t;
>> +#endif
>> +
>> +#if __GNUC_PREREQ (10, 0) || __clang_major >= 11
>> +# define __SVE_VEC_MATH_SUPPORTED
>> +typedef __SVFloat32_t __sv_f32_t;
>> +typedef __SVFloat64_t __sv_f64_t;
>> +typedef __SVBool_t __sv_bool_t;
>> +#endif
>> +
>> +/* If vector types and vector PCS are unsupported in the working
>> +   compiler, no choice but to omit vector math declarations.  */
>> +
>> +#ifdef __ADVSIMD_VEC_MATH_SUPPORTED
>> +
>> +# define __vpcs __attribute__((__aarch64_vector_pcs__))
>> +
>> +__vpcs __f32x4_t _ZGVnN4v_cosf (__f32x4_t);
>> +__vpcs __f64x2_t _ZGVnN2v_cos (__f64x2_t);
>> +
>> +#undef __ADVSIMD_VEC_MATH_SUPPORTED
>> +#endif /* __ADVSIMD_VEC_MATH_SUPPORTED */
>> +
>> +#ifdef __SVE_VEC_MATH_SUPPORTED
>> +
>> +__sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
>> +__sv_f64_t _ZGVsMxv_cos (__sv_f64_t, __sv_bool_t);
>> +
>> +#undef __SVE_VEC_MATH_SUPPORTED
>> +#endif /* __SVE_VEC_MATH_SUPPORTED */
>> +
>> diff --git a/sysdeps/aarch64/fpu/cos_advsimd.c b/sysdeps/aarch64/fpu/cos_advsimd.c
>> new file mode 100644
>> index 0000000000..5a42fbb182
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cos_advsimd.c
>> @@ -0,0 +1,28 @@
>> +/* Double-precision vector (Advanced SIMD) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <math.h>
>> +
>> +#include "advsimd_utils.h"
>> +
>> +VPCS_ATTR
>> +float64x2_t V_NAME_D1 (cos) (float64x2_t x)
>> +{
>> +  return v_call_f64 (cos, x);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cos_sve.c b/sysdeps/aarch64/fpu/cos_sve.c
>> new file mode 100644
>> index 0000000000..62bd2ece0e
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cos_sve.c
>> @@ -0,0 +1,27 @@
>> +/* Double-precision vector (SVE) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <math.h>
>> +
>> +#include "sve_utils.h"
>> +
>> +svfloat64_t SV_NAME_D1 (cos) (svfloat64_t x, svbool_t pg)
>> +{
>> +  return sv_call_f64 (cos, x, svdup_n_f64 (0), pg);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cosf_advsimd.c b/sysdeps/aarch64/fpu/cosf_advsimd.c
>> new file mode 100644
>> index 0000000000..23f54bd905
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cosf_advsimd.c
>> @@ -0,0 +1,28 @@
>> +/* Single-precision vector (Advanced SIMD) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <math.h>
>> +
>> +#include "advsimd_utils.h"
>> +
>> +VPCS_ATTR
>> +float32x4_t V_NAME_F1 (cos) (float32x4_t x)
>> +{
>> +  return v_call_f32 (cosf, x);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/cosf_sve.c b/sysdeps/aarch64/fpu/cosf_sve.c
>> new file mode 100644
>> index 0000000000..0c4e365e1e
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/cosf_sve.c
>> @@ -0,0 +1,27 @@
>> +/* Single-precision vector (SVE) cos function.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <math.h>
>> +
>> +#include "sve_utils.h"
>> +
>> +svfloat32_t SV_NAME_F1 (cos) (svfloat32_t x, svbool_t pg)
>> +{
>> +  return sv_call_f32 (cosf, x, svdup_n_f32 (0), pg);
>> +}
>> diff --git a/sysdeps/aarch64/fpu/libm-test-ulps b/sysdeps/aarch64/fpu/libm-test-ulps
>> new file mode 100644
>> index 0000000000..b199d7ddab
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/libm-test-ulps
>> @@ -0,0 +1,7 @@
>> +Function: "cos_advsimd":
>> +double: 2
>> +float: 2
>> +
>> +Function: "cos_sve":
>> +double: 2
>> +float: 2
>> \ No newline at end of file
> 
> Bogus line feed here.
> 
>> diff --git a/sysdeps/aarch64/fpu/libm-test-ulps-name b/sysdeps/aarch64/fpu/libm-test-ulps-name
>> new file mode 100644
>> index 0000000000..1f66c5cda0
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/libm-test-ulps-name
>> @@ -0,0 +1 @@
>> +AArch64
>> diff --git a/sysdeps/aarch64/fpu/math-tests-arch.h b/sysdeps/aarch64/fpu/math-tests-arch.h
>> new file mode 100644
>> index 0000000000..263d4cabf1
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/math-tests-arch.h
>> @@ -0,0 +1,34 @@
>> +/* Runtime architecture check for math tests. AArch64 version.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#ifdef REQUIRE_SVE
>> +# include <sys/auxv.h>
>> +
>> +# define INIT_ARCH_EXT
>> +# define CHECK_ARCH_EXT							\
>> +   do									\
>> +     {									\
>> +       if (!(getauxval (AT_HWCAP) & HWCAP_SVE)) return;			\
>> +     }									\
>> +   while (0)
>> +
>> +#else
>> +# include <sysdeps/generic/math-tests-arch.h>
>> +#endif
>> +
> 
> Spurions new line here.
> 
>> diff --git a/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py b/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>> new file mode 100644
>> index 0000000000..9c092670d7
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/scripts/bench_libmvec_advsimd.py
>> @@ -0,0 +1,91 @@
>> +#!/usr/bin/python3
>> +# Copyright (C) 2023 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +#
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# <https://www.gnu.org/licenses/>.
>> +
>> +import sys
>> +
>> +TEMPLATE = """
>> +#include <math.h>
>> +#include <arm_neon.h>
>> +
>> +#define STRIDE {stride}
>> +
>> +#define CALL_BENCH_FUNC(v, i) (__extension__ ({{                         \\
>> +   {rtype} mx0 = {fname}(vld1q_f{prec_short} (variants[v].in[i].arg0));  \\
>> +   mx0; }}))
>> +
>> +struct args
>> +{{
>> +  {stype} arg0[STRIDE];
>> +  double timing;
>> +}};
>> +
>> +struct _variants
>> +{{
>> +  const char *name;
>> +  int count;
>> +  struct args *in;
>> +}};
>> +
>> +struct args in0[{rowcount}] = {{
>> +{in_data}
>> +}};
>> +
>> +struct _variants variants[1] = {{
>> +  {{"", {rowcount}, in0}},
>> +}};
> 
> Maybe define them as static const?
> 
>> +
>> +#define NUM_VARIANTS 1
>> +#define NUM_SAMPLES(i) (variants[i].count)
>> +#define VARIANT(i) (variants[i].name)
>> +
>> +// Cannot pass volatile pointer to svst1. This still does not appear to get optimised out.
>> +static {rtype} volatile ret;
>> +
>> +#define BENCH_FUNC(i, j) ({{ ret = CALL_BENCH_FUNC(i, j); }})
>> +#define FUNCNAME "{fname}"
>> +#include <bench-libmvec-skeleton.c>
>> +"""
>> +
>> +def main(name):
>> +    _, prec, _, func = name.split("-")
>> +    scalar_to_advsimd_type = {"double": "float64x2_t", "float": "float32x4_t"}
>> +
>> +    stride = {"double": 2, "float": 4}[prec]
>> +    rtype = scalar_to_advsimd_type[prec]
>> +    atype = scalar_to_advsimd_type[prec]
>> +    fname = f"_ZGVnN{stride}v_{func}{'f' if prec == 'float' else ''}"
>> +    prec_short = {"double": 64, "float": 32}[prec]
>> +
>> +    with open(f"../benchtests/{func}-inputs") as f:
>> +        in_vals = [l.strip() for l in f.readlines() if l and not l.startswith("#")]
>> +    in_vals = [in_vals[i:i+stride] for i in range(0, len(in_vals), stride)]
>> +    rowcount= len(in_vals)
>> +    in_data = ",\n".join("{{" + ", ".join(row) + "}, 0}" for row in in_vals)
>> +
>> +    print(TEMPLATE.format(stride=stride,
>> +                          rtype=rtype,
>> +                          atype=atype,
>> +                          fname=fname,
>> +                          prec_short=prec_short,
>> +                          in_data=in_data,
>> +                          rowcount=rowcount,
>> +                          stype=prec))
>> +
>> +
>> +if __name__ == "__main__":
>> +    main(sys.argv[1])
>> diff --git a/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py b/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>> new file mode 100755
>> index 0000000000..0ea21c4c69
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/scripts/bench_libmvec_sve.py
>> @@ -0,0 +1,93 @@
>> +#!/usr/bin/python3
>> +# Copyright (C) 2023 Free Software Foundation, Inc.
>> +# This file is part of the GNU C Library.
>> +#
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# <https://www.gnu.org/licenses/>.
>> +
>> +import sys
>> +
>> +TEMPLATE = """
>> +#include <math.h>
>> +#include <arm_sve.h>
>> +
>> +#define STRIDE {stride}
>> +
>> +#define CALL_BENCH_FUNC(v, i) (__extension__ ({{                         \\
>> +   {rtype} mx0 = {fname}(svld1rq_f{prec_short} (svptrue_b{prec_short}(), \\
>> +                                                variants[v].in[i].arg0), \\
>> +                         svptrue_b{prec_short}());                       \\
>> +   mx0; }}))
>> +
>> +struct args
>> +{{
>> +  {stype} arg0[STRIDE];
>> +  double timing;
>> +}};
>> +
>> +struct _variants
>> +{{
>> +  const char *name;
>> +  int count;
>> +  struct args *in;
>> +}};
>> +
>> +struct args in0[{rowcount}] = {{
>> +{in_data}
>> +}};
>> +
>> +struct _variants variants[1] = {{
>> +  {{"", {rowcount}, in0}},
>> +}};
>> +
>> +#define NUM_VARIANTS 1
>> +#define NUM_SAMPLES(i) (variants[i].count)
>> +#define VARIANT(i) (variants[i].name)
>> +
>> +// Cannot pass volatile pointer to svst1. This still does not appear to get optimised out.
>> +static {stype} /*volatile*/ ret[STRIDE];
>> +
>> +#define BENCH_FUNC(i, j) ({{ svst1_f{prec_short}(svwhilelt_b{prec_short}(0, 4), ret, CALL_BENCH_FUNC(i, j)); }})
>> +#define FUNCNAME "{fname}"
>> +#include <bench-libmvec-skeleton.c>
>> +"""
>> +
>> +def main(name):
>> +    _, prec, _, func = name.split("-")
>> +    scalar_to_sve_type = {"double": "svfloat64_t", "float": "svfloat32_t"}
>> +
>> +    stride = {"double": 2, "float": 4}[prec]
>> +    rtype = scalar_to_sve_type[prec]
>> +    atype = scalar_to_sve_type[prec]
>> +    fname = f"_ZGVsMxv_{func}{'f' if prec == 'float' else ''}"
>> +    prec_short = {"double": 64, "float": 32}[prec]
>> +
>> +    with open(f"../benchtests/{func}-inputs") as f:
>> +        in_vals = [l.strip() for l in f.readlines() if l and not l.startswith("#")]
>> +    in_vals = [in_vals[i:i+stride] for i in range(0, len(in_vals), stride)]
>> +    rowcount= len(in_vals)
>> +    in_data = ",\n".join("{{" + ", ".join(row) + "}, 0}" for row in in_vals)
>> +
>> +    print(TEMPLATE.format(stride=stride,
>> +                          rtype=rtype,
>> +                          atype=atype,
>> +                          fname=fname,
>> +                          prec_short=prec_short,
>> +                          in_data=in_data,
>> +                          rowcount=rowcount,
>> +                          stype=prec))
>> +
>> +
>> +if __name__ == "__main__":
>> +    main(sys.argv[1])
>> diff --git a/sysdeps/aarch64/fpu/sve_utils.h b/sysdeps/aarch64/fpu/sve_utils.h
>> new file mode 100644
>> index 0000000000..dbdc03387c
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/sve_utils.h
>> @@ -0,0 +1,55 @@
>> +/* Helpers for SVE vector math funtions.
> 
> s/funtions/functions
> 
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <arm_sve.h>
>> +
>> +#define SV_NAME_F1(fun) _ZGVsMxv_##fun##f
>> +#define SV_NAME_D1(fun) _ZGVsMxv_##fun
>> +#define SV_NAME_F2(fun) _ZGVsMxvv_##fun##f
>> +#define SV_NAME_D2(fun) _ZGVsMxvv_##fun
>> +
>> +static inline svfloat32_t
>> +sv_call_f32 (float (*f) (float), svfloat32_t x, svfloat32_t y, svbool_t cmp)
>> +{
>> +  svbool_t p = svpfirst (cmp, svpfalse ());
>> +  while (svptest_any (cmp, p))
>> +    {
>> +      float elem = svclastb_n_f32 (p, 0, x);
>> +      elem = (*f) (elem);
>> +      svfloat32_t y2 = svdup_n_f32 (elem);
>> +      y = svsel_f32 (p, y2, y);
>> +      p = svpnext_b32 (cmp, p);
>> +    }
>> +  return y;
>> +}
>> +
>> +static inline svfloat64_t
>> +sv_call_f64 (double (*f) (double), svfloat64_t x, svfloat64_t y, svbool_t cmp)
>> +{
>> +  svbool_t p = svpfirst (cmp, svpfalse ());
>> +  while (svptest_any (cmp, p))
>> +    {
>> +      double elem = svclastb_n_f64 (p, 0, x);
>> +      elem = (*f) (elem);
>> +      svfloat64_t y2 = svdup_n_f64 (elem);
>> +      y = svsel_f64 (p, y2, y);
>> +      p = svpnext_b64 (cmp, p);
>> +    }
>> +  return y;
>> +}
>> diff --git a/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>> new file mode 100644
>> index 0000000000..52e330f469
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
>> @@ -0,0 +1,26 @@
>> +/* Scalar wrappers for double-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <arm_neon.h>
>> +
>> +#include "test-double-advsimd.h"
>> +
>> +#define VEC_TYPE float64x2_t
>> +
>> +VPCS_VECTOR_WRAPPER(cos_advsimd, _ZGVnN2v_cos)
>> diff --git a/sysdeps/aarch64/fpu/test-double-advsimd.h b/sysdeps/aarch64/fpu/test-double-advsimd.h
>> new file mode 100644
>> index 0000000000..8bd32b97fa
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-advsimd.h
>> @@ -0,0 +1,25 @@
>> +/* Test declarations for double-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include "test-double.h"
>> +#include "test-math-vector.h"
>> +#include "test-vpcs-vector-wrapper.h"
>> +
>> +#define VEC_SUFF _advsimd
>> +#define VEC_LEN 2
>> diff --git a/sysdeps/aarch64/fpu/test-double-sve-wrappers.c b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>> new file mode 100644
>> index 0000000000..8edc5ed5ab
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
>> @@ -0,0 +1,34 @@
>> +/* Scalar wrappers for double-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <arm_sve.h>
>> +
>> +#include "test-double-sve.h"
>> +
>> +/* Wrapper from scalar to SVE function. Cannot just use VECTOR_WRAPPER due to predication.  */
>> +#define SVE_VECTOR_WRAPPER(scalar_func, vector_func)			\
>> +  extern VEC_TYPE vector_func (VEC_TYPE, svbool_t);			\
>> +FLOAT scalar_func (FLOAT x)						\
>> +{									\
>> +  VEC_TYPE mx = svdup_n_f64 (x);					\
>> +  VEC_TYPE mr = vector_func (mx, svptrue_b64 ());			\
>> +  return svlastb_f64 (svptrue_b64 (), mr);				\
>> +}
>> +
>> +SVE_VECTOR_WRAPPER(cos_sve, _ZGVsMxv_cos)
>> diff --git a/sysdeps/aarch64/fpu/test-double-sve.h b/sysdeps/aarch64/fpu/test-double-sve.h
>> new file mode 100644
>> index 0000000000..857a40861d
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-double-sve.h
>> @@ -0,0 +1,26 @@
>> +/* Test declarations for double-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include "test-double.h"
>> +#include "test-math-vector.h"
>> +
>> +#define REQUIRE_SVE
>> +#define VEC_SUFF _sve
>> +#define VEC_LEN svcntd()
>> +#define VEC_TYPE svfloat64_t
>> diff --git a/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>> new file mode 100644
>> index 0000000000..3577ca93b8
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
>> @@ -0,0 +1,26 @@
>> +/* Scalar wrappers for single-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <arm_neon.h>
>> +
>> +#include "test-float-advsimd.h"
>> +
>> +#define VEC_TYPE float32x4_t
>> +
>> +VPCS_VECTOR_WRAPPER(cosf_advsimd, _ZGVnN4v_cosf)
>> diff --git a/sysdeps/aarch64/fpu/test-float-advsimd.h b/sysdeps/aarch64/fpu/test-float-advsimd.h
>> new file mode 100644
>> index 0000000000..86fce613cd
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-advsimd.h
>> @@ -0,0 +1,25 @@
>> +/* Test declarations for singlex-precision Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include "test-float.h"
>> +#include "test-math-vector.h"
>> +#include "test-vpcs-vector-wrapper.h"
>> +
>> +#define VEC_SUFF _advsimd
>> +#define VEC_LEN 4
>> diff --git a/sysdeps/aarch64/fpu/test-float-sve-wrappers.c b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>> new file mode 100644
>> index 0000000000..b6a944d502
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
>> @@ -0,0 +1,34 @@
>> +/* Scalar wrappers for single-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <arm_sve.h>
>> +
>> +#include "test-float-sve.h"
>> +
>> +/* Wrapper from scalar to SVE function. Cannot just use VECTOR_WRAPPER due to predication.  */
>> +#define SVE_VECTOR_WRAPPER(scalar_func, vector_func)			\
>> +  extern VEC_TYPE vector_func (VEC_TYPE, svbool_t);			\
>> +FLOAT scalar_func (FLOAT x)						\
>> +{									\
>> +  VEC_TYPE mx = svdup_n_f32 (x);					\
>> +  VEC_TYPE mr = vector_func (mx, svptrue_b32 ());			\
>> +  return svlastb_f32 (svptrue_b32 (), mr);				\
>> +}
>> +
>> +SVE_VECTOR_WRAPPER(cosf_sve, _ZGVsMxv_cosf)
>> diff --git a/sysdeps/aarch64/fpu/test-float-sve.h b/sysdeps/aarch64/fpu/test-float-sve.h
>> new file mode 100644
>> index 0000000000..d6e122cf67
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-float-sve.h
>> @@ -0,0 +1,26 @@
>> +/* Test declarations for single-precision SVE vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include "test-float.h"
>> +#include "test-math-vector.h"
>> +
>> +#define REQUIRE_SVE
>> +#define VEC_SUFF _sve
>> +#define VEC_LEN svcntw()
>> +#define VEC_TYPE svfloat32_t
>> diff --git a/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h b/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>> new file mode 100644
>> index 0000000000..eb0f0db838
>> --- /dev/null
>> +++ b/sysdeps/aarch64/fpu/test-vpcs-vector-wrapper.h
>> @@ -0,0 +1,30 @@
>> +/* Scalar wrapper for vpcs-enabled Advanced SIMD vector math functions.
>> +
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#define VPCS_VECTOR_WRAPPER(scalar_func, vector_func)				\
>> +extern __attribute__ ((aarch64_vector_pcs)) VEC_TYPE vector_func (VEC_TYPE);	\
>> +FLOAT scalar_func (FLOAT x)							\
>> +{										\
>> +  int i;									\
>> +  VEC_TYPE mx;									\
>> +  INIT_VEC_LOOP (mx, x, VEC_LEN);						\
>> +  VEC_TYPE mr = vector_func (mx);						\
>> +  TEST_VEC_LOOP (mr, VEC_LEN);							\
>> +  return ((FLOAT) mr[0]);							\
>> +}
>> diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>> new file mode 100644
>> index 0000000000..13af421af2
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
>> @@ -0,0 +1,4 @@
>> +GLIBC_2.38 _ZGVnN2v_cos F
>> +GLIBC_2.38 _ZGVnN4v_cosf F
>> +GLIBC_2.38 _ZGVsMxv_cos F
>> +GLIBC_2.38 _ZGVsMxv_cosf F

next prev parent reply	other threads:[~2023-02-09 12:43 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-07 11:35 Joe Ramsay
2023-02-07 13:11 ` Carlos Seo
2023-02-08 13:11 ` Adhemerval Zanella Netto
2023-02-09 12:43   ` Joe Ramsay [this message]
2023-02-09 12:55     ` Adhemerval Zanella Netto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6565ff59-1a71-bdca-83f3-1f8b4f8f2e35@arm.com \
    --to=joe.ramsay@arm.com \
    --cc=adhemerval.zanella@linaro.org \
    --cc=libc-alpha@sourceware.org \
    --cc=szabolcs.nagy@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).