public inbox for gcc-patches@gcc.gnu.org
From: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
To: Christophe Lyon <Christophe.Lyon@arm.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	Richard Earnshaw <Richard.Earnshaw@arm.com>,
	Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Christophe Lyon <Christophe.Lyon@arm.com>
Subject: RE: [PATCH 12/23] arm: [MVE intrinsics] rework vqshlq vshlq
Date: Fri, 5 May 2023 10:58:09 +0000
Message-ID: <PAXPR08MB69260AF3268042306687D89B93729@PAXPR08MB6926.eurprd08.prod.outlook.com>
In-Reply-To: <20230505083930.101210-12-christophe.lyon@arm.com>



> -----Original Message-----
> From: Christophe Lyon <christophe.lyon@arm.com>
> Sent: Friday, May 5, 2023 9:39 AM
> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>;
> Richard Earnshaw <Richard.Earnshaw@arm.com>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Cc: Christophe Lyon <Christophe.Lyon@arm.com>
> Subject: [PATCH 12/23] arm: [MVE intrinsics] rework vqshlq vshlq
> 
> Implement vqshlq, vshlq using the new MVE builtins framework.
> 

Ok.
Thanks,
Kyrill

> 2022-09-08  Christophe Lyon  <christophe.lyon@arm.com>
> 
> 	gcc/
> 	* config/arm/arm-mve-builtins-base.cc (FUNCTION_WITH_M_N_R): New.
> 	(vqshlq, vshlq): New.
> 	* config/arm/arm-mve-builtins-base.def (vqshlq, vshlq): New.
> 	* config/arm/arm-mve-builtins-base.h (vqshlq, vshlq): New.
> 	* config/arm/arm_mve.h (vshlq): Remove.
> 	(vshlq_r): Remove.
> 	(vshlq_n): Remove.
> 	(vshlq_m_r): Remove.
> 	(vshlq_m): Remove.
> 	(vshlq_m_n): Remove.
> 	(vshlq_x): Remove.
> 	(vshlq_x_n): Remove.
> 	(vshlq_s8): Remove.
> 	(vshlq_s16): Remove.
> 	(vshlq_s32): Remove.
> 	(vshlq_u8): Remove.
> 	(vshlq_u16): Remove.
> 	(vshlq_u32): Remove.
> 	(vshlq_r_u8): Remove.
> 	(vshlq_n_u8): Remove.
> 	(vshlq_r_s8): Remove.
> 	(vshlq_n_s8): Remove.
> 	(vshlq_r_u16): Remove.
> 	(vshlq_n_u16): Remove.
> 	(vshlq_r_s16): Remove.
> 	(vshlq_n_s16): Remove.
> 	(vshlq_r_u32): Remove.
> 	(vshlq_n_u32): Remove.
> 	(vshlq_r_s32): Remove.
> 	(vshlq_n_s32): Remove.
> 	(vshlq_m_r_u8): Remove.
> 	(vshlq_m_r_s8): Remove.
> 	(vshlq_m_r_u16): Remove.
> 	(vshlq_m_r_s16): Remove.
> 	(vshlq_m_r_u32): Remove.
> 	(vshlq_m_r_s32): Remove.
> 	(vshlq_m_u8): Remove.
> 	(vshlq_m_s8): Remove.
> 	(vshlq_m_u16): Remove.
> 	(vshlq_m_s16): Remove.
> 	(vshlq_m_u32): Remove.
> 	(vshlq_m_s32): Remove.
> 	(vshlq_m_n_s8): Remove.
> 	(vshlq_m_n_s32): Remove.
> 	(vshlq_m_n_s16): Remove.
> 	(vshlq_m_n_u8): Remove.
> 	(vshlq_m_n_u32): Remove.
> 	(vshlq_m_n_u16): Remove.
> 	(vshlq_x_s8): Remove.
> 	(vshlq_x_s16): Remove.
> 	(vshlq_x_s32): Remove.
> 	(vshlq_x_u8): Remove.
> 	(vshlq_x_u16): Remove.
> 	(vshlq_x_u32): Remove.
> 	(vshlq_x_n_s8): Remove.
> 	(vshlq_x_n_s16): Remove.
> 	(vshlq_x_n_s32): Remove.
> 	(vshlq_x_n_u8): Remove.
> 	(vshlq_x_n_u16): Remove.
> 	(vshlq_x_n_u32): Remove.
> 	(__arm_vshlq_s8): Remove.
> 	(__arm_vshlq_s16): Remove.
> 	(__arm_vshlq_s32): Remove.
> 	(__arm_vshlq_u8): Remove.
> 	(__arm_vshlq_u16): Remove.
> 	(__arm_vshlq_u32): Remove.
> 	(__arm_vshlq_r_u8): Remove.
> 	(__arm_vshlq_n_u8): Remove.
> 	(__arm_vshlq_r_s8): Remove.
> 	(__arm_vshlq_n_s8): Remove.
> 	(__arm_vshlq_r_u16): Remove.
> 	(__arm_vshlq_n_u16): Remove.
> 	(__arm_vshlq_r_s16): Remove.
> 	(__arm_vshlq_n_s16): Remove.
> 	(__arm_vshlq_r_u32): Remove.
> 	(__arm_vshlq_n_u32): Remove.
> 	(__arm_vshlq_r_s32): Remove.
> 	(__arm_vshlq_n_s32): Remove.
> 	(__arm_vshlq_m_r_u8): Remove.
> 	(__arm_vshlq_m_r_s8): Remove.
> 	(__arm_vshlq_m_r_u16): Remove.
> 	(__arm_vshlq_m_r_s16): Remove.
> 	(__arm_vshlq_m_r_u32): Remove.
> 	(__arm_vshlq_m_r_s32): Remove.
> 	(__arm_vshlq_m_u8): Remove.
> 	(__arm_vshlq_m_s8): Remove.
> 	(__arm_vshlq_m_u16): Remove.
> 	(__arm_vshlq_m_s16): Remove.
> 	(__arm_vshlq_m_u32): Remove.
> 	(__arm_vshlq_m_s32): Remove.
> 	(__arm_vshlq_m_n_s8): Remove.
> 	(__arm_vshlq_m_n_s32): Remove.
> 	(__arm_vshlq_m_n_s16): Remove.
> 	(__arm_vshlq_m_n_u8): Remove.
> 	(__arm_vshlq_m_n_u32): Remove.
> 	(__arm_vshlq_m_n_u16): Remove.
> 	(__arm_vshlq_x_s8): Remove.
> 	(__arm_vshlq_x_s16): Remove.
> 	(__arm_vshlq_x_s32): Remove.
> 	(__arm_vshlq_x_u8): Remove.
> 	(__arm_vshlq_x_u16): Remove.
> 	(__arm_vshlq_x_u32): Remove.
> 	(__arm_vshlq_x_n_s8): Remove.
> 	(__arm_vshlq_x_n_s16): Remove.
> 	(__arm_vshlq_x_n_s32): Remove.
> 	(__arm_vshlq_x_n_u8): Remove.
> 	(__arm_vshlq_x_n_u16): Remove.
> 	(__arm_vshlq_x_n_u32): Remove.
> 	(__arm_vshlq): Remove.
> 	(__arm_vshlq_r): Remove.
> 	(__arm_vshlq_n): Remove.
> 	(__arm_vshlq_m_r): Remove.
> 	(__arm_vshlq_m): Remove.
> 	(__arm_vshlq_m_n): Remove.
> 	(__arm_vshlq_x): Remove.
> 	(__arm_vshlq_x_n): Remove.
> 	(vqshlq): Remove.
> 	(vqshlq_r): Remove.
> 	(vqshlq_n): Remove.
> 	(vqshlq_m_r): Remove.
> 	(vqshlq_m_n): Remove.
> 	(vqshlq_m): Remove.
> 	(vqshlq_u8): Remove.
> 	(vqshlq_r_u8): Remove.
> 	(vqshlq_n_u8): Remove.
> 	(vqshlq_s8): Remove.
> 	(vqshlq_r_s8): Remove.
> 	(vqshlq_n_s8): Remove.
> 	(vqshlq_u16): Remove.
> 	(vqshlq_r_u16): Remove.
> 	(vqshlq_n_u16): Remove.
> 	(vqshlq_s16): Remove.
> 	(vqshlq_r_s16): Remove.
> 	(vqshlq_n_s16): Remove.
> 	(vqshlq_u32): Remove.
> 	(vqshlq_r_u32): Remove.
> 	(vqshlq_n_u32): Remove.
> 	(vqshlq_s32): Remove.
> 	(vqshlq_r_s32): Remove.
> 	(vqshlq_n_s32): Remove.
> 	(vqshlq_m_r_u8): Remove.
> 	(vqshlq_m_r_s8): Remove.
> 	(vqshlq_m_r_u16): Remove.
> 	(vqshlq_m_r_s16): Remove.
> 	(vqshlq_m_r_u32): Remove.
> 	(vqshlq_m_r_s32): Remove.
> 	(vqshlq_m_n_s8): Remove.
> 	(vqshlq_m_n_s32): Remove.
> 	(vqshlq_m_n_s16): Remove.
> 	(vqshlq_m_n_u8): Remove.
> 	(vqshlq_m_n_u32): Remove.
> 	(vqshlq_m_n_u16): Remove.
> 	(vqshlq_m_s8): Remove.
> 	(vqshlq_m_s32): Remove.
> 	(vqshlq_m_s16): Remove.
> 	(vqshlq_m_u8): Remove.
> 	(vqshlq_m_u32): Remove.
> 	(vqshlq_m_u16): Remove.
> 	(__arm_vqshlq_u8): Remove.
> 	(__arm_vqshlq_r_u8): Remove.
> 	(__arm_vqshlq_n_u8): Remove.
> 	(__arm_vqshlq_s8): Remove.
> 	(__arm_vqshlq_r_s8): Remove.
> 	(__arm_vqshlq_n_s8): Remove.
> 	(__arm_vqshlq_u16): Remove.
> 	(__arm_vqshlq_r_u16): Remove.
> 	(__arm_vqshlq_n_u16): Remove.
> 	(__arm_vqshlq_s16): Remove.
> 	(__arm_vqshlq_r_s16): Remove.
> 	(__arm_vqshlq_n_s16): Remove.
> 	(__arm_vqshlq_u32): Remove.
> 	(__arm_vqshlq_r_u32): Remove.
> 	(__arm_vqshlq_n_u32): Remove.
> 	(__arm_vqshlq_s32): Remove.
> 	(__arm_vqshlq_r_s32): Remove.
> 	(__arm_vqshlq_n_s32): Remove.
> 	(__arm_vqshlq_m_r_u8): Remove.
> 	(__arm_vqshlq_m_r_s8): Remove.
> 	(__arm_vqshlq_m_r_u16): Remove.
> 	(__arm_vqshlq_m_r_s16): Remove.
> 	(__arm_vqshlq_m_r_u32): Remove.
> 	(__arm_vqshlq_m_r_s32): Remove.
> 	(__arm_vqshlq_m_n_s8): Remove.
> 	(__arm_vqshlq_m_n_s32): Remove.
> 	(__arm_vqshlq_m_n_s16): Remove.
> 	(__arm_vqshlq_m_n_u8): Remove.
> 	(__arm_vqshlq_m_n_u32): Remove.
> 	(__arm_vqshlq_m_n_u16): Remove.
> 	(__arm_vqshlq_m_s8): Remove.
> 	(__arm_vqshlq_m_s32): Remove.
> 	(__arm_vqshlq_m_s16): Remove.
> 	(__arm_vqshlq_m_u8): Remove.
> 	(__arm_vqshlq_m_u32): Remove.
> 	(__arm_vqshlq_m_u16): Remove.
> 	(__arm_vqshlq): Remove.
> 	(__arm_vqshlq_r): Remove.
> 	(__arm_vqshlq_n): Remove.
> 	(__arm_vqshlq_m_r): Remove.
> 	(__arm_vqshlq_m_n): Remove.
> 	(__arm_vqshlq_m): Remove.
> ---
>  gcc/config/arm/arm-mve-builtins-base.cc  |   13 +
>  gcc/config/arm/arm-mve-builtins-base.def |    4 +
>  gcc/config/arm/arm-mve-builtins-base.h   |    2 +
>  gcc/config/arm/arm_mve.h                 | 1552 +---------------------
>  4 files changed, 49 insertions(+), 1522 deletions(-)
> 
> diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
> index a74119db917..4bebf86f784 100644
> --- a/gcc/config/arm/arm-mve-builtins-base.cc
> +++ b/gcc/config/arm/arm-mve-builtins-base.cc
> @@ -128,6 +128,17 @@ namespace arm_mve {
>      UNSPEC##_M_S, UNSPEC##_M_U, -1,					\
>      UNSPEC##_M_N_S, UNSPEC##_M_N_U, -1))
> 
> +  /* Helper for vshl builtins with only unspec codes, _m predicated
> +     and _n and _r overrides.  */
> +#define FUNCTION_WITH_M_N_R(NAME, UNSPEC) FUNCTION			\
> +  (NAME, unspec_mve_function_exact_insn_vshl,				\
> +   (UNSPEC##_S, UNSPEC##_U,						\
> +    UNSPEC##_N_S, UNSPEC##_N_U,						\
> +    UNSPEC##_M_S, UNSPEC##_M_U,						\
> +    UNSPEC##_M_N_S, UNSPEC##_M_N_U,					\
> +    UNSPEC##_M_R_S, UNSPEC##_M_R_U,					\
> +    UNSPEC##_R_S, UNSPEC##_R_U))
> +
>    /* Helper for builtins with only unspec codes, _m predicated
>       overrides, no _n and no floating-point version.  */
>  #define FUNCTION_WITHOUT_N_NO_F(NAME, UNSPEC) FUNCTION			\
> @@ -169,11 +180,13 @@ FUNCTION_WITH_M_N_NO_F (vqaddq, VQADDQ)
>  FUNCTION_WITH_M_N_NO_U_F (vqdmulhq, VQDMULHQ)
>  FUNCTION_WITH_M_N_NO_F (vqrshlq, VQRSHLQ)
>  FUNCTION_WITH_M_N_NO_U_F (vqrdmulhq, VQRDMULHQ)
> +FUNCTION_WITH_M_N_R (vqshlq, VQSHLQ)
>  FUNCTION_WITH_M_N_NO_F (vqsubq, VQSUBQ)
>  FUNCTION (vreinterpretq, vreinterpretq_impl,)
>  FUNCTION_WITHOUT_N_NO_F (vrhaddq, VRHADDQ)
>  FUNCTION_WITHOUT_N_NO_F (vrmulhq, VRMULHQ)
>  FUNCTION_WITH_M_N_NO_F (vrshlq, VRSHLQ)
> +FUNCTION_WITH_M_N_R (vshlq, VSHLQ)
>  FUNCTION_WITH_RTX_M_N (vsubq, MINUS, VSUBQ)
>  FUNCTION (vuninitializedq, vuninitializedq_impl,)
> 
> diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
> index 9230837fd43..f2e40cda2af 100644
> --- a/gcc/config/arm/arm-mve-builtins-base.def
> +++ b/gcc/config/arm/arm-mve-builtins-base.def
> @@ -32,11 +32,15 @@ DEF_MVE_FUNCTION (vqaddq, binary_opt_n, all_integer, m_or_none)
>  DEF_MVE_FUNCTION (vqdmulhq, binary_opt_n, all_signed, m_or_none)
>  DEF_MVE_FUNCTION (vqrdmulhq, binary_opt_n, all_signed, m_or_none)
>  DEF_MVE_FUNCTION (vqrshlq, binary_round_lshift, all_integer, m_or_none)
> +DEF_MVE_FUNCTION (vqshlq, binary_lshift, all_integer, m_or_none)
> +DEF_MVE_FUNCTION (vqshlq, binary_lshift_r, all_integer, m_or_none)
>  DEF_MVE_FUNCTION (vqsubq, binary_opt_n, all_integer, m_or_none)
>  DEF_MVE_FUNCTION (vreinterpretq, unary_convert, reinterpret_integer, none)
>  DEF_MVE_FUNCTION (vrhaddq, binary, all_integer, mx_or_none)
>  DEF_MVE_FUNCTION (vrmulhq, binary, all_integer, mx_or_none)
>  DEF_MVE_FUNCTION (vrshlq, binary_round_lshift, all_integer, mx_or_none)
> +DEF_MVE_FUNCTION (vshlq, binary_lshift, all_integer, mx_or_none)
> +DEF_MVE_FUNCTION (vshlq, binary_lshift_r, all_integer, m_or_none) // "_r" forms do not support the "x" predicate
>  DEF_MVE_FUNCTION (vsubq, binary_opt_n, all_integer, mx_or_none)
>  DEF_MVE_FUNCTION (vuninitializedq, inherent, all_integer_with_64, none)
>  #undef REQUIRES_FLOAT
> diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
> index d9d45d1925a..5b62de6a922 100644
> --- a/gcc/config/arm/arm-mve-builtins-base.h
> +++ b/gcc/config/arm/arm-mve-builtins-base.h
> @@ -37,11 +37,13 @@ extern const function_base *const vqaddq;
>  extern const function_base *const vqdmulhq;
>  extern const function_base *const vqrdmulhq;
>  extern const function_base *const vqrshlq;
> +extern const function_base *const vqshlq;
>  extern const function_base *const vqsubq;
>  extern const function_base *const vreinterpretq;
>  extern const function_base *const vrhaddq;
>  extern const function_base *const vrmulhq;
>  extern const function_base *const vrshlq;
> +extern const function_base *const vshlq;
>  extern const function_base *const vsubq;
>  extern const function_base *const vuninitializedq;
> 
> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
> index 175d9955c33..ad67dcfd024 100644
> --- a/gcc/config/arm/arm_mve.h
> +++ b/gcc/config/arm/arm_mve.h
> @@ -60,7 +60,6 @@
>  #define vshrq(__a, __imm) __arm_vshrq(__a, __imm)
>  #define vaddlvq_p(__a, __p) __arm_vaddlvq_p(__a, __p)
>  #define vcmpneq(__a, __b) __arm_vcmpneq(__a, __b)
> -#define vshlq(__a, __b) __arm_vshlq(__a, __b)
>  #define vornq(__a, __b) __arm_vornq(__a, __b)
>  #define vmulltq_int(__a, __b) __arm_vmulltq_int(__a, __b)
>  #define vmullbq_int(__a, __b) __arm_vmullbq_int(__a, __b)
> @@ -77,17 +76,12 @@
>  #define vbicq(__a, __b) __arm_vbicq(__a, __b)
>  #define vaddvq_p(__a, __p) __arm_vaddvq_p(__a, __p)
>  #define vaddvaq(__a, __b) __arm_vaddvaq(__a, __b)
> -#define vshlq_r(__a, __b) __arm_vshlq_r(__a, __b)
> -#define vqshlq(__a, __b) __arm_vqshlq(__a, __b)
> -#define vqshlq_r(__a, __b) __arm_vqshlq_r(__a, __b)
>  #define vminavq(__a, __b) __arm_vminavq(__a, __b)
>  #define vminaq(__a, __b) __arm_vminaq(__a, __b)
>  #define vmaxavq(__a, __b) __arm_vmaxavq(__a, __b)
>  #define vmaxaq(__a, __b) __arm_vmaxaq(__a, __b)
>  #define vbrsrq(__a, __b) __arm_vbrsrq(__a, __b)
> -#define vshlq_n(__a, __imm) __arm_vshlq_n(__a, __imm)
>  #define vrshrq(__a, __imm) __arm_vrshrq(__a, __imm)
> -#define vqshlq_n(__a, __imm) __arm_vqshlq_n(__a, __imm)
>  #define vcmpltq(__a, __b) __arm_vcmpltq(__a, __b)
>  #define vcmpleq(__a, __b) __arm_vcmpleq(__a, __b)
>  #define vcmpgtq(__a, __b) __arm_vcmpgtq(__a, __b)
> @@ -148,8 +142,6 @@
>  #define vaddvaq_p(__a, __b, __p) __arm_vaddvaq_p(__a, __b, __p)
>  #define vsriq(__a, __b, __imm) __arm_vsriq(__a, __b, __imm)
>  #define vsliq(__a, __b, __imm) __arm_vsliq(__a, __b, __imm)
> -#define vshlq_m_r(__a, __b, __p) __arm_vshlq_m_r(__a, __b, __p)
> -#define vqshlq_m_r(__a, __b, __p) __arm_vqshlq_m_r(__a, __b, __p)
>  #define vminavq_p(__a, __b, __p) __arm_vminavq_p(__a, __b, __p)
>  #define vminaq_m(__a, __b, __p) __arm_vminaq_m(__a, __b, __p)
>  #define vmaxavq_p(__a, __b, __p) __arm_vmaxavq_p(__a, __b, __p)
> @@ -216,7 +208,6 @@
>  #define vsriq_m(__a, __b, __imm, __p) __arm_vsriq_m(__a, __b, __imm, __p)
>  #define vqshluq_m(__inactive, __a, __imm, __p) __arm_vqshluq_m(__inactive, __a, __imm, __p)
>  #define vabavq_p(__a, __b, __c, __p) __arm_vabavq_p(__a, __b, __c, __p)
> -#define vshlq_m(__inactive, __a, __b, __p) __arm_vshlq_m(__inactive, __a, __b, __p)
>  #define vbicq_m(__inactive, __a, __b, __p) __arm_vbicq_m(__inactive, __a, __b, __p)
>  #define vbrsrq_m(__inactive, __a, __b, __p) __arm_vbrsrq_m(__inactive, __a, __b, __p)
>  #define vcaddq_rot270_m(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m(__inactive, __a, __b, __p)
> @@ -246,10 +237,7 @@
>  #define vqrdmlashq_m(__a, __b, __c, __p) __arm_vqrdmlashq_m(__a, __b, __c, __p)
>  #define vqrdmlsdhq_m(__inactive, __a, __b, __p) __arm_vqrdmlsdhq_m(__inactive, __a, __b, __p)
>  #define vqrdmlsdhxq_m(__inactive, __a, __b, __p) __arm_vqrdmlsdhxq_m(__inactive, __a, __b, __p)
> -#define vqshlq_m_n(__inactive, __a, __imm, __p) __arm_vqshlq_m_n(__inactive, __a, __imm, __p)
> -#define vqshlq_m(__inactive, __a, __b, __p) __arm_vqshlq_m(__inactive, __a, __b, __p)
>  #define vrshrq_m(__inactive, __a, __imm, __p) __arm_vrshrq_m(__inactive, __a, __imm, __p)
> -#define vshlq_m_n(__inactive, __a, __imm, __p) __arm_vshlq_m_n(__inactive, __a, __imm, __p)
>  #define vshrq_m(__inactive, __a, __imm, __p) __arm_vshrq_m(__inactive, __a, __imm, __p)
>  #define vsliq_m(__a, __b, __imm, __p) __arm_vsliq_m(__a, __b, __imm, __p)
>  #define vmlaldavaq_p(__a, __b, __c, __p) __arm_vmlaldavaq_p(__a, __b, __c, __p)
> @@ -376,8 +364,6 @@
>  #define vrev64q_x(__a, __p) __arm_vrev64q_x(__a, __p)
>  #define vshllbq_x(__a, __imm, __p) __arm_vshllbq_x(__a, __imm, __p)
>  #define vshlltq_x(__a, __imm, __p) __arm_vshlltq_x(__a, __imm, __p)
> -#define vshlq_x(__a, __b, __p) __arm_vshlq_x(__a, __b, __p)
> -#define vshlq_x_n(__a, __imm, __p) __arm_vshlq_x_n(__a, __imm, __p)
>  #define vrshrq_x(__a, __imm, __p) __arm_vrshrq_x(__a, __imm, __p)
>  #define vshrq_x(__a, __imm, __p) __arm_vshrq_x(__a, __imm, __p)
>  #define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out)
> @@ -623,12 +609,6 @@
>  #define vcmpneq_u8(__a, __b) __arm_vcmpneq_u8(__a, __b)
>  #define vcmpneq_u16(__a, __b) __arm_vcmpneq_u16(__a, __b)
>  #define vcmpneq_u32(__a, __b) __arm_vcmpneq_u32(__a, __b)
> -#define vshlq_s8(__a, __b) __arm_vshlq_s8(__a, __b)
> -#define vshlq_s16(__a, __b) __arm_vshlq_s16(__a, __b)
> -#define vshlq_s32(__a, __b) __arm_vshlq_s32(__a, __b)
> -#define vshlq_u8(__a, __b) __arm_vshlq_u8(__a, __b)
> -#define vshlq_u16(__a, __b) __arm_vshlq_u16(__a, __b)
> -#define vshlq_u32(__a, __b) __arm_vshlq_u32(__a, __b)
>  #define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b)
>  #define vmulltq_int_u8(__a, __b) __arm_vmulltq_int_u8(__a, __b)
>  #define vmullbq_int_u8(__a, __b) __arm_vmullbq_int_u8(__a, __b)
> @@ -649,17 +629,12 @@
>  #define vbicq_u8(__a, __b) __arm_vbicq_u8(__a, __b)
>  #define vaddvq_p_u8(__a, __p) __arm_vaddvq_p_u8(__a, __p)
>  #define vaddvaq_u8(__a, __b) __arm_vaddvaq_u8(__a, __b)
> -#define vshlq_r_u8(__a, __b) __arm_vshlq_r_u8(__a, __b)
> -#define vqshlq_u8(__a, __b) __arm_vqshlq_u8(__a, __b)
> -#define vqshlq_r_u8(__a, __b) __arm_vqshlq_r_u8(__a, __b)
>  #define vminavq_s8(__a, __b) __arm_vminavq_s8(__a, __b)
>  #define vminaq_s8(__a, __b) __arm_vminaq_s8(__a, __b)
>  #define vmaxavq_s8(__a, __b) __arm_vmaxavq_s8(__a, __b)
>  #define vmaxaq_s8(__a, __b) __arm_vmaxaq_s8(__a, __b)
>  #define vbrsrq_n_u8(__a, __b) __arm_vbrsrq_n_u8(__a, __b)
> -#define vshlq_n_u8(__a,  __imm) __arm_vshlq_n_u8(__a,  __imm)
>  #define vrshrq_n_u8(__a,  __imm) __arm_vrshrq_n_u8(__a,  __imm)
> -#define vqshlq_n_u8(__a,  __imm) __arm_vqshlq_n_u8(__a,  __imm)
>  #define vcmpneq_n_s8(__a, __b) __arm_vcmpneq_n_s8(__a, __b)
>  #define vcmpltq_s8(__a, __b) __arm_vcmpltq_s8(__a, __b)
>  #define vcmpltq_n_s8(__a, __b) __arm_vcmpltq_n_s8(__a, __b)
> @@ -673,9 +648,6 @@
>  #define vcmpeqq_n_s8(__a, __b) __arm_vcmpeqq_n_s8(__a, __b)
>  #define vqshluq_n_s8(__a,  __imm) __arm_vqshluq_n_s8(__a,  __imm)
>  #define vaddvq_p_s8(__a, __p) __arm_vaddvq_p_s8(__a, __p)
> -#define vshlq_r_s8(__a, __b) __arm_vshlq_r_s8(__a, __b)
> -#define vqshlq_s8(__a, __b) __arm_vqshlq_s8(__a, __b)
> -#define vqshlq_r_s8(__a, __b) __arm_vqshlq_r_s8(__a, __b)
>  #define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b)
>  #define vmulltq_int_s8(__a, __b) __arm_vmulltq_int_s8(__a, __b)
>  #define vmullbq_int_s8(__a, __b) __arm_vmullbq_int_s8(__a, __b)
> @@ -694,9 +666,7 @@
>  #define vbrsrq_n_s8(__a, __b) __arm_vbrsrq_n_s8(__a, __b)
>  #define vbicq_s8(__a, __b) __arm_vbicq_s8(__a, __b)
>  #define vaddvaq_s8(__a, __b) __arm_vaddvaq_s8(__a, __b)
> -#define vshlq_n_s8(__a,  __imm) __arm_vshlq_n_s8(__a,  __imm)
>  #define vrshrq_n_s8(__a,  __imm) __arm_vrshrq_n_s8(__a,  __imm)
> -#define vqshlq_n_s8(__a,  __imm) __arm_vqshlq_n_s8(__a,  __imm)
>  #define vornq_u16(__a, __b) __arm_vornq_u16(__a, __b)
>  #define vmulltq_int_u16(__a, __b) __arm_vmulltq_int_u16(__a, __b)
>  #define vmullbq_int_u16(__a, __b) __arm_vmullbq_int_u16(__a, __b)
> @@ -717,17 +687,12 @@
>  #define vbicq_u16(__a, __b) __arm_vbicq_u16(__a, __b)
>  #define vaddvq_p_u16(__a, __p) __arm_vaddvq_p_u16(__a, __p)
>  #define vaddvaq_u16(__a, __b) __arm_vaddvaq_u16(__a, __b)
> -#define vshlq_r_u16(__a, __b) __arm_vshlq_r_u16(__a, __b)
> -#define vqshlq_u16(__a, __b) __arm_vqshlq_u16(__a, __b)
> -#define vqshlq_r_u16(__a, __b) __arm_vqshlq_r_u16(__a, __b)
>  #define vminavq_s16(__a, __b) __arm_vminavq_s16(__a, __b)
>  #define vminaq_s16(__a, __b) __arm_vminaq_s16(__a, __b)
>  #define vmaxavq_s16(__a, __b) __arm_vmaxavq_s16(__a, __b)
>  #define vmaxaq_s16(__a, __b) __arm_vmaxaq_s16(__a, __b)
>  #define vbrsrq_n_u16(__a, __b) __arm_vbrsrq_n_u16(__a, __b)
> -#define vshlq_n_u16(__a,  __imm) __arm_vshlq_n_u16(__a,  __imm)
>  #define vrshrq_n_u16(__a,  __imm) __arm_vrshrq_n_u16(__a,  __imm)
> -#define vqshlq_n_u16(__a,  __imm) __arm_vqshlq_n_u16(__a,  __imm)
>  #define vcmpneq_n_s16(__a, __b) __arm_vcmpneq_n_s16(__a, __b)
>  #define vcmpltq_s16(__a, __b) __arm_vcmpltq_s16(__a, __b)
>  #define vcmpltq_n_s16(__a, __b) __arm_vcmpltq_n_s16(__a, __b)
> @@ -741,9 +706,6 @@
>  #define vcmpeqq_n_s16(__a, __b) __arm_vcmpeqq_n_s16(__a, __b)
>  #define vqshluq_n_s16(__a,  __imm) __arm_vqshluq_n_s16(__a,  __imm)
>  #define vaddvq_p_s16(__a, __p) __arm_vaddvq_p_s16(__a, __p)
> -#define vshlq_r_s16(__a, __b) __arm_vshlq_r_s16(__a, __b)
> -#define vqshlq_s16(__a, __b) __arm_vqshlq_s16(__a, __b)
> -#define vqshlq_r_s16(__a, __b) __arm_vqshlq_r_s16(__a, __b)
>  #define vornq_s16(__a, __b) __arm_vornq_s16(__a, __b)
>  #define vmulltq_int_s16(__a, __b) __arm_vmulltq_int_s16(__a, __b)
>  #define vmullbq_int_s16(__a, __b) __arm_vmullbq_int_s16(__a, __b)
> @@ -762,9 +724,7 @@
>  #define vbrsrq_n_s16(__a, __b) __arm_vbrsrq_n_s16(__a, __b)
>  #define vbicq_s16(__a, __b) __arm_vbicq_s16(__a, __b)
>  #define vaddvaq_s16(__a, __b) __arm_vaddvaq_s16(__a, __b)
> -#define vshlq_n_s16(__a,  __imm) __arm_vshlq_n_s16(__a,  __imm)
>  #define vrshrq_n_s16(__a,  __imm) __arm_vrshrq_n_s16(__a,  __imm)
> -#define vqshlq_n_s16(__a,  __imm) __arm_vqshlq_n_s16(__a,  __imm)
>  #define vornq_u32(__a, __b) __arm_vornq_u32(__a, __b)
>  #define vmulltq_int_u32(__a, __b) __arm_vmulltq_int_u32(__a, __b)
>  #define vmullbq_int_u32(__a, __b) __arm_vmullbq_int_u32(__a, __b)
> @@ -785,17 +745,12 @@
>  #define vbicq_u32(__a, __b) __arm_vbicq_u32(__a, __b)
>  #define vaddvq_p_u32(__a, __p) __arm_vaddvq_p_u32(__a, __p)
>  #define vaddvaq_u32(__a, __b) __arm_vaddvaq_u32(__a, __b)
> -#define vshlq_r_u32(__a, __b) __arm_vshlq_r_u32(__a, __b)
> -#define vqshlq_u32(__a, __b) __arm_vqshlq_u32(__a, __b)
> -#define vqshlq_r_u32(__a, __b) __arm_vqshlq_r_u32(__a, __b)
>  #define vminavq_s32(__a, __b) __arm_vminavq_s32(__a, __b)
>  #define vminaq_s32(__a, __b) __arm_vminaq_s32(__a, __b)
>  #define vmaxavq_s32(__a, __b) __arm_vmaxavq_s32(__a, __b)
>  #define vmaxaq_s32(__a, __b) __arm_vmaxaq_s32(__a, __b)
>  #define vbrsrq_n_u32(__a, __b) __arm_vbrsrq_n_u32(__a, __b)
> -#define vshlq_n_u32(__a,  __imm) __arm_vshlq_n_u32(__a,  __imm)
>  #define vrshrq_n_u32(__a,  __imm) __arm_vrshrq_n_u32(__a,  __imm)
> -#define vqshlq_n_u32(__a,  __imm) __arm_vqshlq_n_u32(__a,  __imm)
>  #define vcmpneq_n_s32(__a, __b) __arm_vcmpneq_n_s32(__a, __b)
>  #define vcmpltq_s32(__a, __b) __arm_vcmpltq_s32(__a, __b)
>  #define vcmpltq_n_s32(__a, __b) __arm_vcmpltq_n_s32(__a, __b)
> @@ -809,9 +764,6 @@
>  #define vcmpeqq_n_s32(__a, __b) __arm_vcmpeqq_n_s32(__a, __b)
>  #define vqshluq_n_s32(__a,  __imm) __arm_vqshluq_n_s32(__a,  __imm)
>  #define vaddvq_p_s32(__a, __p) __arm_vaddvq_p_s32(__a, __p)
> -#define vshlq_r_s32(__a, __b) __arm_vshlq_r_s32(__a, __b)
> -#define vqshlq_s32(__a, __b) __arm_vqshlq_s32(__a, __b)
> -#define vqshlq_r_s32(__a, __b) __arm_vqshlq_r_s32(__a, __b)
>  #define vornq_s32(__a, __b) __arm_vornq_s32(__a, __b)
>  #define vmulltq_int_s32(__a, __b) __arm_vmulltq_int_s32(__a, __b)
>  #define vmullbq_int_s32(__a, __b) __arm_vmullbq_int_s32(__a, __b)
> @@ -830,9 +782,7 @@
>  #define vbrsrq_n_s32(__a, __b) __arm_vbrsrq_n_s32(__a, __b)
>  #define vbicq_s32(__a, __b) __arm_vbicq_s32(__a, __b)
>  #define vaddvaq_s32(__a, __b) __arm_vaddvaq_s32(__a, __b)
> -#define vshlq_n_s32(__a,  __imm) __arm_vshlq_n_s32(__a,  __imm)
>  #define vrshrq_n_s32(__a,  __imm) __arm_vrshrq_n_s32(__a,  __imm)
> -#define vqshlq_n_s32(__a,  __imm) __arm_vqshlq_n_s32(__a,  __imm)
>  #define vqmovntq_u16(__a, __b) __arm_vqmovntq_u16(__a, __b)
>  #define vqmovnbq_u16(__a, __b) __arm_vqmovnbq_u16(__a, __b)
>  #define vmulltq_poly_p8(__a, __b) __arm_vmulltq_poly_p8(__a, __b)
> @@ -1013,8 +963,6 @@
>  #define vaddvaq_p_u8(__a, __b, __p) __arm_vaddvaq_p_u8(__a, __b, __p)
>  #define vsriq_n_u8(__a, __b,  __imm) __arm_vsriq_n_u8(__a, __b,  __imm)
>  #define vsliq_n_u8(__a, __b,  __imm) __arm_vsliq_n_u8(__a, __b,  __imm)
> -#define vshlq_m_r_u8(__a, __b, __p) __arm_vshlq_m_r_u8(__a, __b, __p)
> -#define vqshlq_m_r_u8(__a, __b, __p) __arm_vqshlq_m_r_u8(__a, __b, __p)
>  #define vminavq_p_s8(__a, __b, __p) __arm_vminavq_p_s8(__a, __b, __p)
>  #define vminaq_m_s8(__a, __b, __p) __arm_vminaq_m_s8(__a, __b, __p)
>  #define vmaxavq_p_s8(__a, __b, __p) __arm_vmaxavq_p_s8(__a, __b, __p)
> @@ -1031,9 +979,7 @@
>  #define vcmpgeq_m_n_s8(__a, __b, __p) __arm_vcmpgeq_m_n_s8(__a, __b, __p)
>  #define vcmpeqq_m_s8(__a, __b, __p) __arm_vcmpeqq_m_s8(__a, __b, __p)
>  #define vcmpeqq_m_n_s8(__a, __b, __p) __arm_vcmpeqq_m_n_s8(__a, __b, __p)
> -#define vshlq_m_r_s8(__a, __b, __p) __arm_vshlq_m_r_s8(__a, __b, __p)
>  #define vrev64q_m_s8(__inactive, __a, __p) __arm_vrev64q_m_s8(__inactive, __a, __p)
> -#define vqshlq_m_r_s8(__a, __b, __p) __arm_vqshlq_m_r_s8(__a, __b, __p)
>  #define vqnegq_m_s8(__inactive, __a, __p) __arm_vqnegq_m_s8(__inactive, __a, __p)
>  #define vqabsq_m_s8(__inactive, __a, __p) __arm_vqabsq_m_s8(__inactive, __a, __p)
>  #define vnegq_m_s8(__inactive, __a, __p) __arm_vnegq_m_s8(__inactive, __a, __p)
> @@ -1092,8 +1038,6 @@
>  #define vaddvaq_p_u16(__a, __b, __p) __arm_vaddvaq_p_u16(__a, __b, __p)
>  #define vsriq_n_u16(__a, __b,  __imm) __arm_vsriq_n_u16(__a, __b,  __imm)
>  #define vsliq_n_u16(__a, __b,  __imm) __arm_vsliq_n_u16(__a, __b,  __imm)
> -#define vshlq_m_r_u16(__a, __b, __p) __arm_vshlq_m_r_u16(__a, __b, __p)
> -#define vqshlq_m_r_u16(__a, __b, __p) __arm_vqshlq_m_r_u16(__a, __b, __p)
>  #define vminavq_p_s16(__a, __b, __p) __arm_vminavq_p_s16(__a, __b, __p)
>  #define vminaq_m_s16(__a, __b, __p) __arm_vminaq_m_s16(__a, __b, __p)
>  #define vmaxavq_p_s16(__a, __b, __p) __arm_vmaxavq_p_s16(__a, __b, __p)
> @@ -1110,9 +1054,7 @@
>  #define vcmpgeq_m_n_s16(__a, __b, __p) __arm_vcmpgeq_m_n_s16(__a, __b, __p)
>  #define vcmpeqq_m_s16(__a, __b, __p) __arm_vcmpeqq_m_s16(__a, __b, __p)
>  #define vcmpeqq_m_n_s16(__a, __b, __p) __arm_vcmpeqq_m_n_s16(__a, __b, __p)
> -#define vshlq_m_r_s16(__a, __b, __p) __arm_vshlq_m_r_s16(__a, __b, __p)
>  #define vrev64q_m_s16(__inactive, __a, __p) __arm_vrev64q_m_s16(__inactive, __a, __p)
> -#define vqshlq_m_r_s16(__a, __b, __p) __arm_vqshlq_m_r_s16(__a, __b, __p)
>  #define vqnegq_m_s16(__inactive, __a, __p) __arm_vqnegq_m_s16(__inactive, __a, __p)
>  #define vqabsq_m_s16(__inactive, __a, __p) __arm_vqabsq_m_s16(__inactive, __a, __p)
>  #define vnegq_m_s16(__inactive, __a, __p) __arm_vnegq_m_s16(__inactive, __a, __p)
> @@ -1171,8 +1113,6 @@
>  #define vaddvaq_p_u32(__a, __b, __p) __arm_vaddvaq_p_u32(__a, __b, __p)
>  #define vsriq_n_u32(__a, __b,  __imm) __arm_vsriq_n_u32(__a, __b,  __imm)
>  #define vsliq_n_u32(__a, __b,  __imm) __arm_vsliq_n_u32(__a, __b,  __imm)
> -#define vshlq_m_r_u32(__a, __b, __p) __arm_vshlq_m_r_u32(__a, __b, __p)
> -#define vqshlq_m_r_u32(__a, __b, __p) __arm_vqshlq_m_r_u32(__a, __b, __p)
>  #define vminavq_p_s32(__a, __b, __p) __arm_vminavq_p_s32(__a, __b, __p)
>  #define vminaq_m_s32(__a, __b, __p) __arm_vminaq_m_s32(__a, __b, __p)
>  #define vmaxavq_p_s32(__a, __b, __p) __arm_vmaxavq_p_s32(__a, __b, __p)
> @@ -1189,9 +1129,7 @@
>  #define vcmpgeq_m_n_s32(__a, __b, __p) __arm_vcmpgeq_m_n_s32(__a, __b, __p)
>  #define vcmpeqq_m_s32(__a, __b, __p) __arm_vcmpeqq_m_s32(__a, __b, __p)
>  #define vcmpeqq_m_n_s32(__a, __b, __p) __arm_vcmpeqq_m_n_s32(__a, __b, __p)
> -#define vshlq_m_r_s32(__a, __b, __p) __arm_vshlq_m_r_s32(__a, __b, __p)
>  #define vrev64q_m_s32(__inactive, __a, __p) __arm_vrev64q_m_s32(__inactive, __a, __p)
> -#define vqshlq_m_r_s32(__a, __b, __p) __arm_vqshlq_m_r_s32(__a, __b, __p)
>  #define vqnegq_m_s32(__inactive, __a, __p) __arm_vqnegq_m_s32(__inactive, __a, __p)
>  #define vqabsq_m_s32(__inactive, __a, __p) __arm_vqabsq_m_s32(__inactive, __a, __p)
>  #define vnegq_m_s32(__inactive, __a, __p) __arm_vnegq_m_s32(__inactive, __a, __p)
> @@ -1429,26 +1367,20 @@
>  #define vqshluq_m_n_s8(__inactive, __a,  __imm, __p) __arm_vqshluq_m_n_s8(__inactive, __a,  __imm, __p)
>  #define vabavq_p_s8(__a, __b, __c, __p) __arm_vabavq_p_s8(__a, __b, __c, __p)
>  #define vsriq_m_n_u8(__a, __b,  __imm, __p) __arm_vsriq_m_n_u8(__a, __b,  __imm, __p)
> -#define vshlq_m_u8(__inactive, __a, __b, __p) __arm_vshlq_m_u8(__inactive, __a, __b, __p)
>  #define vabavq_p_u8(__a, __b, __c, __p) __arm_vabavq_p_u8(__a, __b, __c, __p)
> -#define vshlq_m_s8(__inactive, __a, __b, __p) __arm_vshlq_m_s8(__inactive, __a, __b, __p)
>  #define vcvtq_m_n_f16_s16(__inactive, __a,  __imm6, __p) __arm_vcvtq_m_n_f16_s16(__inactive, __a,  __imm6, __p)
>  #define vsriq_m_n_s16(__a, __b,  __imm, __p) __arm_vsriq_m_n_s16(__a, __b,  __imm, __p)
>  #define vcvtq_m_n_f32_u32(__inactive, __a,  __imm6, __p) __arm_vcvtq_m_n_f32_u32(__inactive, __a,  __imm6, __p)
>  #define vqshluq_m_n_s16(__inactive, __a,  __imm, __p) __arm_vqshluq_m_n_s16(__inactive, __a,  __imm, __p)
>  #define vabavq_p_s16(__a, __b, __c, __p) __arm_vabavq_p_s16(__a, __b, __c, __p)
>  #define vsriq_m_n_u16(__a, __b,  __imm, __p) __arm_vsriq_m_n_u16(__a, __b,  __imm, __p)
> -#define vshlq_m_u16(__inactive, __a, __b, __p) __arm_vshlq_m_u16(__inactive, __a, __b, __p)
>  #define vabavq_p_u16(__a, __b, __c, __p) __arm_vabavq_p_u16(__a, __b, __c, __p)
> -#define vshlq_m_s16(__inactive, __a, __b, __p) __arm_vshlq_m_s16(__inactive, __a, __b, __p)
>  #define vcvtq_m_n_f32_s32(__inactive, __a,  __imm6, __p) __arm_vcvtq_m_n_f32_s32(__inactive, __a,  __imm6, __p)
>  #define vsriq_m_n_s32(__a, __b,  __imm, __p) __arm_vsriq_m_n_s32(__a, __b,  __imm, __p)
>  #define vqshluq_m_n_s32(__inactive, __a,  __imm, __p) __arm_vqshluq_m_n_s32(__inactive, __a,  __imm, __p)
>  #define vabavq_p_s32(__a, __b, __c, __p) __arm_vabavq_p_s32(__a, __b, __c, __p)
>  #define vsriq_m_n_u32(__a, __b,  __imm, __p) __arm_vsriq_m_n_u32(__a, __b,  __imm, __p)
> -#define vshlq_m_u32(__inactive, __a, __b, __p) __arm_vshlq_m_u32(__inactive, __a, __b, __p)
>  #define vabavq_p_u32(__a, __b, __c, __p) __arm_vabavq_p_u32(__a, __b, __c, __p)
> -#define vshlq_m_s32(__inactive, __a, __b, __p) __arm_vshlq_m_s32(__inactive, __a, __b, __p)
>  #define vbicq_m_s8(__inactive, __a, __b, __p) __arm_vbicq_m_s8(__inactive, __a, __b, __p)
>  #define vbicq_m_s32(__inactive, __a, __b, __p) __arm_vbicq_m_s32(__inactive, __a, __b, __p)
>  #define vbicq_m_s16(__inactive, __a, __b, __p) __arm_vbicq_m_s16(__inactive, __a, __b, __p)
> @@ -1572,30 +1504,12 @@
>  #define vqrdmlsdhxq_m_s8(__inactive, __a, __b, __p) __arm_vqrdmlsdhxq_m_s8(__inactive, __a, __b, __p)
>  #define vqrdmlsdhxq_m_s32(__inactive, __a, __b, __p) __arm_vqrdmlsdhxq_m_s32(__inactive, __a, __b, __p)
>  #define vqrdmlsdhxq_m_s16(__inactive, __a, __b, __p) __arm_vqrdmlsdhxq_m_s16(__inactive, __a, __b, __p)
> -#define vqshlq_m_n_s8(__inactive, __a,  __imm, __p) __arm_vqshlq_m_n_s8(__inactive, __a,  __imm, __p)
> -#define vqshlq_m_n_s32(__inactive, __a,  __imm, __p) __arm_vqshlq_m_n_s32(__inactive, __a,  __imm, __p)
> -#define vqshlq_m_n_s16(__inactive, __a,  __imm, __p) __arm_vqshlq_m_n_s16(__inactive, __a,  __imm, __p)
> -#define vqshlq_m_n_u8(__inactive, __a,  __imm, __p) __arm_vqshlq_m_n_u8(__inactive, __a,  __imm, __p)
> -#define vqshlq_m_n_u32(__inactive, __a,  __imm, __p) __arm_vqshlq_m_n_u32(__inactive, __a,  __imm, __p)
> -#define vqshlq_m_n_u16(__inactive, __a,  __imm, __p) __arm_vqshlq_m_n_u16(__inactive, __a,  __imm, __p)
> -#define vqshlq_m_s8(__inactive, __a, __b, __p) __arm_vqshlq_m_s8(__inactive, __a, __b, __p)
> -#define vqshlq_m_s32(__inactive, __a, __b, __p) __arm_vqshlq_m_s32(__inactive, __a, __b, __p)
> -#define vqshlq_m_s16(__inactive, __a, __b, __p) __arm_vqshlq_m_s16(__inactive, __a, __b, __p)
> -#define vqshlq_m_u8(__inactive, __a, __b, __p) __arm_vqshlq_m_u8(__inactive, __a, __b, __p)
> -#define vqshlq_m_u32(__inactive, __a, __b, __p) __arm_vqshlq_m_u32(__inactive, __a, __b, __p)
> -#define vqshlq_m_u16(__inactive, __a, __b, __p) __arm_vqshlq_m_u16(__inactive, __a, __b, __p)
>  #define vrshrq_m_n_s8(__inactive, __a,  __imm, __p) __arm_vrshrq_m_n_s8(__inactive, __a,  __imm, __p)
>  #define vrshrq_m_n_s32(__inactive, __a,  __imm, __p) __arm_vrshrq_m_n_s32(__inactive, __a,  __imm, __p)
>  #define vrshrq_m_n_s16(__inactive, __a,  __imm, __p) __arm_vrshrq_m_n_s16(__inactive, __a,  __imm, __p)
>  #define vrshrq_m_n_u8(__inactive, __a,  __imm, __p) __arm_vrshrq_m_n_u8(__inactive, __a,  __imm, __p)
>  #define vrshrq_m_n_u32(__inactive, __a,  __imm, __p) __arm_vrshrq_m_n_u32(__inactive, __a,  __imm, __p)
>  #define vrshrq_m_n_u16(__inactive, __a,  __imm, __p) __arm_vrshrq_m_n_u16(__inactive, __a,  __imm, __p)
> -#define vshlq_m_n_s8(__inactive, __a,  __imm, __p) __arm_vshlq_m_n_s8(__inactive, __a,  __imm, __p)
> -#define vshlq_m_n_s32(__inactive, __a,  __imm, __p) __arm_vshlq_m_n_s32(__inactive, __a,  __imm, __p)
> -#define vshlq_m_n_s16(__inactive, __a,  __imm, __p) __arm_vshlq_m_n_s16(__inactive, __a,  __imm, __p)
> -#define vshlq_m_n_u8(__inactive, __a,  __imm, __p) __arm_vshlq_m_n_u8(__inactive, __a,  __imm, __p)
> -#define vshlq_m_n_u32(__inactive, __a,  __imm, __p) __arm_vshlq_m_n_u32(__inactive, __a,  __imm, __p)
> -#define vshlq_m_n_u16(__inactive, __a,  __imm, __p) __arm_vshlq_m_n_u16(__inactive, __a,  __imm, __p)
>  #define vshrq_m_n_s8(__inactive, __a,  __imm, __p) __arm_vshrq_m_n_s8(__inactive, __a,  __imm, __p)
>  #define vshrq_m_n_s32(__inactive, __a,  __imm, __p) __arm_vshrq_m_n_s32(__inactive, __a,  __imm, __p)
>  #define vshrq_m_n_s16(__inactive, __a,  __imm, __p) __arm_vshrq_m_n_s16(__inactive, __a,  __imm, __p)
> @@ -2146,18 +2060,6 @@
>  #define vshlltq_x_n_s16(__a,  __imm, __p) __arm_vshlltq_x_n_s16(__a,
> __imm, __p)
>  #define vshlltq_x_n_u8(__a,  __imm, __p) __arm_vshlltq_x_n_u8(__a,
> __imm, __p)
>  #define vshlltq_x_n_u16(__a,  __imm, __p) __arm_vshlltq_x_n_u16(__a,
> __imm, __p)
> -#define vshlq_x_s8(__a, __b, __p) __arm_vshlq_x_s8(__a, __b, __p)
> -#define vshlq_x_s16(__a, __b, __p) __arm_vshlq_x_s16(__a, __b, __p)
> -#define vshlq_x_s32(__a, __b, __p) __arm_vshlq_x_s32(__a, __b, __p)
> -#define vshlq_x_u8(__a, __b, __p) __arm_vshlq_x_u8(__a, __b, __p)
> -#define vshlq_x_u16(__a, __b, __p) __arm_vshlq_x_u16(__a, __b, __p)
> -#define vshlq_x_u32(__a, __b, __p) __arm_vshlq_x_u32(__a, __b, __p)
> -#define vshlq_x_n_s8(__a,  __imm, __p) __arm_vshlq_x_n_s8(__a,  __imm,
> __p)
> -#define vshlq_x_n_s16(__a,  __imm, __p) __arm_vshlq_x_n_s16(__a,
> __imm, __p)
> -#define vshlq_x_n_s32(__a,  __imm, __p) __arm_vshlq_x_n_s32(__a,
> __imm, __p)
> -#define vshlq_x_n_u8(__a,  __imm, __p) __arm_vshlq_x_n_u8(__a,  __imm,
> __p)
> -#define vshlq_x_n_u16(__a,  __imm, __p) __arm_vshlq_x_n_u16(__a,
> __imm, __p)
> -#define vshlq_x_n_u32(__a,  __imm, __p) __arm_vshlq_x_n_u32(__a,
> __imm, __p)
>  #define vrshrq_x_n_s8(__a,  __imm, __p) __arm_vrshrq_x_n_s8(__a,
> __imm, __p)
>  #define vrshrq_x_n_s16(__a,  __imm, __p) __arm_vrshrq_x_n_s16(__a,
> __imm, __p)
>  #define vrshrq_x_n_s32(__a,  __imm, __p) __arm_vrshrq_x_n_s32(__a,
> __imm, __p)
> @@ -3000,48 +2902,6 @@ __arm_vcmpneq_u32 (uint32x4_t __a, uint32x4_t
> __b)
>    return __builtin_mve_vcmpneq_v4si ((int32x4_t)__a, (int32x4_t)__b);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_s8 (int8x16_t __a, int8x16_t __b)
> -{
> -  return __builtin_mve_vshlq_sv16qi (__a, __b);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_s16 (int16x8_t __a, int16x8_t __b)
> -{
> -  return __builtin_mve_vshlq_sv8hi (__a, __b);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_s32 (int32x4_t __a, int32x4_t __b)
> -{
> -  return __builtin_mve_vshlq_sv4si (__a, __b);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_u8 (uint8x16_t __a, int8x16_t __b)
> -{
> -  return __builtin_mve_vshlq_uv16qi (__a, __b);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_u16 (uint16x8_t __a, int16x8_t __b)
> -{
> -  return __builtin_mve_vshlq_uv8hi (__a, __b);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_u32 (uint32x4_t __a, int32x4_t __b)
> -{
> -  return __builtin_mve_vshlq_uv4si (__a, __b);
> -}
> -
>  __extension__ extern __inline uint8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq_u8 (uint8x16_t __a, uint8x16_t __b)
> @@ -3184,27 +3044,6 @@ __arm_vaddvaq_u8 (uint32_t __a, uint8x16_t
> __b)
>    return __builtin_mve_vaddvaq_uv16qi (__a, __b);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r_u8 (uint8x16_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vshlq_r_uv16qi (__a, __b);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_u8 (uint8x16_t __a, int8x16_t __b)
> -{
> -  return __builtin_mve_vqshlq_uv16qi (__a, __b);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r_u8 (uint8x16_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vqshlq_r_uv16qi (__a, __b);
> -}
> -
>  __extension__ extern __inline uint8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_s8 (uint8_t __a, int8x16_t __b)
> @@ -3240,13 +3079,6 @@ __arm_vbrsrq_n_u8 (uint8x16_t __a, int32_t __b)
>    return __builtin_mve_vbrsrq_n_uv16qi (__a, __b);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n_u8 (uint8x16_t __a, const int __imm)
> -{
> -  return __builtin_mve_vshlq_n_uv16qi (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_n_u8 (uint8x16_t __a, const int __imm)
> @@ -3254,13 +3086,6 @@ __arm_vrshrq_n_u8 (uint8x16_t __a, const int
> __imm)
>    return __builtin_mve_vrshrq_n_uv16qi (__a, __imm);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n_u8 (uint8x16_t __a, const int __imm)
> -{
> -  return __builtin_mve_vqshlq_n_uv16qi (__a, __imm);
> -}
> -
>  __extension__ extern __inline mve_pred16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vcmpneq_n_s8 (int8x16_t __a, int8_t __b)
> @@ -3352,27 +3177,6 @@ __arm_vaddvq_p_s8 (int8x16_t __a,
> mve_pred16_t __p)
>    return __builtin_mve_vaddvq_p_sv16qi (__a, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r_s8 (int8x16_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vshlq_r_sv16qi (__a, __b);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_s8 (int8x16_t __a, int8x16_t __b)
> -{
> -  return __builtin_mve_vqshlq_sv16qi (__a, __b);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r_s8 (int8x16_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vqshlq_r_sv16qi (__a, __b);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq_s8 (int8x16_t __a, int8x16_t __b)
> @@ -3499,13 +3303,6 @@ __arm_vaddvaq_s8 (int32_t __a, int8x16_t __b)
>    return __builtin_mve_vaddvaq_sv16qi (__a, __b);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n_s8 (int8x16_t __a, const int __imm)
> -{
> -  return __builtin_mve_vshlq_n_sv16qi (__a, __imm);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_n_s8 (int8x16_t __a, const int __imm)
> @@ -3513,13 +3310,6 @@ __arm_vrshrq_n_s8 (int8x16_t __a, const int
> __imm)
>    return __builtin_mve_vrshrq_n_sv16qi (__a, __imm);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n_s8 (int8x16_t __a, const int __imm)
> -{
> -  return __builtin_mve_vqshlq_n_sv16qi (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq_u16 (uint16x8_t __a, uint16x8_t __b)
> @@ -3662,27 +3452,6 @@ __arm_vaddvaq_u16 (uint32_t __a, uint16x8_t
> __b)
>    return __builtin_mve_vaddvaq_uv8hi (__a, __b);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r_u16 (uint16x8_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vshlq_r_uv8hi (__a, __b);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_u16 (uint16x8_t __a, int16x8_t __b)
> -{
> -  return __builtin_mve_vqshlq_uv8hi (__a, __b);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r_u16 (uint16x8_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vqshlq_r_uv8hi (__a, __b);
> -}
> -
>  __extension__ extern __inline uint16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_s16 (uint16_t __a, int16x8_t __b)
> @@ -3718,13 +3487,6 @@ __arm_vbrsrq_n_u16 (uint16x8_t __a, int32_t
> __b)
>    return __builtin_mve_vbrsrq_n_uv8hi (__a, __b);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n_u16 (uint16x8_t __a, const int __imm)
> -{
> -  return __builtin_mve_vshlq_n_uv8hi (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_n_u16 (uint16x8_t __a, const int __imm)
> @@ -3732,13 +3494,6 @@ __arm_vrshrq_n_u16 (uint16x8_t __a, const int
> __imm)
>    return __builtin_mve_vrshrq_n_uv8hi (__a, __imm);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n_u16 (uint16x8_t __a, const int __imm)
> -{
> -  return __builtin_mve_vqshlq_n_uv8hi (__a, __imm);
> -}
> -
>  __extension__ extern __inline mve_pred16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vcmpneq_n_s16 (int16x8_t __a, int16_t __b)
> @@ -3830,27 +3585,6 @@ __arm_vaddvq_p_s16 (int16x8_t __a,
> mve_pred16_t __p)
>    return __builtin_mve_vaddvq_p_sv8hi (__a, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r_s16 (int16x8_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vshlq_r_sv8hi (__a, __b);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_s16 (int16x8_t __a, int16x8_t __b)
> -{
> -  return __builtin_mve_vqshlq_sv8hi (__a, __b);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r_s16 (int16x8_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vqshlq_r_sv8hi (__a, __b);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq_s16 (int16x8_t __a, int16x8_t __b)
> @@ -3977,13 +3711,6 @@ __arm_vaddvaq_s16 (int32_t __a, int16x8_t __b)
>    return __builtin_mve_vaddvaq_sv8hi (__a, __b);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n_s16 (int16x8_t __a, const int __imm)
> -{
> -  return __builtin_mve_vshlq_n_sv8hi (__a, __imm);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_n_s16 (int16x8_t __a, const int __imm)
> @@ -3991,13 +3718,6 @@ __arm_vrshrq_n_s16 (int16x8_t __a, const int
> __imm)
>    return __builtin_mve_vrshrq_n_sv8hi (__a, __imm);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n_s16 (int16x8_t __a, const int __imm)
> -{
> -  return __builtin_mve_vqshlq_n_sv8hi (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq_u32 (uint32x4_t __a, uint32x4_t __b)
> @@ -4140,27 +3860,6 @@ __arm_vaddvaq_u32 (uint32_t __a, uint32x4_t
> __b)
>    return __builtin_mve_vaddvaq_uv4si (__a, __b);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r_u32 (uint32x4_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vshlq_r_uv4si (__a, __b);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_u32 (uint32x4_t __a, int32x4_t __b)
> -{
> -  return __builtin_mve_vqshlq_uv4si (__a, __b);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r_u32 (uint32x4_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vqshlq_r_uv4si (__a, __b);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_s32 (uint32_t __a, int32x4_t __b)
> @@ -4196,13 +3895,6 @@ __arm_vbrsrq_n_u32 (uint32x4_t __a, int32_t
> __b)
>    return __builtin_mve_vbrsrq_n_uv4si (__a, __b);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n_u32 (uint32x4_t __a, const int __imm)
> -{
> -  return __builtin_mve_vshlq_n_uv4si (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_n_u32 (uint32x4_t __a, const int __imm)
> @@ -4210,13 +3902,6 @@ __arm_vrshrq_n_u32 (uint32x4_t __a, const int
> __imm)
>    return __builtin_mve_vrshrq_n_uv4si (__a, __imm);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n_u32 (uint32x4_t __a, const int __imm)
> -{
> -  return __builtin_mve_vqshlq_n_uv4si (__a, __imm);
> -}
> -
>  __extension__ extern __inline mve_pred16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vcmpneq_n_s32 (int32x4_t __a, int32_t __b)
> @@ -4308,27 +3993,6 @@ __arm_vaddvq_p_s32 (int32x4_t __a,
> mve_pred16_t __p)
>    return __builtin_mve_vaddvq_p_sv4si (__a, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r_s32 (int32x4_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vshlq_r_sv4si (__a, __b);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_s32 (int32x4_t __a, int32x4_t __b)
> -{
> -  return __builtin_mve_vqshlq_sv4si (__a, __b);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r_s32 (int32x4_t __a, int32_t __b)
> -{
> -  return __builtin_mve_vqshlq_r_sv4si (__a, __b);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq_s32 (int32x4_t __a, int32x4_t __b)
> @@ -4455,13 +4119,6 @@ __arm_vaddvaq_s32 (int32_t __a, int32x4_t __b)
>    return __builtin_mve_vaddvaq_sv4si (__a, __b);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n_s32 (int32x4_t __a, const int __imm)
> -{
> -  return __builtin_mve_vshlq_n_sv4si (__a, __imm);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_n_s32 (int32x4_t __a, const int __imm)
> @@ -4469,13 +4126,6 @@ __arm_vrshrq_n_s32 (int32x4_t __a, const int
> __imm)
>    return __builtin_mve_vrshrq_n_sv4si (__a, __imm);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n_s32 (int32x4_t __a, const int __imm)
> -{
> -  return __builtin_mve_vqshlq_n_sv4si (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqmovntq_u16 (uint8x16_t __a, uint16x8_t __b)
> @@ -5272,20 +4922,6 @@ __arm_vsliq_n_u8 (uint8x16_t __a, uint8x16_t
> __b, const int __imm)
>    return __builtin_mve_vsliq_n_uv16qi (__a, __b, __imm);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r_u8 (uint8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_r_uv16qi (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r_u8 (uint8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_r_uv16qi (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_p_s8 (uint8_t __a, int8x16_t __b, mve_pred16_t __p)
> @@ -5398,13 +5034,6 @@ __arm_vcmpeqq_m_n_s8 (int8x16_t __a, int8_t
> __b, mve_pred16_t __p)
>    return __builtin_mve_vcmpeqq_m_n_sv16qi (__a, __b, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r_s8 (int8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_r_sv16qi (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrev64q_m_s8 (int8x16_t __inactive, int8x16_t __a, mve_pred16_t
> __p)
> @@ -5412,13 +5041,6 @@ __arm_vrev64q_m_s8 (int8x16_t __inactive,
> int8x16_t __a, mve_pred16_t __p)
>    return __builtin_mve_vrev64q_m_sv16qi (__inactive, __a, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r_s8 (int8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_r_sv16qi (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqnegq_m_s8 (int8x16_t __inactive, int8x16_t __a, mve_pred16_t
> __p)
> @@ -5826,20 +5448,6 @@ __arm_vsliq_n_u16 (uint16x8_t __a, uint16x8_t
> __b, const int __imm)
>    return __builtin_mve_vsliq_n_uv8hi (__a, __b, __imm);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r_u16 (uint16x8_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_r_uv8hi (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r_u16 (uint16x8_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_r_uv8hi (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_p_s16 (uint16_t __a, int16x8_t __b, mve_pred16_t __p)
> @@ -5952,13 +5560,6 @@ __arm_vcmpeqq_m_n_s16 (int16x8_t __a, int16_t
> __b, mve_pred16_t __p)
>    return __builtin_mve_vcmpeqq_m_n_sv8hi (__a, __b, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r_s16 (int16x8_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_r_sv8hi (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrev64q_m_s16 (int16x8_t __inactive, int16x8_t __a, mve_pred16_t
> __p)
> @@ -5966,13 +5567,6 @@ __arm_vrev64q_m_s16 (int16x8_t __inactive,
> int16x8_t __a, mve_pred16_t __p)
>    return __builtin_mve_vrev64q_m_sv8hi (__inactive, __a, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r_s16 (int16x8_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_r_sv8hi (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqnegq_m_s16 (int16x8_t __inactive, int16x8_t __a, mve_pred16_t
> __p)
> @@ -6379,20 +5973,6 @@ __arm_vsliq_n_u32 (uint32x4_t __a, uint32x4_t
> __b, const int __imm)
>    return __builtin_mve_vsliq_n_uv4si (__a, __b, __imm);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r_u32 (uint32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_r_uv4si (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r_u32 (uint32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_r_uv4si (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_p_s32 (uint32_t __a, int32x4_t __b, mve_pred16_t __p)
> @@ -6505,13 +6085,6 @@ __arm_vcmpeqq_m_n_s32 (int32x4_t __a, int32_t
> __b, mve_pred16_t __p)
>    return __builtin_mve_vcmpeqq_m_n_sv4si (__a, __b, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r_s32 (int32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_r_sv4si (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrev64q_m_s32 (int32x4_t __inactive, int32x4_t __a, mve_pred16_t
> __p)
> @@ -6519,13 +6092,6 @@ __arm_vrev64q_m_s32 (int32x4_t __inactive,
> int32x4_t __a, mve_pred16_t __p)
>    return __builtin_mve_vrev64q_m_sv4si (__inactive, __a, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r_s32 (int32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_r_sv4si (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqnegq_m_s32 (int32x4_t __inactive, int32x4_t __a, mve_pred16_t
> __p)
> @@ -7527,13 +7093,6 @@ __arm_vsriq_m_n_u8 (uint8x16_t __a, uint8x16_t
> __b, const int __imm, mve_pred16_
>    return __builtin_mve_vsriq_m_n_uv16qi (__a, __b, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_u8 (uint8x16_t __inactive, uint8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_uv16qi (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vabavq_p_u8 (uint32_t __a, uint8x16_t __b, uint8x16_t __c,
> mve_pred16_t __p)
> @@ -7541,13 +7100,6 @@ __arm_vabavq_p_u8 (uint32_t __a, uint8x16_t
> __b, uint8x16_t __c, mve_pred16_t __
>    return __builtin_mve_vabavq_p_uv16qi (__a, __b, __c, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_sv16qi (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vsriq_m_n_s16 (int16x8_t __a, int16x8_t __b, const int __imm,
> mve_pred16_t __p)
> @@ -7576,13 +7128,6 @@ __arm_vsriq_m_n_u16 (uint16x8_t __a,
> uint16x8_t __b, const int __imm, mve_pred16
>    return __builtin_mve_vsriq_m_n_uv8hi (__a, __b, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, int16x8_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_uv8hi (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vabavq_p_u16 (uint32_t __a, uint16x8_t __b, uint16x8_t __c,
> mve_pred16_t __p)
> @@ -7590,13 +7135,6 @@ __arm_vabavq_p_u16 (uint32_t __a, uint16x8_t
> __b, uint16x8_t __c, mve_pred16_t _
>    return __builtin_mve_vabavq_p_uv8hi (__a, __b, __c, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_sv8hi (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vsriq_m_n_s32 (int32x4_t __a, int32x4_t __b, const int __imm,
> mve_pred16_t __p)
> @@ -7625,13 +7163,6 @@ __arm_vsriq_m_n_u32 (uint32x4_t __a,
> uint32x4_t __b, const int __imm, mve_pred16
>    return __builtin_mve_vsriq_m_n_uv4si (__a, __b, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_uv4si (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vabavq_p_u32 (uint32_t __a, uint32x4_t __b, uint32x4_t __c,
> mve_pred16_t __p)
> @@ -7639,13 +7170,6 @@ __arm_vabavq_p_u32 (uint32_t __a, uint32x4_t
> __b, uint32x4_t __c, mve_pred16_t _
>    return __builtin_mve_vabavq_p_uv4si (__a, __b, __c, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_sv4si (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vbicq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> @@ -8507,90 +8031,6 @@ __arm_vqrdmlsdhxq_m_s16 (int16x8_t __inactive,
> int16x8_t __a, int16x8_t __b, mve
>    return __builtin_mve_vqrdmlsdhxq_m_sv8hi (__inactive, __a, __b, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_n_sv16qi (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, const int
> __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_n_sv4si (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, const int
> __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_n_sv8hi (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, const int
> __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_n_uv16qi (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, const int
> __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_n_uv4si (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, const int
> __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_n_uv8hi (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_sv16qi (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_sv4si (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_sv8hi (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_u8 (uint8x16_t __inactive, uint8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_uv16qi (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_uv4si (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, int16x8_t __b,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vqshlq_m_uv8hi (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, const int __imm,
> mve_pred16_t __p)
> @@ -8633,48 +8073,6 @@ __arm_vrshrq_m_n_u16 (uint16x8_t __inactive,
> uint16x8_t __a, const int __imm, mv
>    return __builtin_mve_vrshrq_m_n_uv8hi (__inactive, __a, __imm, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_n_sv16qi (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_n_sv4si (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_n_sv8hi (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_n_uv16qi (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_n_uv4si (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_n_uv8hi (__inactive, __a, __imm, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vshrq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, const int __imm, mve_pred16_t __p)
> @@ -11981,163 +11379,79 @@ __arm_vrev64q_x_s32 (int32x4_t __a, mve_pred16_t __p)
> 
>  __extension__ extern __inline uint8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vrev64q_x_u8 (uint8x16_t __a, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vrev64q_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vrev64q_x_u16 (uint16x8_t __a, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vrev64q_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vrev64q_x_u32 (uint32x4_t __a, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vrev64q_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshllbq_x_n_s8 (int8x16_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshllbq_m_n_sv16qi (__arm_vuninitializedq_s16 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshllbq_x_n_s16 (int16x8_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshllbq_m_n_sv8hi (__arm_vuninitializedq_s32 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshllbq_x_n_u8 (uint8x16_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshllbq_m_n_uv16qi (__arm_vuninitializedq_u16 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshllbq_x_n_u16 (uint16x8_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshllbq_m_n_uv8hi (__arm_vuninitializedq_u32 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlltq_x_n_s8 (int8x16_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlltq_m_n_sv16qi (__arm_vuninitializedq_s16 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlltq_x_n_s16 (int16x8_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlltq_m_n_sv8hi (__arm_vuninitializedq_s32 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlltq_x_n_u8 (uint8x16_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlltq_m_n_uv16qi (__arm_vuninitializedq_u16 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlltq_x_n_u16 (uint16x8_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlltq_m_n_uv8hi (__arm_vuninitializedq_u32 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
> +__arm_vrev64q_x_u8 (uint8x16_t __a, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
> +  return __builtin_mve_vrev64q_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> +__extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
> +__arm_vrev64q_x_u16 (uint16x8_t __a, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
> +  return __builtin_mve_vrev64q_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> +__extension__ extern __inline uint32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
> +__arm_vrev64q_x_u32 (uint32x4_t __a, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
> +  return __builtin_mve_vrev64q_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> +__extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_u8 (uint8x16_t __a, int8x16_t __b, mve_pred16_t __p)
> +__arm_vshllbq_x_n_s8 (int8x16_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
> +  return __builtin_mve_vshllbq_m_n_sv16qi (__arm_vuninitializedq_s16 (), __a, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> +__extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_u16 (uint16x8_t __a, int16x8_t __b, mve_pred16_t __p)
> +__arm_vshllbq_x_n_s16 (int16x8_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
> +  return __builtin_mve_vshllbq_m_n_sv8hi (__arm_vuninitializedq_s32 (), __a, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> +__extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_u32 (uint32x4_t __a, int32x4_t __b, mve_pred16_t __p)
> +__arm_vshllbq_x_n_u8 (uint8x16_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
> +  return __builtin_mve_vshllbq_m_n_uv16qi (__arm_vuninitializedq_u16 (), __a, __imm, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> +__extension__ extern __inline uint32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n_s8 (int8x16_t __a, const int __imm, mve_pred16_t __p)
> +__arm_vshllbq_x_n_u16 (uint16x8_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_n_sv16qi (__arm_vuninitializedq_s8 (), __a, __imm, __p);
> +  return __builtin_mve_vshllbq_m_n_uv8hi (__arm_vuninitializedq_u32 (), __a, __imm, __p);
>  }
> 
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n_s16 (int16x8_t __a, const int __imm, mve_pred16_t __p)
> +__arm_vshlltq_x_n_s8 (int8x16_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_n_sv8hi (__arm_vuninitializedq_s16 (), __a, __imm, __p);
> +  return __builtin_mve_vshlltq_m_n_sv16qi (__arm_vuninitializedq_s16 (), __a, __imm, __p);
>  }
> 
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n_s32 (int32x4_t __a, const int __imm, mve_pred16_t __p)
> -{
> -  return __builtin_mve_vshlq_m_n_sv4si (__arm_vuninitializedq_s32 (), __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n_u8 (uint8x16_t __a, const int __imm, mve_pred16_t __p)
> +__arm_vshlltq_x_n_s16 (int16x8_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_n_uv16qi (__arm_vuninitializedq_u8 (), __a, __imm, __p);
> +  return __builtin_mve_vshlltq_m_n_sv8hi (__arm_vuninitializedq_s32 (), __a, __imm, __p);
>  }
> 
>  __extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n_u16 (uint16x8_t __a, const int __imm, mve_pred16_t __p)
> +__arm_vshlltq_x_n_u8 (uint8x16_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_n_uv8hi (__arm_vuninitializedq_u16 (), __a, __imm, __p);
> +  return __builtin_mve_vshlltq_m_n_uv16qi (__arm_vuninitializedq_u16 (), __a, __imm, __p);
>  }
> 
>  __extension__ extern __inline uint32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n_u32 (uint32x4_t __a, const int __imm, mve_pred16_t __p)
> +__arm_vshlltq_x_n_u16 (uint16x8_t __a, const int __imm, mve_pred16_t __p)
>  {
> -  return __builtin_mve_vshlq_m_n_uv4si (__arm_vuninitializedq_u32 (), __a, __imm, __p);
> +  return __builtin_mve_vshlltq_m_n_uv8hi (__arm_vuninitializedq_u32 (), __a, __imm, __p);
>  }
> 
>  __extension__ extern __inline int8x16_t
> @@ -16275,48 +15589,6 @@ __arm_vcmpneq (uint32x4_t __a, uint32x4_t __b)
>   return __arm_vcmpneq_u32 (__a, __b);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq (int8x16_t __a, int8x16_t __b)
> -{
> - return __arm_vshlq_s8 (__a, __b);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq (int16x8_t __a, int16x8_t __b)
> -{
> - return __arm_vshlq_s16 (__a, __b);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq (int32x4_t __a, int32x4_t __b)
> -{
> - return __arm_vshlq_s32 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq (uint8x16_t __a, int8x16_t __b)
> -{
> - return __arm_vshlq_u8 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq (uint16x8_t __a, int16x8_t __b)
> -{
> - return __arm_vshlq_u16 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq (uint32x4_t __a, int32x4_t __b)
> -{
> - return __arm_vshlq_u32 (__a, __b);
> -}
> -
>  __extension__ extern __inline uint8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq (uint8x16_t __a, uint8x16_t __b)
> @@ -16457,27 +15729,6 @@ __arm_vaddvaq (uint32_t __a, uint8x16_t __b)
>   return __arm_vaddvaq_u8 (__a, __b);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r (uint8x16_t __a, int32_t __b)
> -{
> - return __arm_vshlq_r_u8 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq (uint8x16_t __a, int8x16_t __b)
> -{
> - return __arm_vqshlq_u8 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r (uint8x16_t __a, int32_t __b)
> -{
> - return __arm_vqshlq_r_u8 (__a, __b);
> -}
> -
>  __extension__ extern __inline uint8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq (uint8_t __a, int8x16_t __b)
> @@ -16513,13 +15764,6 @@ __arm_vbrsrq (uint8x16_t __a, int32_t __b)
>   return __arm_vbrsrq_n_u8 (__a, __b);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n (uint8x16_t __a, const int __imm)
> -{
> - return __arm_vshlq_n_u8 (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq (uint8x16_t __a, const int __imm)
> @@ -16527,13 +15771,6 @@ __arm_vrshrq (uint8x16_t __a, const int __imm)
>   return __arm_vrshrq_n_u8 (__a, __imm);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n (uint8x16_t __a, const int __imm)
> -{
> - return __arm_vqshlq_n_u8 (__a, __imm);
> -}
> -
>  __extension__ extern __inline mve_pred16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vcmpneq (int8x16_t __a, int8_t __b)
> @@ -16625,27 +15862,6 @@ __arm_vaddvq_p (int8x16_t __a, mve_pred16_t __p)
>   return __arm_vaddvq_p_s8 (__a, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r (int8x16_t __a, int32_t __b)
> -{
> - return __arm_vshlq_r_s8 (__a, __b);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq (int8x16_t __a, int8x16_t __b)
> -{
> - return __arm_vqshlq_s8 (__a, __b);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r (int8x16_t __a, int32_t __b)
> -{
> - return __arm_vqshlq_r_s8 (__a, __b);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq (int8x16_t __a, int8x16_t __b)
> @@ -16772,13 +15988,6 @@ __arm_vaddvaq (int32_t __a, int8x16_t __b)
>   return __arm_vaddvaq_s8 (__a, __b);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n (int8x16_t __a, const int __imm)
> -{
> - return __arm_vshlq_n_s8 (__a, __imm);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq (int8x16_t __a, const int __imm)
> @@ -16786,13 +15995,6 @@ __arm_vrshrq (int8x16_t __a, const int __imm)
>   return __arm_vrshrq_n_s8 (__a, __imm);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n (int8x16_t __a, const int __imm)
> -{
> - return __arm_vqshlq_n_s8 (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq (uint16x8_t __a, uint16x8_t __b)
> @@ -16933,27 +16135,6 @@ __arm_vaddvaq (uint32_t __a, uint16x8_t __b)
>   return __arm_vaddvaq_u16 (__a, __b);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r (uint16x8_t __a, int32_t __b)
> -{
> - return __arm_vshlq_r_u16 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq (uint16x8_t __a, int16x8_t __b)
> -{
> - return __arm_vqshlq_u16 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r (uint16x8_t __a, int32_t __b)
> -{
> - return __arm_vqshlq_r_u16 (__a, __b);
> -}
> -
>  __extension__ extern __inline uint16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq (uint16_t __a, int16x8_t __b)
> @@ -16989,13 +16170,6 @@ __arm_vbrsrq (uint16x8_t __a, int32_t __b)
>   return __arm_vbrsrq_n_u16 (__a, __b);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n (uint16x8_t __a, const int __imm)
> -{
> - return __arm_vshlq_n_u16 (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq (uint16x8_t __a, const int __imm)
> @@ -17003,13 +16177,6 @@ __arm_vrshrq (uint16x8_t __a, const int __imm)
>   return __arm_vrshrq_n_u16 (__a, __imm);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n (uint16x8_t __a, const int __imm)
> -{
> - return __arm_vqshlq_n_u16 (__a, __imm);
> -}
> -
>  __extension__ extern __inline mve_pred16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vcmpneq (int16x8_t __a, int16_t __b)
> @@ -17101,27 +16268,6 @@ __arm_vaddvq_p (int16x8_t __a, mve_pred16_t __p)
>   return __arm_vaddvq_p_s16 (__a, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r (int16x8_t __a, int32_t __b)
> -{
> - return __arm_vshlq_r_s16 (__a, __b);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq (int16x8_t __a, int16x8_t __b)
> -{
> - return __arm_vqshlq_s16 (__a, __b);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r (int16x8_t __a, int32_t __b)
> -{
> - return __arm_vqshlq_r_s16 (__a, __b);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq (int16x8_t __a, int16x8_t __b)
> @@ -17248,13 +16394,6 @@ __arm_vaddvaq (int32_t __a, int16x8_t __b)
>   return __arm_vaddvaq_s16 (__a, __b);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n (int16x8_t __a, const int __imm)
> -{
> - return __arm_vshlq_n_s16 (__a, __imm);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq (int16x8_t __a, const int __imm)
> @@ -17262,13 +16401,6 @@ __arm_vrshrq (int16x8_t __a, const int __imm)
>   return __arm_vrshrq_n_s16 (__a, __imm);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n (int16x8_t __a, const int __imm)
> -{
> - return __arm_vqshlq_n_s16 (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq (uint32x4_t __a, uint32x4_t __b)
> @@ -17409,27 +16541,6 @@ __arm_vaddvaq (uint32_t __a, uint32x4_t __b)
>   return __arm_vaddvaq_u32 (__a, __b);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r (uint32x4_t __a, int32_t __b)
> -{
> - return __arm_vshlq_r_u32 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq (uint32x4_t __a, int32x4_t __b)
> -{
> - return __arm_vqshlq_u32 (__a, __b);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r (uint32x4_t __a, int32_t __b)
> -{
> - return __arm_vqshlq_r_u32 (__a, __b);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq (uint32_t __a, int32x4_t __b)
> @@ -17465,13 +16576,6 @@ __arm_vbrsrq (uint32x4_t __a, int32_t __b)
>   return __arm_vbrsrq_n_u32 (__a, __b);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n (uint32x4_t __a, const int __imm)
> -{
> - return __arm_vshlq_n_u32 (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq (uint32x4_t __a, const int __imm)
> @@ -17479,13 +16583,6 @@ __arm_vrshrq (uint32x4_t __a, const int __imm)
>   return __arm_vrshrq_n_u32 (__a, __imm);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n (uint32x4_t __a, const int __imm)
> -{
> - return __arm_vqshlq_n_u32 (__a, __imm);
> -}
> -
>  __extension__ extern __inline mve_pred16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vcmpneq (int32x4_t __a, int32_t __b)
> @@ -17577,27 +16674,6 @@ __arm_vaddvq_p (int32x4_t __a, mve_pred16_t __p)
>   return __arm_vaddvq_p_s32 (__a, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_r (int32x4_t __a, int32_t __b)
> -{
> - return __arm_vshlq_r_s32 (__a, __b);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq (int32x4_t __a, int32x4_t __b)
> -{
> - return __arm_vqshlq_s32 (__a, __b);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_r (int32x4_t __a, int32_t __b)
> -{
> - return __arm_vqshlq_r_s32 (__a, __b);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vornq (int32x4_t __a, int32x4_t __b)
> @@ -17724,13 +16800,6 @@ __arm_vaddvaq (int32_t __a, int32x4_t __b)
>   return __arm_vaddvaq_s32 (__a, __b);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_n (int32x4_t __a, const int __imm)
> -{
> - return __arm_vshlq_n_s32 (__a, __imm);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq (int32x4_t __a, const int __imm)
> @@ -17738,13 +16807,6 @@ __arm_vrshrq (int32x4_t __a, const int __imm)
>   return __arm_vrshrq_n_s32 (__a, __imm);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_n (int32x4_t __a, const int __imm)
> -{
> - return __arm_vqshlq_n_s32 (__a, __imm);
> -}
> -
>  __extension__ extern __inline uint8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqmovntq (uint8x16_t __a, uint16x8_t __b)
> @@ -18501,20 +17563,6 @@ __arm_vsliq (uint8x16_t __a, uint8x16_t __b, const int __imm)
>   return __arm_vsliq_n_u8 (__a, __b, __imm);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r (uint8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_r_u8 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r (uint8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_r_u8 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_p (uint8_t __a, int8x16_t __b, mve_pred16_t __p)
> @@ -18627,13 +17675,6 @@ __arm_vcmpeqq_m (int8x16_t __a, int8_t __b, mve_pred16_t __p)
>   return __arm_vcmpeqq_m_n_s8 (__a, __b, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r (int8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_r_s8 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrev64q_m (int8x16_t __inactive, int8x16_t __a, mve_pred16_t __p)
> @@ -18641,13 +17682,6 @@ __arm_vrev64q_m (int8x16_t __inactive, int8x16_t __a, mve_pred16_t __p)
>   return __arm_vrev64q_m_s8 (__inactive, __a, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r (int8x16_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_r_s8 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqnegq_m (int8x16_t __inactive, int8x16_t __a, mve_pred16_t __p)
> @@ -19054,20 +18088,6 @@ __arm_vsliq (uint16x8_t __a, uint16x8_t __b, const int __imm)
>   return __arm_vsliq_n_u16 (__a, __b, __imm);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r (uint16x8_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_r_u16 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r (uint16x8_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_r_u16 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_p (uint16_t __a, int16x8_t __b, mve_pred16_t __p)
> @@ -19175,16 +18195,9 @@ __arm_vcmpeqq_m (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
> 
>  __extension__ extern __inline mve_pred16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vcmpeqq_m (int16x8_t __a, int16_t __b, mve_pred16_t __p)
> -{
> - return __arm_vcmpeqq_m_n_s16 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r (int16x8_t __a, int32_t __b, mve_pred16_t __p)
> +__arm_vcmpeqq_m (int16x8_t __a, int16_t __b, mve_pred16_t __p)
>  {
> - return __arm_vshlq_m_r_s16 (__a, __b, __p);
> + return __arm_vcmpeqq_m_n_s16 (__a, __b, __p);
>  }
> 
>  __extension__ extern __inline int16x8_t
> @@ -19194,13 +18207,6 @@ __arm_vrev64q_m (int16x8_t __inactive, int16x8_t __a, mve_pred16_t __p)
>   return __arm_vrev64q_m_s16 (__inactive, __a, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r (int16x8_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_r_s16 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqnegq_m (int16x8_t __inactive, int16x8_t __a, mve_pred16_t __p)
> @@ -19607,20 +18613,6 @@ __arm_vsliq (uint32x4_t __a, uint32x4_t __b, const int __imm)
>   return __arm_vsliq_n_u32 (__a, __b, __imm);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r (uint32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_r_u32 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r (uint32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_r_u32 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vminavq_p (uint32_t __a, int32x4_t __b, mve_pred16_t __p)
> @@ -19733,13 +18725,6 @@ __arm_vcmpeqq_m (int32x4_t __a, int32_t __b, mve_pred16_t __p)
>   return __arm_vcmpeqq_m_n_s32 (__a, __b, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_r (int32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_r_s32 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrev64q_m (int32x4_t __inactive, int32x4_t __a, mve_pred16_t __p)
> @@ -19747,13 +18732,6 @@ __arm_vrev64q_m (int32x4_t __inactive, int32x4_t __a, mve_pred16_t __p)
>   return __arm_vrev64q_m_s32 (__inactive, __a, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_r (int32x4_t __a, int32_t __b, mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_r_s32 (__a, __b, __p);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vqnegq_m (int32x4_t __inactive, int32x4_t __a, mve_pred16_t __p)
> @@ -20755,13 +19733,6 @@ __arm_vsriq_m (uint8x16_t __a, uint8x16_t __b, const int __imm, mve_pred16_t __p
>   return __arm_vsriq_m_n_u8 (__a, __b, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m (uint8x16_t __inactive, uint8x16_t __a, int8x16_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_u8 (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vabavq_p (uint32_t __a, uint8x16_t __b, uint8x16_t __c, mve_pred16_t __p)
> @@ -20769,13 +19740,6 @@ __arm_vabavq_p (uint32_t __a, uint8x16_t __b, uint8x16_t __c, mve_pred16_t __p)
>   return __arm_vabavq_p_u8 (__a, __b, __c, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_s8 (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vsriq_m (int16x8_t __a, int16x8_t __b, const int __imm, mve_pred16_t __p)
> @@ -20804,13 +19768,6 @@ __arm_vsriq_m (uint16x8_t __a, uint16x8_t __b, const int __imm, mve_pred16_t __p
>   return __arm_vsriq_m_n_u16 (__a, __b, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m (uint16x8_t __inactive, uint16x8_t __a, int16x8_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_u16 (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vabavq_p (uint32_t __a, uint16x8_t __b, uint16x8_t __c, mve_pred16_t __p)
> @@ -20818,13 +19775,6 @@ __arm_vabavq_p (uint32_t __a, uint16x8_t __b, uint16x8_t __c, mve_pred16_t __p)
>   return __arm_vabavq_p_u16 (__a, __b, __c, __p);
>  }
> 
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_s16 (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vsriq_m (int32x4_t __a, int32x4_t __b, const int __imm,
> mve_pred16_t __p)
> @@ -20853,13 +19803,6 @@ __arm_vsriq_m (uint32x4_t __a, uint32x4_t
> __b, const int __imm, mve_pred16_t __p
>   return __arm_vsriq_m_n_u32 (__a, __b, __imm, __p);
>  }
> 
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m (uint32x4_t __inactive, uint32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_u32 (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline uint32_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vabavq_p (uint32_t __a, uint32x4_t __b, uint32x4_t __c,
> mve_pred16_t __p)
> @@ -20867,13 +19810,6 @@ __arm_vabavq_p (uint32_t __a, uint32x4_t __b,
> uint32x4_t __c, mve_pred16_t __p)
>   return __arm_vabavq_p_u32 (__a, __b, __c, __p);
>  }
> 
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_s32 (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vbicq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> @@ -21735,90 +20671,6 @@ __arm_vqrdmlsdhxq_m (int16x8_t __inactive,
> int16x8_t __a, int16x8_t __b, mve_pre
>   return __arm_vqrdmlsdhxq_m_s16 (__inactive, __a, __b, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n (int8x16_t __inactive, int8x16_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_n_s8 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n (int32x4_t __inactive, int32x4_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_n_s32 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n (int16x8_t __inactive, int16x8_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_n_s16 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n (uint8x16_t __inactive, uint8x16_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_n_u8 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n (uint32x4_t __inactive, uint32x4_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_n_u32 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m_n (uint16x8_t __inactive, uint16x8_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_n_u16 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_s8 (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_s32 (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_s16 (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m (uint8x16_t __inactive, uint8x16_t __a, int8x16_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_u8 (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m (uint32x4_t __inactive, uint32x4_t __a, int32x4_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_u32 (__inactive, __a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vqshlq_m (uint16x8_t __inactive, uint16x8_t __a, int16x8_t __b,
> mve_pred16_t __p)
> -{
> - return __arm_vqshlq_m_u16 (__inactive, __a, __b, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_m (int8x16_t __inactive, int8x16_t __a, const int __imm,
> mve_pred16_t __p)
> @@ -21861,48 +20713,6 @@ __arm_vrshrq_m (uint16x8_t __inactive,
> uint16x8_t __a, const int __imm, mve_pred
>   return __arm_vrshrq_m_n_u16 (__inactive, __a, __imm, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n (int8x16_t __inactive, int8x16_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_n_s8 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n (int32x4_t __inactive, int32x4_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_n_s32 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n (int16x8_t __inactive, int16x8_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_n_s16 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n (uint8x16_t __inactive, uint8x16_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_n_u8 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n (uint32x4_t __inactive, uint32x4_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_n_u32 (__inactive, __a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_m_n (uint16x8_t __inactive, uint16x8_t __a, const int __imm,
> mve_pred16_t __p)
> -{
> - return __arm_vshlq_m_n_u16 (__inactive, __a, __imm, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vshrq_m (int8x16_t __inactive, int8x16_t __a, const int __imm,
> mve_pred16_t __p)
> @@ -24787,90 +23597,6 @@ __arm_vshlltq_x (uint16x8_t __a, const int
> __imm, mve_pred16_t __p)
>   return __arm_vshlltq_x_n_u16 (__a, __imm, __p);
>  }
> 
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_s8 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_s16 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_s32 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x (uint8x16_t __a, int8x16_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_u8 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x (uint16x8_t __a, int16x8_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_u16 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x (uint32x4_t __a, int32x4_t __b, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_u32 (__a, __b, __p);
> -}
> -
> -__extension__ extern __inline int8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n (int8x16_t __a, const int __imm, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_n_s8 (__a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n (int16x8_t __a, const int __imm, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_n_s16 (__a, __imm, __p);
> -}
> -
> -__extension__ extern __inline int32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n (int32x4_t __a, const int __imm, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_n_s32 (__a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint8x16_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n (uint8x16_t __a, const int __imm, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_n_u8 (__a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint16x8_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n (uint16x8_t __a, const int __imm, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_n_u16 (__a, __imm, __p);
> -}
> -
> -__extension__ extern __inline uint32x4_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -__arm_vshlq_x_n (uint32x4_t __a, const int __imm, mve_pred16_t __p)
> -{
> - return __arm_vshlq_x_n_u32 (__a, __imm, __p);
> -}
> -
>  __extension__ extern __inline int8x16_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vrshrq_x (int8x16_t __a, const int __imm, mve_pred16_t __p)
> @@ -28165,16 +26891,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_f16_u16
> (__ARM_mve_coerce(__p0, uint16x8_t)), \
>    int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_f32_u32
> (__ARM_mve_coerce(__p0, uint32x4_t)));})
> 
> -#define __arm_vshlq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vshlq_s8 (__ARM_mve_coerce(__p0, int8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vshlq_s16 (__ARM_mve_coerce(__p0, int16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vshlq_s32 (__ARM_mve_coerce(__p0, int32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)), \
> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vshlq_u8 (__ARM_mve_coerce(__p0, uint8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vshlq_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vshlq_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)));})
> -
>  #define __arm_vshrq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
>    int (*)[__ARM_mve_type_int8x16_t]: __arm_vshrq_n_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> @@ -28434,24 +27150,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_fp_n][__ARM_mve_type_float16x8_t]:
> __arm_vminnmvq_f16 (__ARM_mve_coerce2(p0, double),
> __ARM_mve_coerce(__p1, float16x8_t)), \
>    int (*)[__ARM_mve_type_fp_n][__ARM_mve_type_float32x4_t]:
> __arm_vminnmvq_f32 (__ARM_mve_coerce2(p0, double),
> __ARM_mve_coerce(__p1, float32x4_t)));})
> 
> -#define __arm_vshlq_r(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
> -#define __arm_vshlq_n(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_n_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_n_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_n_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_n_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_n_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_n_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
>  #define __arm_vshlltq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
>    int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlltq_n_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> @@ -28490,34 +27188,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_int16x8_t]: __arm_vqshluq_n_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
>    int (*)[__ARM_mve_type_int32x4_t]: __arm_vqshluq_n_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1));})
> 
> -#define __arm_vqshlq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vqshlq_s8 (__ARM_mve_coerce(__p0, int8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vqshlq_s16 (__ARM_mve_coerce(__p0, int16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vqshlq_s32 (__ARM_mve_coerce(__p0, int32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)), \
> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vqshlq_u8 (__ARM_mve_coerce(__p0, uint8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vqshlq_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vqshlq_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)));})
> -
> -#define __arm_vqshlq_r(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vqshlq_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vqshlq_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vqshlq_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vqshlq_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vqshlq_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vqshlq_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
> -#define __arm_vqshlq_n(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vqshlq_n_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vqshlq_n_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vqshlq_n_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vqshlq_n_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vqshlq_n_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vqshlq_n_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
>  #define __arm_vmlaldavxq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
>    __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> @@ -28756,24 +27426,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]:
> __arm_vsliq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, uint16x8_t), p2), \
>    int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]:
> __arm_vsliq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, uint32x4_t), p2));})
> 
> -#define __arm_vshlq_m_r(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_m_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_m_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_m_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_m_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_m_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_m_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));})
> -
> -#define __arm_vqshlq_m_r(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vqshlq_m_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vqshlq_m_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vqshlq_m_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vqshlq_m_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vqshlq_m_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vqshlq_m_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));})
> -
>  #define __arm_vqrdmlsdhxq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
>    __typeof(p1) __p1 = (p1); \
>    __typeof(p2) __p2 = (p2); \
> @@ -30170,44 +28822,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]:
> __arm_vcmpneq_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, uint16x8_t)), \
>    int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]:
> __arm_vcmpneq_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, uint32x4_t)));})
> 
> -#define __arm_vshlq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vshlq_s8 (__ARM_mve_coerce(__p0, int8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vshlq_s16 (__ARM_mve_coerce(__p0, int16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vshlq_s32 (__ARM_mve_coerce(__p0, int32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)), \
> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vshlq_u8 (__ARM_mve_coerce(__p0, uint8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vshlq_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vshlq_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)));})
> -
> -#define __arm_vshlq_r(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
> -#define __arm_vqshlq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vqshlq_s8 (__ARM_mve_coerce(__p0, int8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vqshlq_s16 (__ARM_mve_coerce(__p0, int16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vqshlq_s32 (__ARM_mve_coerce(__p0, int32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)), \
> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vqshlq_u8 (__ARM_mve_coerce(__p0, uint8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t)), \
> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vqshlq_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t)), \
> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vqshlq_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t)));})
> -
> -#define __arm_vqshlq_r(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vqshlq_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vqshlq_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vqshlq_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vqshlq_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vqshlq_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vqshlq_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
>  #define __arm_vqshluq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
>    int (*)[__ARM_mve_type_int8x16_t]: __arm_vqshluq_n_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> @@ -30223,24 +28837,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint16x8_t]: __arm_vrshrq_n_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
>    int (*)[__ARM_mve_type_uint32x4_t]: __arm_vrshrq_n_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> 
> -#define __arm_vshlq_n(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_n_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_n_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_n_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_n_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_n_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_n_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
> -#define __arm_vqshlq_n(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vqshlq_n_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vqshlq_n_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vqshlq_n_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vqshlq_n_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vqshlq_n_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vqshlq_n_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1));})
> -
>  #define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
>    __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> @@ -30588,15 +29184,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vqrdmlsdhxq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t)), \
>    int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vqrdmlsdhxq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t)));})
> 
> -#define __arm_vqshlq_m_r(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vqshlq_m_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vqshlq_m_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vqshlq_m_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vqshlq_m_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vqshlq_m_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vqshlq_m_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));})
> -
>  #define __arm_vrev64q_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
>    __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> @@ -30607,15 +29194,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]:
> __arm_vrev64q_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, uint16x8_t), p2), \
>    int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]:
> __arm_vrev64q_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, uint32x4_t), p2));})
> 
> -#define __arm_vshlq_m_r(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_m_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_m_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_m_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_m_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_m_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_m_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));})
> -
>  #define __arm_vsliq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
>    __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> @@ -31514,16 +30092,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_int8x16_t]: __arm_vrev16q_x_s8
> (__ARM_mve_coerce(__p1, int8x16_t), p2), \
>    int (*)[__ARM_mve_type_uint8x16_t]: __arm_vrev16q_x_u8
> (__ARM_mve_coerce(__p1, uint8x16_t), p2));})
> 
> -#define __arm_vshlq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \
> -  __typeof(p2) __p2 = (p2); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0,
> \
> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vshlq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t),
> __ARM_mve_coerce(__p2, int8x16_t), p3), \
> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vshlq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t),
> __ARM_mve_coerce(__p2, int16x8_t), p3), \
> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vshlq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t),
> __ARM_mve_coerce(__p2, int32x4_t), p3), \
> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vshlq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t),
> __ARM_mve_coerce(__p2, int8x16_t), p3), \
> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vshlq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t),
> __ARM_mve_coerce(__p2, int16x8_t), p3), \
> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vshlq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t),
> __ARM_mve_coerce(__p2, int32x4_t), p3));})
> -
>  #define __arm_vrshrq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \
>    int (*)[__ARM_mve_type_int8x16_t]: __arm_vrshrq_x_n_s8
> (__ARM_mve_coerce(__p1, int8x16_t), p2, p3), \
> @@ -31547,15 +30115,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlltq_x_n_u8
> (__ARM_mve_coerce(__p1, uint8x16_t), p2, p3), \
>    int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlltq_x_n_u16
> (__ARM_mve_coerce(__p1, uint16x8_t), p2, p3));})
> 
> -#define __arm_vshlq_x_n(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_x_n_s8
> (__ARM_mve_coerce(__p1, int8x16_t), p2, p3), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_x_n_s16
> (__ARM_mve_coerce(__p1, int16x8_t), p2, p3), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_x_n_s32
> (__ARM_mve_coerce(__p1, int32x4_t), p2, p3), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_x_n_u8
> (__ARM_mve_coerce(__p1, uint8x16_t), p2, p3), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_x_n_u16
> (__ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_x_n_u32
> (__ARM_mve_coerce(__p1, uint32x4_t), p2, p3));})
> -
>  #define __arm_vdwdupq_x_u8(p1,p2,p3,p4) ({ __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \
>    int (*)[__ARM_mve_type_int_n]: __arm_vdwdupq_x_n_u8 ((uint32_t)
> __p1, p2, p3, p4), \
> @@ -31771,27 +30330,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vqdmlashq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce3(p2, int), p3), \
>    int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vqdmlashq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce3(p2, int), p3));})
> 
> -#define __arm_vqshlq_m_n(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vqshlq_m_n_s8 (__ARM_mve_coerce(__p0, int8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vqshlq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vqshlq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]:
> __arm_vqshlq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t),
> __ARM_mve_coerce(__p1, uint8x16_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]:
> __arm_vqshlq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, uint16x8_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]:
> __arm_vqshlq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, uint32x4_t),  p2, p3));})
> -
> -#define __arm_vqshlq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  __typeof(p2) __p2 = (p2); \
> -  _Generic( (int
> (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typ
> eid(__p2)])0, \
> -  int
> (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve
> _type_int8x16_t]: __arm_vqshlq_m_s8 (__ARM_mve_coerce(__p0,
> int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2,
> int8x16_t), p3), \
> -  int
> (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve
> _type_int16x8_t]: __arm_vqshlq_m_s16 (__ARM_mve_coerce(__p0,
> int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2,
> int16x8_t), p3), \
> -  int
> (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve
> _type_int32x4_t]: __arm_vqshlq_m_s32 (__ARM_mve_coerce(__p0,
> int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2,
> int32x4_t), p3), \
> -  int
> (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_m
> ve_type_int8x16_t]: __arm_vqshlq_m_u8 (__ARM_mve_coerce(__p0,
> uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t),
> __ARM_mve_coerce(__p2, int8x16_t), p3), \
> -  int
> (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_m
> ve_type_int16x8_t]: __arm_vqshlq_m_u16 (__ARM_mve_coerce(__p0,
> uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t),
> __ARM_mve_coerce(__p2, int16x8_t), p3), \
> -  int
> (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_m
> ve_type_int32x4_t]: __arm_vqshlq_m_u32 (__ARM_mve_coerce(__p0,
> uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t),
> __ARM_mve_coerce(__p2, int32x4_t), p3));})
> -
>  #define __arm_vrshrq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
>    __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> @@ -32044,36 +30582,6 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vqshluq_m_n_s16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t), p2, p3), \
>    int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vqshluq_m_n_s32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t), p2, p3));})
> 
> -#define __arm_vshlq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  __typeof(p2) __p2 = (p2); \
> -  _Generic( (int
> (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typ
> eid(__p2)])0, \
> -  int
> (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve
> _type_int8x16_t]: __arm_vshlq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t),
> p3), \
> -  int
> (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve
> _type_int16x8_t]: __arm_vshlq_m_s16 (__ARM_mve_coerce(__p0,
> int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2,
> int16x8_t), p3), \
> -  int
> (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve
> _type_int32x4_t]: __arm_vshlq_m_s32 (__ARM_mve_coerce(__p0,
> int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2,
> int32x4_t), p3), \
> -  int
> (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_m
> ve_type_int8x16_t]: __arm_vshlq_m_u8 (__ARM_mve_coerce(__p0,
> uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t),
> __ARM_mve_coerce(__p2, int8x16_t), p3), \
> -  int
> (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_m
> ve_type_int16x8_t]: __arm_vshlq_m_u16 (__ARM_mve_coerce(__p0,
> uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t),
> __ARM_mve_coerce(__p2, int16x8_t), p3), \
> -  int
> (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_m
> ve_type_int32x4_t]: __arm_vshlq_m_u32 (__ARM_mve_coerce(__p0,
> uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t),
> __ARM_mve_coerce(__p2, int32x4_t), p3));})
> -
> -#define __arm_vshlq_m_n(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
> -  __typeof(p1) __p1 = (p1); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]:
> __arm_vshlq_m_n_s8 (__ARM_mve_coerce(__p0, int8x16_t),
> __ARM_mve_coerce(__p1, int8x16_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]:
> __arm_vshlq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t),
> __ARM_mve_coerce(__p1, int16x8_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]:
> __arm_vshlq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t),
> __ARM_mve_coerce(__p1, int32x4_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]:
> __arm_vshlq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t),
> __ARM_mve_coerce(__p1, uint8x16_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]:
> __arm_vshlq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t),
> __ARM_mve_coerce(__p1, uint16x8_t),  p2, p3), \
> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]:
> __arm_vshlq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t),
> __ARM_mve_coerce(__p1, uint32x4_t),  p2, p3));})
> -
> -#define __arm_vshlq_m_r(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
> -  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> -  int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlq_m_r_s8
> (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlq_m_r_s16
> (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlq_m_r_s32
> (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlq_m_r_u8
> (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlq_m_r_u16
> (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \
> -  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlq_m_r_u32
> (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));})
> -
>  #define __arm_vsriq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
>    __typeof(p1) __p1 = (p1); \
>    _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> \
> --
> 2.34.1



Thread overview: 46+ messages
2023-05-05  8:39 [PATCH 01/23] arm: [MVE intrinsics] add binary_round_lshift shape Christophe Lyon
2023-05-05  8:39 ` [PATCH 02/23] arm: [MVE intrinsics] factorize vqrshlq vrshlq Christophe Lyon
2023-05-05  9:58   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 03/23] arm: [MVE intrinsics] rework vrshlq vqrshlq Christophe Lyon
2023-05-05  9:59   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 04/23] arm: [MVE intrinsics] factorize vqshlq vshlq Christophe Lyon
2023-05-05 10:00   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 05/23] arm: [MVE intrinsics] rework vqrdmulhq Christophe Lyon
2023-05-05 10:01   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 06/23] arm: [MVE intrinsics] factorize vabdq Christophe Lyon
2023-05-05 10:48   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 07/23] arm: [MVE intrinsics] rework vabdq Christophe Lyon
2023-05-05 10:49   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 08/23] arm: [MVE intrinsics] add binary_lshift shape Christophe Lyon
2023-05-05 10:51   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 09/23] arm: [MVE intrinsics] add support for MODE_r Christophe Lyon
2023-05-05 10:55   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 10/23] arm: [MVE intrinsics] add binary_lshift_r shape Christophe Lyon
2023-05-05 10:56   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 11/23] arm: [MVE intrinsics] add unspec_mve_function_exact_insn_vshl Christophe Lyon
2023-05-05 10:56   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 12/23] arm: [MVE intrinsics] rework vqshlq vshlq Christophe Lyon
2023-05-05 10:58   ` Kyrylo Tkachov [this message]
2023-05-05  8:39 ` [PATCH 13/23] arm: [MVE intrinsics] factorize vmaxq vminq Christophe Lyon
2023-05-05 10:58   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 14/23] arm: [MVE intrinsics] rework " Christophe Lyon
2023-05-05 10:59   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 15/23] arm: [MVE intrinsics] add binary_rshift_narrow shape Christophe Lyon
2023-05-05 11:00   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 16/23] arm: [MVE intrinsics] factorize vshrntq vshrnbq vrshrnbq vrshrntq vqshrnbq vqshrntq vqrshrnbq vqrshrntq Christophe Lyon
2023-05-05 11:00   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 17/23] arm: [MVE intrinsics] rework vshrnbq vshrntq " Christophe Lyon
2023-05-05 11:02   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 18/23] arm: [MVE intrinsics] add binary_rshift_narrow_unsigned shape Christophe Lyon
2023-05-05 11:03   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 19/23] arm: [MVE intrinsics] factorize vqrshrunb vqrshrunt vqshrunb vqshrunt Christophe Lyon
2023-05-05 11:04   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 20/23] arm: [MVE intrinsics] rework vqrshrunbq vqrshruntq vqshrunbq vqshruntq Christophe Lyon
2023-05-05 11:05   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 21/23] arm: [MVE intrinsics] add binary_rshift shape Christophe Lyon
2023-05-05 11:05   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 22/23] arm: [MVE intrinsics] factorize vsrhrq vrshrq Christophe Lyon
2023-05-05 11:06   ` Kyrylo Tkachov
2023-05-05  8:39 ` [PATCH 23/23] arm: [MVE intrinsics] rework vshrq vrshrq Christophe Lyon
2023-05-05 11:07   ` Kyrylo Tkachov
2023-05-05  9:55 ` [PATCH 01/23] arm: [MVE intrinsics] add binary_round_lshift shape Kyrylo Tkachov
