From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Add a compatible_vector_types_p target hook
Date: Tue, 07 Jan 2020 10:33:00 -0000

Richard Sandiford writes:
> Richard Biener writes:
>> On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford wrote:
>>> Richard Biener writes:
>>>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford wrote:
>>>>> Richard Biener writes:
>>>>>>>>> The AArch64 port emits an error if calls pass values of SVE type
>>>>>>>>> to an unprototyped function.  To do that we need to know whether
>>>>>>>>> the value really is an SVE type rather than a plain vector.
>>>>>>>>>
>>>>>>>>> For varargs the ABI is the same for 256 bits+.  But we'll have
>>>>>>>>> the same problem there once we support -msve-vector-bits=128,
>>>>>>>>> since the layouts of SVE and Advanced SIMD vectors differ for
>>>>>>>>> big-endian.
>>>>>>>>
>>>>>>>> But then why don't you have different modes?
>>>>>>>
>>>>>>> Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>>>>>> difference.  But from a vector value POV, a vector of 4 ints is
>>>>>>> a vector of 4 ints, so even distinguishing based on the mode is
>>>>>>> artificial.
>>>>>>
>>>>>> True.
>>>>>>
>>>>>>> SVE is AFAIK the first target to have different modes for
>>>>>>> potentially the "same" vector type, and I had to add new
>>>>>>> infrastructure to allow targets to define multiple modes of the
>>>>>>> same size.  So the fact that gimple distinguishes otherwise
>>>>>>> identical vectors based on mode is a relatively recent thing.
>>>>>>> AFAIK it just fell out in the wash rather than being deliberately
>>>>>>> planned.  It happens to be convenient in this context, but it
>>>>>>> hasn't been important until now.
>>>>>>>
>>>>>>> The hook doesn't seem any worse than distinguishing based on the
>>>>>>> mode.
>>>>>>> Another way to avoid this would have been to define separate SVE
>>>>>>> modes for the predefined vectors.  The big downside of that is
>>>>>>> that we'd end up doubling the number of SVE patterns.
>>>>>>>
>>>>>>> Extra on-the-side metadata is going to be easy to drop
>>>>>>> accidentally, and this is something we need for correctness
>>>>>>> rather than optimisation.
>>>>>>
>>>>>> Still, selecting the ABI only during call expansion, based on the
>>>>>> value types at that point, is fragile.
>>>>>
>>>>> Agreed.  But it's fragile in general, not just for this case.
>>>>> Changing something as fundamental as that would be a lot of work
>>>>> and seems likely to introduce accidental ABI breakage.
>>>>>
>>>>>> The frontends are in charge of specifying the actual argument type
>>>>>> and at that point the target may fix the ABI.  The ABI can be
>>>>>> recorded in the call's fntype, either via its TYPE_ARG_TYPES or in
>>>>>> more awkward ways for varargs functions (in full generality that
>>>>>> would mean attaching varargs ABI metadata to each call).
>>>>>>
>>>>>> The alternative is to have an actual argument type vector
>>>>>> associated with each call.
>>>>>
>>>>> I think multiple pieces of gimple code would then have to cope with
>>>>> that as a special case.  E.g. if:
>>>>>
>>>>>   void foo (int, ...);
>>>>>
>>>>>   type1 a;
>>>>>   type2 b = VIEW_CONVERT_EXPR <type2> (a);
>>>>>   if (a)
>>>>>     foo (1, a);
>>>>>   else
>>>>>     foo (1, b);
>>>>>
>>>>> gets converted to:
>>>>>
>>>>>   if (a)
>>>>>     foo (1, a);
>>>>>   else
>>>>>     foo (1, a);
>>>>>
>>>>> on the basis that type1 and type2 are "the same" despite having
>>>>> different calling conventions, we have to be sure that the calls
>>>>> are not treated as equivalent:
>>>>>
>>>>>   foo (1, a);
>>>>>
>>>>> Things like IPA clones would also need to handle this specially.
>>>>> Anything that generates new calls based on old ones will need
>>>>> to copy this information too.
>>>>>
>>>>> This also sounds like it would be fragile and seems a bit too
>>>>> invasive for stage 3.
>>>>
>>>> But we are already relying on this to work (fntype non-propagation)
>>>> because function pointer conversions are dropped on the floor.
>>>>
>>>> The real change would be introducing (per call) fntype for calls to
>>>> unprototyped functions and somehow dealing with varargs.
>>>
>>> It looks like this itself relies on useless_type_conversion_p,
>>> is that right?  E.g. we have things like:
>>>
>>>   bool
>>>   func_checker::compare_gimple_call (gcall *s1, gcall *s2)
>>>   {
>>>     ...
>>>     tree fntype1 = gimple_call_fntype (s1);
>>>     tree fntype2 = gimple_call_fntype (s2);
>>>     if ((fntype1 && !fntype2)
>>>         || (!fntype1 && fntype2)
>>>         || (fntype1 && !types_compatible_p (fntype1, fntype2)))
>>>       return return_false_with_msg ("call function types are not compatible");
>>>
>>> and useless_type_conversion_p has:
>>>
>>>   else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
>>>             || TREE_CODE (inner_type) == METHOD_TYPE)
>>>            && TREE_CODE (inner_type) == TREE_CODE (outer_type))
>>>     {
>>>       tree outer_parm, inner_parm;
>>>
>>>       /* If the return types are not compatible bail out.  */
>>>       if (!useless_type_conversion_p (TREE_TYPE (outer_type),
>>>                                       TREE_TYPE (inner_type)))
>>>         return false;
>>>
>>>       /* Method types should belong to a compatible base class.  */
>>>       if (TREE_CODE (inner_type) == METHOD_TYPE
>>>           && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
>>>                                          TYPE_METHOD_BASETYPE (inner_type)))
>>>         return false;
>>>
>>>       /* A conversion to an unprototyped argument list is ok.  */
>>>       if (!prototype_p (outer_type))
>>>         return true;
>>>
>>>       /* If the unqualified argument types are compatible the conversion
>>>          is useless.  */
>>>       if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
>>>         return true;
>>>
>>>       for (outer_parm = TYPE_ARG_TYPES (outer_type),
>>>            inner_parm = TYPE_ARG_TYPES (inner_type);
>>>            outer_parm && inner_parm;
>>>            outer_parm = TREE_CHAIN (outer_parm),
>>>            inner_parm = TREE_CHAIN (inner_parm))
>>>         if (!useless_type_conversion_p
>>>               (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
>>>                TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
>>>           return false;
>>>
>>> So it looks like we'd still need to distinguish the vector types in
>>> useless_type_conversion_p even if we went the fntype route.  The
>>> difference is that the fntype route would give us the option of only
>>> distinguishing the vectors for return and argument types and not in
>>> general.
>>>
>>> But if we are going to have to distinguish the vectors here anyway
>>> in some form, could we go with the patch as-is for stage 3 and leave
>>> restricting this to just return and argument types as a follow-on
>>> optimisation?
>>
>> How does this get around the LTO canonical type merging machinery?
>> That is, how are those types streamed and how are they identified by
>> the backend?  Just by means of being pointer equal to some statically
>> built type in the backend?  Or does the type have some attribute on
>> it or on the component?  How does the middle end build a related type
>> with the same ABI, like a vector with half the number of elements?
>
> Hmm...
>
> At the moment it's based on pointer equality between the
> TYPE_MAIN_VARIANT and statically-built types.  We predefine the only
> available SVE "ABI types" and there's no way to create "new" ones.
>
> But you're right that that doesn't work for LTO -- in general, not
> just for this conversion patch -- because no streamed types end up as
> ABI types.  So we'll need an attribute after all, with the ABI
> decisions keyed off that rather than TYPE_MAIN_VARIANT pointer
> equality.  Will fix...

Now fixed :-)

> Once that's fixed, the fact that we use SET_TYPE_STRUCTURAL_EQUALITY
> for the ABI types means that the types remain distinct from "normal"
> vector types even for TYPE_CANONICAL purposes, since:
>
>   As a special case, if TYPE_CANONICAL is NULL_TREE, and thus
>   TYPE_STRUCTURAL_EQUALITY_P is true, then it cannot
>   be used for comparison against other types.  Instead, the type is
>   said to require structural equality checks, described in
>   TYPE_STRUCTURAL_EQUALITY_P.
>   [...]
>   #define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type_common.canonical)
>
>   /* Indicates that the type node requires structural equality
>      checks.  The compiler will need to look at the composition of the
>      type to determine whether it is equal to another type, rather than
>      just comparing canonical type pointers.  For instance, we would need
>      to look at the return and parameter types of a FUNCTION_TYPE
>      node.  */
>   #define TYPE_STRUCTURAL_EQUALITY_P(NODE) (TYPE_CANONICAL (NODE) == NULL_TREE)
>
> We also have:
>
>   /* Return true if get_alias_set cares about TYPE_CANONICAL of the
>      given type.  We don't define the types for pointers, arrays and
>      vectors.  The reason is that pointers are handled specially:
>      ptr_type_node accesses conflict with accesses to all other
>      pointers.  This is done by alias.c.  Because alias sets of arrays
>      and vectors are the same as types of their elements, we can't
>      compute canonical type either.  Otherwise we could go from
>      void *[10] to int *[10] (because they are equivalent for canonical
>      type machinery) and get wrong TBAA.  */
>
>   inline bool
>   canonical_type_used_p (const_tree t)
>   {
>     return !(POINTER_TYPE_P (t)
>              || TREE_CODE (t) == ARRAY_TYPE
>              || TREE_CODE (t) == VECTOR_TYPE);
>   }
>
> So with the attribute added (needed anyway), the patch does seem to
> work for LTO too.
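To make the ABI distinction concrete: with -msve-vector-bits=256, a
256-bit GNU vector type and the corresponding SVE ACLE type have the
same mode, size and element type, but are passed differently.  That is
exactly the difference the hook lets the target preserve.  A minimal
illustration, not part of the patch (the function names are made up):

  #include <arm_sve.h>

  typedef int32_t int32x8_t __attribute__((vector_size (32)));

  void gnu_callee (int32x8_t);  /* base AAPCS64: passed by reference */
  void sve_callee (svint32_t);  /* SVE PCS: passed in register z0 */

Without the hook, useless_type_conversion_p would treat conversions
between int32x8_t and svint32_t as useless, so gimple would be free to
substitute one value for the other across a call boundary and silently
change which calling convention is used.
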
Given the above, is the patch OK?  I agree it isn't very elegant, but
at the moment we have no choice but to distinguish the vector types at
some point during gimple.

Thanks,
Richard


2020-01-07  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (compatible_vector_types_p): New target hook.
	* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
	* hooks.c (hook_bool_const_tree_const_tree_true): New function.
	* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
	* doc/tm.texi: Regenerate.
	* gimple-expr.c: Include target.h.
	(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
	* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p):
	New function.
	(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
	* config/aarch64/aarch64-sve-builtins.cc
	(gimple_folder::convert_pred): Use the original predicate if it
	already has a suitable type.

gcc/testsuite/
	* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
	* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2020-01-06 12:57:55.753930730 +0000
+++ gcc/target.def	2020-01-07 10:24:01.546344751 +0000
@@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
  hook_bool_mode_false)
 
 DEFHOOK
+(compatible_vector_types_p,
+ "Return true if there is no target-specific reason for treating\n\
+vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
+has already checked for target-independent reasons, meaning that the\n\
+types are known to have the same mode, to have the same number of elements,\n\
+and to have what the caller considers to be compatible element types.\n\
+\n\
+The main reason for defining this hook is to reject pairs of types\n\
+that are handled differently by the target's calling convention.\n\
+For example, when a new @var{N}-bit vector architecture is added\n\
+to a target, the target may want to handle normal @var{N}-bit\n\
+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
+before, to maintain backwards compatibility.  However, it may also\n\
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
+and returned in a more efficient way.  It is then important to maintain\n\
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
+architecture-specific ones.\n\
+\n\
+The default implementation returns true, which is correct for most targets.",
+ bool, (const_tree type1, const_tree type2),
+ hook_bool_const_tree_const_tree_true)
+
+DEFHOOK
 (vector_alignment,
  "This hook can be used to define the alignment for a vector of type\n\
 @var{type}, in order to comply with a platform ABI.  The default is to\n\
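(For illustration only, not part of the patch: a target with this kind
of old/new vector split would typically implement the hook as a simple
predicate comparison.  new_arch_vector_type_p below is a made-up
stand-in for however the target identifies its architecture-specific
types, e.g. via the attribute discussed above.  The aarch64
implementation later in the patch has exactly this shape, with
aarch64_sve::builtin_type_p as the predicate.)

  /* Sketch only: vector types are compatible unless exactly one of them
     is one of the target's new architecture-specific types.
     new_arch_vector_type_p is a hypothetical predicate.  */
  static bool
  example_compatible_vector_types_p (const_tree type1, const_tree type2)
  {
    return (new_arch_vector_type_p (type1)
	    == new_arch_vector_type_p (type2));
  }
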
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h	2020-01-06 12:57:54.749937335 +0000
+++ gcc/hooks.h	2020-01-07 10:24:01.542344777 +0000
@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
 extern bool hook_bool_const_tree_false (const_tree);
+extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
 extern bool hook_bool_tree_true (tree);
 extern bool hook_bool_const_tree_true (const_tree);
 extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c	2020-01-06 12:57:54.745937361 +0000
+++ gcc/hooks.c	2020-01-07 10:24:01.542344777 +0000
@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
 }
 
 bool
+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
+{
+  return true;
+}
+
+bool
 hook_bool_tree_true (tree)
 {
   return true;
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2020-01-06 12:57:53.657944518 +0000
+++ gcc/doc/tm.texi.in	2020-01-07 10:24:01.542344777 +0000
@@ -3365,6 +3365,8 @@ stack.
 
 @hook TARGET_VECTOR_MODE_SUPPORTED_P
 
+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
+
 @hook TARGET_ARRAY_MODE
 
 @hook TARGET_ARRAY_MODE_SUPPORTED_P
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2020-01-06 12:57:53.649944570 +0000
+++ gcc/doc/tm.texi	2020-01-07 10:24:01.542344777 +0000
@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
 must have move patterns for this mode.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
+Return true if there is no target-specific reason for treating
+vector types @var{type1} and @var{type2} as distinct types.  The caller
+has already checked for target-independent reasons, meaning that the
+types are known to have the same mode, to have the same number of elements,
+and to have what the caller considers to be compatible element types.
+
+The main reason for defining this hook is to reject pairs of types
+that are handled differently by the target's calling convention.
+For example, when a new @var{N}-bit vector architecture is added
+to a target, the target may want to handle normal @var{N}-bit
+@code{VECTOR_TYPE} arguments and return values in the same way as
+before, to maintain backwards compatibility.  However, it may also
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
+and returned in a more efficient way.  It is then important to maintain
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
+architecture-specific ones.
+
+The default implementation returns true, which is correct for most targets.
+@end deftypefn
+
 @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
 Return the mode that GCC should use for an array that has
 @var{nelems} elements, with each element having mode @var{mode}.
Index: gcc/gimple-expr.c
===================================================================
--- gcc/gimple-expr.c	2020-01-06 12:58:10.545833431 +0000
+++ gcc/gimple-expr.c	2020-01-07 10:24:01.542344777 +0000
@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "target.h"
 
 /* ----- Type related -----  */
 
@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
 
   /* Recurse for vector types with the same number of subparts.  */
   else if (TREE_CODE (inner_type) == VECTOR_TYPE
-	   && TREE_CODE (outer_type) == VECTOR_TYPE
-	   && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
-    return useless_type_conversion_p (TREE_TYPE (outer_type),
-				      TREE_TYPE (inner_type));
+	   && TREE_CODE (outer_type) == VECTOR_TYPE)
+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
+		      TYPE_VECTOR_SUBPARTS (outer_type))
+	    && useless_type_conversion_p (TREE_TYPE (outer_type),
+					  TREE_TYPE (inner_type))
+	    && targetm.compatible_vector_types_p (inner_type, outer_type));
 
   else if (TREE_CODE (inner_type) == ARRAY_TYPE
 	   && TREE_CODE (outer_type) == ARRAY_TYPE)
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2020-01-07 10:18:06.572651552 +0000
+++ gcc/config/aarch64/aarch64.c	2020-01-07 10:24:01.538344801 +0000
@@ -2098,6 +2098,15 @@ aarch64_fntype_abi (const_tree fntype)
   return default_function_abi;
 }
 
+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
+
+static bool
+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
+{
+  return (aarch64_sve::builtin_type_p (type1)
+	  == aarch64_sve::builtin_type_p (type2));
+}
+
 /* Return true if we should emit CFI for register REGNO.  */
 
 static bool
@@ -22099,6 +22108,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
+#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
+
 #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   aarch64_builtin_support_vector_misalignment
Index: gcc/config/aarch64/aarch64-sve-builtins.cc
===================================================================
--- gcc/config/aarch64/aarch64-sve-builtins.cc	2020-01-07 10:21:17.575410530 +0000
+++ gcc/config/aarch64/aarch64-sve-builtins.cc	2020-01-07 10:24:01.534344828 +0000
@@ -2265,9 +2265,13 @@ tree
 gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
			     unsigned int argno)
 {
-  tree predtype = truth_type_for (vectype);
   tree pred = gimple_call_arg (call, argno);
-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
+		TYPE_VECTOR_SUBPARTS (vectype)))
+    return pred;
+
+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
+		       truth_type_for (vectype), pred);
 }
 
 /* Return a pointer to the address in a contiguous load or store,
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c	2020-01-07 10:24:01.546344751 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (float16x16_t);
+void float32_callee (float32x8_t);
+void float64_callee (float64x4_t);
+void int8_callee (int8x32_t);
+void int16_callee (int16x16_t);
+void int32_callee (int32x8_t);
+void int64_callee (int64x4_t);
+void uint8_callee (uint8x32_t);
+void uint16_callee (uint16x16_t);
+void uint32_callee (uint32x8_t);
+void uint64_callee (uint64x4_t);
+
+void
+float16_caller (void)
+{
+  float16_callee (svdup_f16 (1.0));
+}
+
+void
+float32_caller (void)
+{
+  float32_callee (svdup_f32 (2.0));
+}
+
+void
+float64_caller (void)
+{
+  float64_callee (svdup_f64 (3.0));
+}
+
+void
+int8_caller (void)
+{
+  int8_callee (svindex_s8 (0, 1));
+}
+
+void
+int16_caller (void)
+{
+  int16_callee (svindex_s16 (0, 2));
+}
+
+void
+int32_caller (void)
+{
+  int32_callee (svindex_s32 (0, 3));
+}
+
+void
+int64_caller (void)
+{
+  int64_callee (svindex_s64 (0, 4));
+}
+
+void
+uint8_caller (void)
+{
+  uint8_callee (svindex_u8 (1, 1));
+}
+
+void
+uint16_caller (void)
+{
+  uint16_callee (svindex_u16 (1, 2));
+}
+
+void
+uint32_caller (void)
+{
+  uint32_callee (svindex_u32 (1, 3));
+}
+
+void
+uint64_caller (void)
+{
+  uint64_callee (svindex_u64 (1, 4));
+}
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c	2020-01-07 10:24:01.546344751 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (svfloat16_t);
+void float32_callee (svfloat32_t);
+void float64_callee (svfloat64_t);
+void int8_callee (svint8_t);
+void int16_callee (svint16_t);
+void int32_callee (svint32_t);
+void int64_callee (svint64_t);
+void uint8_callee (svuint8_t);
+void uint16_callee (svuint16_t);
+void uint32_callee (svuint32_t);
+void uint64_callee (svuint64_t);
+
+void
+float16_caller (float16x16_t arg)
+{
+  float16_callee (arg);
+}
+
+void
+float32_caller (float32x8_t arg)
+{
+  float32_callee (arg);
+}
+
+void
+float64_caller (float64x4_t arg)
+{
+  float64_callee (arg);
+}
+
+void
+int8_caller (int8x32_t arg)
+{
+  int8_callee (arg);
+}
+
+void
+int16_caller (int16x16_t arg)
+{
+  int16_callee (arg);
+}
+
+void
+int32_caller (int32x8_t arg)
+{
+  int32_callee (arg);
+}
+
+void
+int64_caller (int64x4_t arg)
+{
+  int64_callee (arg);
+}
+
+void
+uint8_caller (uint8x32_t arg)
+{
+  uint8_callee (arg);
+}
+
+void
+uint16_caller (uint16x16_t arg)
+{
+  uint16_callee (arg);
+}
+
+void
+uint32_caller (uint32x8_t arg)
+{
+  uint32_callee (arg);
+}
+
+void
+uint64_caller (uint64x4_t arg)
+{
+  uint64_callee (arg);
+}
+
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */
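
For reference, the motivating case from the top of the thread --
passing an SVE value to an unprototyped function -- looks like this
(illustration only, not part of the patch; the names are made up):

  #include <arm_sve.h>

  void unprototyped ();		/* no prototype: parameter types unknown */

  void
  caller (svint32_t x)
  {
    /* The AArch64 port reports an error here, since it cannot assume
       that the callee expects the SVE calling convention.  */
    unprototyped (x);
  }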