From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6250 invoked by alias); 12 Dec 2019 17:20:30 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 6241 invoked by uid 89); 12 Dec 2019 17:20:30 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-9.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mail-wr1-f45.google.com Received: from mail-wr1-f45.google.com (HELO mail-wr1-f45.google.com) (209.85.221.45) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 12 Dec 2019 17:20:26 +0000 Received: by mail-wr1-f45.google.com with SMTP id t2so3629488wrr.1 for ; Thu, 12 Dec 2019 09:20:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:user-agent:in-reply-to:references:mime-version :content-transfer-encoding:subject:to:cc:from:message-id; bh=BovZ2YRBGWUsog0/0T4pNLhAOPgYaE9oigsFNpgrV5M=; b=MaedVofCGNJkJkHt38PoFbL2t6xRuA33bmmBK7y780yCPeX/bkpC1cG3/G2GMvPDuw AQhPOcthZ0XRTUjPR/aPdKfqJU4Uvdj7UBJgIiaTTkrYLHqNgE7a2tU7KVTC5cOKavda hzbAvo4/wDtTGRbgQg9ksJ7pCF7DjwG6TzzwRrlxu8a0y81z3BHNu4SQljnsuxO52bco tfC0wfmuGG2RdqlqUHSEQqk/32wgdLOjkOhGW9TDzMb4/D/X49Xg3XIdyS0/LcXRZGH4 jgIQrRxCadqPyaTjMnIAgGlxLcg+fn36mfsxwoXd+j5E/4mMawny1DBubwllnTy6j5ts 7GHg== Return-Path: Received: from [192.168.178.32] (x5f720648.dyn.telefonica.de. [95.114.6.72]) by smtp.gmail.com with ESMTPSA id 16sm6895754wmi.0.2019.12.12.09.20.22 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 12 Dec 2019 09:20:23 -0800 (PST) Date: Thu, 12 Dec 2019 17:20:00 -0000 User-Agent: K-9 Mail for Android In-Reply-To: References: <171FF151-2751-445A-A0EC-AE8C8B8E67B4@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: Add a compatible_vector_types_p target hook To: Richard Sandiford CC: gcc-patches@gcc.gnu.org From: Richard Biener Message-ID: <0D72EEA4-FCFB-4CBD-85B6-F43994F95780@gmail.com> X-IsSubscribed: yes X-SW-Source: 2019-12/txt/msg00904.txt.bz2 On December 12, 2019 5:44:25 PM GMT+01:00, Richard Sandiford wrote: >Richard Biener writes: >> On December 12, 2019 4:10:33 PM GMT+01:00, Richard Sandiford > wrote: >>>One problem with adding an N-bit vector extension to an existing >>>architecture is to decide how N-bit vectors should be passed to >>>functions and returned from functions. Allowing all N-bit vector >>>types to be passed in registers breaks backwards compatibility, >>>since N-bit vectors could be used (and emulated) before the vector >>>extension was added. But always passing N-bit vectors on the >>>stack would be inefficient for things like vector libm functions. >>> >>>For SVE we took the compromise position of predefining new SVE vector >>>types that are distinct from all existing vector types, including >>>GNU-style vectors. The new types are passed and returned in an >>>efficient way while existing vector types are passed and returned >>>in the traditional way. In the right circumstances, the two types >>>are inter-convertible. >>> >>>The SVE types are created using: >>> >>> vectype =3D build_distinct_type_copy (vectype); >>> SET_TYPE_STRUCTURAL_EQUALITY (vectype); >>> TYPE_ARTIFICIAL (vectype) =3D 1; >>> >>>The C frontend maintains this distinction, using VIEW_CONVERT_EXPR >>>to convert from one type to the other. However, the distinction can >>>be lost during gimple, which treats two vector types with the same >>>mode, number of elements, and element type as equivalent. And for >>>most targets that's the right thing to do. >> >> And why's that a problem? The difference appears only in the function >call ABI which is determined by the function signature rather than >types or modes of the actual arguments?=20 > >We use the type of the actual arguments when deciding how arguments >should be passed to functions: > >/* I counts args in order (to be) pushed; ARGPOS counts in order >written. */ > for (argpos =3D 0; argpos < num_actuals; i--, argpos++) > { > tree type =3D TREE_TYPE (args[i].tree_value); > [...] > /* See if this argument should be passed by invisible reference. */ > function_arg_info arg (type, argpos < n_named_args); > >And it has to be that way for calls to unprototyped functions, >or for varargs. So even for varargs the passing is different? Also we have CALL_EXPR_FNTYPE= which you could populate specially even for unprototyped or varargs functi= ons. I realize we now look at the type of values but you have to realize that di= fferences that are not relevant for values are discarded. Artificially pre= serving such non-real differences everywhere(!) while it only matters at ca= ll boundaries doesn't look correct.=20 >The AArch64 port emits an error if calls pass values of SVE type to an >unprototyped function. To do that we need to know whether the value >really is an SVE type rathr than a plain vector. > >For varags the ABI is the same for 256 bits+. But we'll have the >same problem there once we support -msve-vector-bits=3D128, since the >layout of SVE and Advanced SIMD vectors differ for big-endian. But then why don't you have different modes? Richard.=20 >Thanks, >Richard > >> >> Richard.=20 >> >>>This patch therefore adds a hook that lets the target choose >>>whether such vector types are indeed equivalent. >>> >>>Note that the new tests fail for -mabi=3Dilp32 in the same way as other >>>ACLE-based tests. I'm still planning to fix that as a follow-on. >>> >>>Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? >>> >>>Richard >>> >>> >>>2019-12-12 Richard Sandiford >>> >>>gcc/ >>> * target.def (compatible_vector_types_p): New target hook. >>> * hooks.h (hook_bool_const_tree_const_tree_true): Declare. >>> * hooks.c (hook_bool_const_tree_const_tree_true): New function. >>> * doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook. >>> * doc/tm.texi: Regenerate. >>> * gimple-expr.c: Include target.h. >>> (useless_type_conversion_p): Use targetm.compatible_vector_types_p. >>> * config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New >>> function. >>> (TARGET_COMPATIBLE_VECTOR_TYPES_P): Define. >>> * config/aarch64/aarch64-sve-builtins.cc >>>(gimple_folder::convert_pred): >>> Use the original predicate if it already has a suitable type. >>> >>>gcc/testsuite/ >>> * gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test. >>> * gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise. >>> >>>Index: gcc/target.def >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/target.def 2019-11-30 18:48:18.531984101 +0000 >>>+++ gcc/target.def 2019-12-12 15:07:43.960415368 +0000 >>>@@ -3411,6 +3411,29 @@ must have move patterns for this mode.", >>> hook_bool_mode_false) >>>=20 >>> DEFHOOK >>>+(compatible_vector_types_p, >>>+ "Return true if there is no target-specific reason for treating\n\ >>>+vector types @var{type1} and @var{type2} as distinct types. The >>>caller\n\ >>>+has already checked for target-independent reasons, meaning that >>>the\n\ >>>+types are known to have the same mode, to have the same number of >>>elements,\n\ >>>+and to have what the caller considers to be compatible element >>>types.\n\ >>>+\n\ >>>+The main reason for defining this hook is to reject pairs of >types\n\ >>>+that are handled differently by the target's calling convention.\n\ >>>+For example, when a new @var{N}-bit vector architecture is added\n\ >>>+to a target, the target may want to handle normal @var{N}-bit\n\ >>>+@code{VECTOR_TYPE} arguments and return values in the same way as\n\ >>>+before, to maintain backwards compatibility. However, it may >also\n\ >>>+provide new, architecture-specific @code{VECTOR_TYPE}s that are >>>passed\n\ >>>+and returned in a more efficient way. It is then important to >>>maintain\n\ >>>+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the >>>new\n\ >>>+architecture-specific ones.\n\ >>>+\n\ >>>+The default implementation returns true, which is correct for most >>>targets.", >>>+ bool, (const_tree type1, const_tree type2), >>>+ hook_bool_const_tree_const_tree_true) >>>+ >>>+DEFHOOK >>> (vector_alignment, >>> "This hook can be used to define the alignment for a vector of >type\n\ >>>@var{type}, in order to comply with a platform ABI. The default is >>>to\n\ >>>Index: gcc/hooks.h >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/hooks.h 2019-11-04 21:13:57.727755548 +0000 >>>+++ gcc/hooks.h 2019-12-12 15:07:43.960415368 +0000 >>>@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal >>> extern bool hook_bool_uint_mode_true (unsigned int, machine_mode); >>> extern bool hook_bool_tree_false (tree); >>> extern bool hook_bool_const_tree_false (const_tree); >>>+extern bool hook_bool_const_tree_const_tree_true (const_tree, >>>const_tree); >>> extern bool hook_bool_tree_true (tree); >>> extern bool hook_bool_const_tree_true (const_tree); >>> extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *); >>>Index: gcc/hooks.c >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/hooks.c 2019-11-04 21:13:57.727755548 +0000 >>>+++ gcc/hooks.c 2019-12-12 15:07:43.960415368 +0000 >>>@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree) >>> } >>>=20 >>> bool >>>+hook_bool_const_tree_const_tree_true (const_tree, const_tree) >>>+{ >>>+ return true; >>>+} >>>+ >>>+bool >>> hook_bool_tree_true (tree) >>> { >>> return true; >>>Index: gcc/doc/tm.texi.in >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/doc/tm.texi.in 2019-11-30 18:48:18.523984157 +0000 >>>+++ gcc/doc/tm.texi.in 2019-12-12 15:07:43.956415393 +0000 >>>@@ -3365,6 +3365,8 @@ stack. >>>=20 >>> @hook TARGET_VECTOR_MODE_SUPPORTED_P >>>=20 >>>+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P >>>+ >>> @hook TARGET_ARRAY_MODE >>>=20 >>> @hook TARGET_ARRAY_MODE_SUPPORTED_P >>>Index: gcc/doc/tm.texi >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/doc/tm.texi 2019-11-30 18:48:18.507984271 +0000 >>>+++ gcc/doc/tm.texi 2019-12-12 15:07:43.952415419 +0000 >>>@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}. >>> must have move patterns for this mode. >>> @end deftypefn >>>=20 >>>+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P >>>(const_tree @var{type1}, const_tree @var{type2}) >>>+Return true if there is no target-specific reason for treating >>>+vector types @var{type1} and @var{type2} as distinct types. The >>>caller >>>+has already checked for target-independent reasons, meaning that the >>>+types are known to have the same mode, to have the same number of >>>elements, >>>+and to have what the caller considers to be compatible element >types. >>>+ >>>+The main reason for defining this hook is to reject pairs of types >>>+that are handled differently by the target's calling convention. >>>+For example, when a new @var{N}-bit vector architecture is added >>>+to a target, the target may want to handle normal @var{N}-bit >>>+@code{VECTOR_TYPE} arguments and return values in the same way as >>>+before, to maintain backwards compatibility. However, it may also >>>+provide new, architecture-specific @code{VECTOR_TYPE}s that are >passed >>>+and returned in a more efficient way. It is then important to >>>maintain >>>+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new >>>+architecture-specific ones. >>>+ >>>+The default implementation returns true, which is correct for most >>>targets. >>>+@end deftypefn >>>+ >>>@deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE >>>(machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems}) >>> Return the mode that GCC should use for an array that has >>> @var{nelems} elements, with each element having mode @var{mode}. >>>Index: gcc/gimple-expr.c >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/gimple-expr.c 2019-10-08 09:23:31.902529513 +0100 >>>+++ gcc/gimple-expr.c 2019-12-12 15:07:43.956415393 +0000 >>>@@ -37,6 +37,7 @@ Software Foundation; either version 3, o >>> #include "tree-pass.h" >>> #include "stringpool.h" >>> #include "attribs.h" >>>+#include "target.h" >>>=20 >>> /* ----- Type related ----- */ >>>=20 >>>@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty >>>=20 >>> /* Recurse for vector types with the same number of subparts. */ >>> else if (TREE_CODE (inner_type) =3D=3D VECTOR_TYPE >>>- && TREE_CODE (outer_type) =3D=3D VECTOR_TYPE >>>- && TYPE_PRECISION (inner_type) =3D=3D TYPE_PRECISION (outer_type)) >>>- return useless_type_conversion_p (TREE_TYPE (outer_type), >>>- TREE_TYPE (inner_type)); >>>+ && TREE_CODE (outer_type) =3D=3D VECTOR_TYPE) >>>+ return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type), >>>+ TYPE_VECTOR_SUBPARTS (outer_type)) >>>+ && useless_type_conversion_p (TREE_TYPE (outer_type), >>>+ TREE_TYPE (inner_type)) >>>+ && targetm.compatible_vector_types_p (inner_type, outer_type)); >>>=20 >>> else if (TREE_CODE (inner_type) =3D=3D ARRAY_TYPE >>> && TREE_CODE (outer_type) =3D=3D ARRAY_TYPE) >>>Index: gcc/config/aarch64/aarch64.c >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/config/aarch64/aarch64.c 2019-12-10 16:45:56.338226712 +0000 >>>+++ gcc/config/aarch64/aarch64.c 2019-12-12 15:07:43.940415503 +0000 >>>@@ -2120,6 +2120,20 @@ aarch64_fntype_abi (const_tree fntype) >>> return default_function_abi; >>> } >>>=20 >>>+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P. */ >>>+ >>>+static bool >>>+aarch64_compatible_vector_types_p (const_tree type1, const_tree >type2) >>>+{ >>>+ unsigned int num_zr1 =3D 0, num_pr1 =3D 0, num_zr2 =3D 0, num_pr2 =3D= 0; >>>+ if (aarch64_sve_argument_p (type1, &num_zr1, &num_pr1) >>>+ !=3D aarch64_sve_argument_p (type2, &num_zr2, &num_pr2)) >>>+ return false; >>>+ >>>+ gcc_assert (num_zr1 =3D=3D num_zr2 && num_pr1 =3D=3D num_pr2); >>>+ return true; >>>+} >>>+ >>> /* Return true if we should emit CFI for register REGNO. */ >>>=20 >>> static bool >>>@@ -22031,6 +22045,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P >>> #undef TARGET_VECTOR_MODE_SUPPORTED_P >>> #define TARGET_VECTOR_MODE_SUPPORTED_P >aarch64_vector_mode_supported_p >>>=20 >>>+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P >>>+#define TARGET_COMPATIBLE_VECTOR_TYPES_P >>>aarch64_compatible_vector_types_p >>>+ >>> #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT >>> #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \ >>> aarch64_builtin_support_vector_misalignment >>>Index: gcc/config/aarch64/aarch64-sve-builtins.cc >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- gcc/config/aarch64/aarch64-sve-builtins.cc 2019-12-06 >>>18:22:12.072859530 +0000 >>>+++ gcc/config/aarch64/aarch64-sve-builtins.cc 2019-12-12 >>>15:07:43.936415528 +0000 >>>@@ -2251,9 +2251,13 @@ tree >>> gimple_folder::convert_pred (gimple_seq &stmts, tree vectype, >>> unsigned int argno) >>> { >>>- tree predtype =3D truth_type_for (vectype); >>> tree pred =3D gimple_call_arg (call, argno); >>>- return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred); >>>+ if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)), >>>+ TYPE_VECTOR_SUBPARTS (vectype))) >>>+ return pred; >>>+ >>>+ return gimple_build (&stmts, VIEW_CONVERT_EXPR, >>>+ truth_type_for (vectype), pred); >>> } >>>=20 >>> /* Return a pointer to the address in a contiguous load or store, >>>Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- /dev/null 2019-09-17 11:41:18.176664108 +0100 >>>+++ >gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c 2019-12-12 >>>15:07:43.972415287 +0000 >>>@@ -0,0 +1,99 @@ >>>+/* { dg-options "-O -msve-vector-bits=3D256 -fomit-frame-pointer" } */ >>>+ >>>+#include >>>+ >>>+typedef float16_t float16x16_t __attribute__((vector_size (32))); >>>+typedef float32_t float32x8_t __attribute__((vector_size (32))); >>>+typedef float64_t float64x4_t __attribute__((vector_size (32))); >>>+typedef int8_t int8x32_t __attribute__((vector_size (32))); >>>+typedef int16_t int16x16_t __attribute__((vector_size (32))); >>>+typedef int32_t int32x8_t __attribute__((vector_size (32))); >>>+typedef int64_t int64x4_t __attribute__((vector_size (32))); >>>+typedef uint8_t uint8x32_t __attribute__((vector_size (32))); >>>+typedef uint16_t uint16x16_t __attribute__((vector_size (32))); >>>+typedef uint32_t uint32x8_t __attribute__((vector_size (32))); >>>+typedef uint64_t uint64x4_t __attribute__((vector_size (32))); >>>+ >>>+void float16_callee (float16x16_t); >>>+void float32_callee (float32x8_t); >>>+void float64_callee (float64x4_t); >>>+void int8_callee (int8x32_t); >>>+void int16_callee (int16x16_t); >>>+void int32_callee (int32x8_t); >>>+void int64_callee (int64x4_t); >>>+void uint8_callee (uint8x32_t); >>>+void uint16_callee (uint16x16_t); >>>+void uint32_callee (uint32x8_t); >>>+void uint64_callee (uint64x4_t); >>>+ >>>+void >>>+float16_caller (void) >>>+{ >>>+ float16_callee (svdup_f16 (1.0)); >>>+} >>>+ >>>+void >>>+float32_caller (void) >>>+{ >>>+ float32_callee (svdup_f32 (2.0)); >>>+} >>>+ >>>+void >>>+float64_caller (void) >>>+{ >>>+ float64_callee (svdup_f64 (3.0)); >>>+} >>>+ >>>+void >>>+int8_caller (void) >>>+{ >>>+ int8_callee (svindex_s8 (0, 1)); >>>+} >>>+ >>>+void >>>+int16_caller (void) >>>+{ >>>+ int16_callee (svindex_s16 (0, 2)); >>>+} >>>+ >>>+void >>>+int32_caller (void) >>>+{ >>>+ int32_callee (svindex_s32 (0, 3)); >>>+} >>>+ >>>+void >>>+int64_caller (void) >>>+{ >>>+ int64_callee (svindex_s64 (0, 4)); >>>+} >>>+ >>>+void >>>+uint8_caller (void) >>>+{ >>>+ uint8_callee (svindex_u8 (1, 1)); >>>+} >>>+ >>>+void >>>+uint16_caller (void) >>>+{ >>>+ uint16_callee (svindex_u16 (1, 2)); >>>+} >>>+ >>>+void >>>+uint32_caller (void) >>>+{ >>>+ uint32_callee (svindex_u32 (1, 3)); >>>+} >>>+ >>>+void >>>+uint64_caller (void) >>>+{ >>>+ uint64_callee (svindex_u64 (1, 4)); >>>+} >>>+ >>>+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], >>>\[x0\]} 2 } } */ >>>+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], >>>\[x0\]} 3 } } */ >>>+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], >>>\[x0\]} 3 } } */ >>>+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], >>>\[x0\]} 3 } } */ >>>+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } >*/ >>>Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c >>>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>--- /dev/null 2019-09-17 11:41:18.176664108 +0100 >>>+++ >gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c 2019-12-12 >>>15:07:43.972415287 +0000 >>>@@ -0,0 +1,99 @@ >>>+/* { dg-options "-O -msve-vector-bits=3D256 -fomit-frame-pointer" } */ >>>+ >>>+#include >>>+ >>>+typedef float16_t float16x16_t __attribute__((vector_size (32))); >>>+typedef float32_t float32x8_t __attribute__((vector_size (32))); >>>+typedef float64_t float64x4_t __attribute__((vector_size (32))); >>>+typedef int8_t int8x32_t __attribute__((vector_size (32))); >>>+typedef int16_t int16x16_t __attribute__((vector_size (32))); >>>+typedef int32_t int32x8_t __attribute__((vector_size (32))); >>>+typedef int64_t int64x4_t __attribute__((vector_size (32))); >>>+typedef uint8_t uint8x32_t __attribute__((vector_size (32))); >>>+typedef uint16_t uint16x16_t __attribute__((vector_size (32))); >>>+typedef uint32_t uint32x8_t __attribute__((vector_size (32))); >>>+typedef uint64_t uint64x4_t __attribute__((vector_size (32))); >>>+ >>>+void float16_callee (svfloat16_t); >>>+void float32_callee (svfloat32_t); >>>+void float64_callee (svfloat64_t); >>>+void int8_callee (svint8_t); >>>+void int16_callee (svint16_t); >>>+void int32_callee (svint32_t); >>>+void int64_callee (svint64_t); >>>+void uint8_callee (svuint8_t); >>>+void uint16_callee (svuint16_t); >>>+void uint32_callee (svuint32_t); >>>+void uint64_callee (svuint64_t); >>>+ >>>+void >>>+float16_caller (float16x16_t arg) >>>+{ >>>+ float16_callee (arg); >>>+} >>>+ >>>+void >>>+float32_caller (float32x8_t arg) >>>+{ >>>+ float32_callee (arg); >>>+} >>>+ >>>+void >>>+float64_caller (float64x4_t arg) >>>+{ >>>+ float64_callee (arg); >>>+} >>>+ >>>+void >>>+int8_caller (int8x32_t arg) >>>+{ >>>+ int8_callee (arg); >>>+} >>>+ >>>+void >>>+int16_caller (int16x16_t arg) >>>+{ >>>+ int16_callee (arg); >>>+} >>>+ >>>+void >>>+int32_caller (int32x8_t arg) >>>+{ >>>+ int32_callee (arg); >>>+} >>>+ >>>+void >>>+int64_caller (int64x4_t arg) >>>+{ >>>+ int64_callee (arg); >>>+} >>>+ >>>+void >>>+uint8_caller (uint8x32_t arg) >>>+{ >>>+ uint8_callee (arg); >>>+} >>>+ >>>+void >>>+uint16_caller (uint16x16_t arg) >>>+{ >>>+ uint16_callee (arg); >>>+} >>>+ >>>+void >>>+uint32_caller (uint32x8_t arg) >>>+{ >>>+ uint32_callee (arg); >>>+} >>>+ >>>+void >>>+uint64_caller (uint64x4_t arg) >>>+{ >>>+ uint64_callee (arg); >>>+} >>>+ >>>+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, >\[x0\]} >>>2 } } */ >>>+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, >\[x0\]} >>>3 } } */ >>>+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, >\[x0\]} >>>3 } } */ >>>+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, >\[x0\]} >>>3 } } */ >>>+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */