public inbox for gcc-patches@gcc.gnu.org
* Add a compatible_vector_types_p target hook
@ 2019-12-12 15:10 Richard Sandiford
  2019-12-12 16:04 ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2019-12-12 15:10 UTC (permalink / raw)
  To: gcc-patches

One problem with adding an N-bit vector extension to an existing
architecture is to decide how N-bit vectors should be passed to
functions and returned from functions.  Allowing all N-bit vector
types to be passed in registers breaks backwards compatibility,
since N-bit vectors could be used (and emulated) before the vector
extension was added.  But always passing N-bit vectors on the
stack would be inefficient for things like vector libm functions.

For SVE we took the compromise position of predefining new SVE vector
types that are distinct from all existing vector types, including
GNU-style vectors.  The new types are passed and returned in an
efficient way while existing vector types are passed and returned
in the traditional way.  In the right circumstances, the two types
are inter-convertible.
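
For instance, gnu_vectors_2.c below relies on that convertibility:
with -msve-vector-bits=256, a 32-byte GNU vector can be passed
directly to a function that takes the corresponding SVE type.
A minimal sketch (the function names are just for illustration):

  #include <arm_sve.h>

  typedef int32_t int32x8_t __attribute__((vector_size (32)));

  void sve_callee (svint32_t);

  void
  gnu_caller (int32x8_t arg)
  {
    /* Distinct types, but inter-convertible here because the
       vector sizes match.  */
    sve_callee (arg);
  }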

The SVE types are created using:

      vectype = build_distinct_type_copy (vectype);
      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
      TYPE_ARTIFICIAL (vectype) = 1;

The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
to convert from one type to the other.  However, the distinction can
be lost during gimple, which treats two vector types with the same
mode, number of elements, and element type as equivalent.  And for
most targets that's the right thing to do.

This patch therefore adds a hook that lets the target choose
whether such vector types are indeed equivalent.

Note that the new tests fail for -mabi=ilp32 in the same way as other
ACLE-based tests.  I'm still planning to fix that as a follow-on.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2019-12-12  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (compatible_vector_types_p): New target hook.
	* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
	* hooks.c (hook_bool_const_tree_const_tree_true): New function.
	* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
	* doc/tm.texi: Regenerate.
	* gimple-expr.c: Include target.h.
	(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
	* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
	function.
	(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
	* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
	Use the original predicate if it already has a suitable type.

gcc/testsuite/
	* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
	* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2019-11-30 18:48:18.531984101 +0000
+++ gcc/target.def	2019-12-12 15:07:43.960415368 +0000
@@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
  hook_bool_mode_false)
 
 DEFHOOK
+(compatible_vector_types_p,
+ "Return true if there is no target-specific reason for treating\n\
+vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
+has already checked for target-independent reasons, meaning that the\n\
+types are known to have the same mode, to have the same number of elements,\n\
+and to have what the caller considers to be compatible element types.\n\
+\n\
+The main reason for defining this hook is to reject pairs of types\n\
+that are handled differently by the target's calling convention.\n\
+For example, when a new @var{N}-bit vector architecture is added\n\
+to a target, the target may want to handle normal @var{N}-bit\n\
+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
+before, to maintain backwards compatibility.  However, it may also\n\
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
+and returned in a more efficient way.  It is then important to maintain\n\
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
+architecture-specific ones.\n\
+\n\
+The default implementation returns true, which is correct for most targets.",
+ bool, (const_tree type1, const_tree type2),
+ hook_bool_const_tree_const_tree_true)
+
+DEFHOOK
 (vector_alignment,
  "This hook can be used to define the alignment for a vector of type\n\
 @var{type}, in order to comply with a platform ABI.  The default is to\n\
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h	2019-11-04 21:13:57.727755548 +0000
+++ gcc/hooks.h	2019-12-12 15:07:43.960415368 +0000
@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
 extern bool hook_bool_const_tree_false (const_tree);
+extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
 extern bool hook_bool_tree_true (tree);
 extern bool hook_bool_const_tree_true (const_tree);
 extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c	2019-11-04 21:13:57.727755548 +0000
+++ gcc/hooks.c	2019-12-12 15:07:43.960415368 +0000
@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
 }
 
 bool
+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
+{
+  return true;
+}
+
+bool
 hook_bool_tree_true (tree)
 {
   return true;
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2019-11-30 18:48:18.523984157 +0000
+++ gcc/doc/tm.texi.in	2019-12-12 15:07:43.956415393 +0000
@@ -3365,6 +3365,8 @@ stack.
 
 @hook TARGET_VECTOR_MODE_SUPPORTED_P
 
+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
+
 @hook TARGET_ARRAY_MODE
 
 @hook TARGET_ARRAY_MODE_SUPPORTED_P
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2019-11-30 18:48:18.507984271 +0000
+++ gcc/doc/tm.texi	2019-12-12 15:07:43.952415419 +0000
@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
 must have move patterns for this mode.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
+Return true if there is no target-specific reason for treating
+vector types @var{type1} and @var{type2} as distinct types.  The caller
+has already checked for target-independent reasons, meaning that the
+types are known to have the same mode, to have the same number of elements,
+and to have what the caller considers to be compatible element types.
+
+The main reason for defining this hook is to reject pairs of types
+that are handled differently by the target's calling convention.
+For example, when a new @var{N}-bit vector architecture is added
+to a target, the target may want to handle normal @var{N}-bit
+@code{VECTOR_TYPE} arguments and return values in the same way as
+before, to maintain backwards compatibility.  However, it may also
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
+and returned in a more efficient way.  It is then important to maintain
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
+architecture-specific ones.
+
+The default implementation returns true, which is correct for most targets.
+@end deftypefn
+
 @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
 Return the mode that GCC should use for an array that has
 @var{nelems} elements, with each element having mode @var{mode}.
Index: gcc/gimple-expr.c
===================================================================
--- gcc/gimple-expr.c	2019-10-08 09:23:31.902529513 +0100
+++ gcc/gimple-expr.c	2019-12-12 15:07:43.956415393 +0000
@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "target.h"
 
 /* ----- Type related -----  */
 
@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
 
   /* Recurse for vector types with the same number of subparts.  */
   else if (TREE_CODE (inner_type) == VECTOR_TYPE
-	   && TREE_CODE (outer_type) == VECTOR_TYPE
-	   && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
-    return useless_type_conversion_p (TREE_TYPE (outer_type),
-				      TREE_TYPE (inner_type));
+	   && TREE_CODE (outer_type) == VECTOR_TYPE)
+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
+		      TYPE_VECTOR_SUBPARTS (outer_type))
+	    && useless_type_conversion_p (TREE_TYPE (outer_type),
+					  TREE_TYPE (inner_type))
+	    && targetm.compatible_vector_types_p (inner_type, outer_type));
 
   else if (TREE_CODE (inner_type) == ARRAY_TYPE
 	   && TREE_CODE (outer_type) == ARRAY_TYPE)
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-12-10 16:45:56.338226712 +0000
+++ gcc/config/aarch64/aarch64.c	2019-12-12 15:07:43.940415503 +0000
@@ -2120,6 +2120,20 @@ aarch64_fntype_abi (const_tree fntype)
   return default_function_abi;
 }
 
+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
+
+static bool
+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
+{
+  unsigned int num_zr1 = 0, num_pr1 = 0, num_zr2 = 0, num_pr2 = 0;
+  if (aarch64_sve_argument_p (type1, &num_zr1, &num_pr1)
+      != aarch64_sve_argument_p (type2, &num_zr2, &num_pr2))
+    return false;
+
+  gcc_assert (num_zr1 == num_zr2 && num_pr1 == num_pr2);
+  return true;
+}
+
 /* Return true if we should emit CFI for register REGNO.  */
 
 static bool
@@ -22031,6 +22045,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
 
+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
+#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
+
 #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   aarch64_builtin_support_vector_misalignment
Index: gcc/config/aarch64/aarch64-sve-builtins.cc
===================================================================
--- gcc/config/aarch64/aarch64-sve-builtins.cc	2019-12-06 18:22:12.072859530 +0000
+++ gcc/config/aarch64/aarch64-sve-builtins.cc	2019-12-12 15:07:43.936415528 +0000
@@ -2251,9 +2251,13 @@ tree
 gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
 			     unsigned int argno)
 {
-  tree predtype = truth_type_for (vectype);
   tree pred = gimple_call_arg (call, argno);
-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
+		TYPE_VECTOR_SUBPARTS (vectype)))
+    return pred;
+
+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
+		       truth_type_for (vectype), pred);
 }
 
 /* Return a pointer to the address in a contiguous load or store,
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c	2019-12-12 15:07:43.972415287 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (float16x16_t);
+void float32_callee (float32x8_t);
+void float64_callee (float64x4_t);
+void int8_callee (int8x32_t);
+void int16_callee (int16x16_t);
+void int32_callee (int32x8_t);
+void int64_callee (int64x4_t);
+void uint8_callee (uint8x32_t);
+void uint16_callee (uint16x16_t);
+void uint32_callee (uint32x8_t);
+void uint64_callee (uint64x4_t);
+
+void
+float16_caller (void)
+{
+  float16_callee (svdup_f16 (1.0));
+}
+
+void
+float32_caller (void)
+{
+  float32_callee (svdup_f32 (2.0));
+}
+
+void
+float64_caller (void)
+{
+  float64_callee (svdup_f64 (3.0));
+}
+
+void
+int8_caller (void)
+{
+  int8_callee (svindex_s8 (0, 1));
+}
+
+void
+int16_caller (void)
+{
+  int16_callee (svindex_s16 (0, 2));
+}
+
+void
+int32_caller (void)
+{
+  int32_callee (svindex_s32 (0, 3));
+}
+
+void
+int64_caller (void)
+{
+  int64_callee (svindex_s64 (0, 4));
+}
+
+void
+uint8_caller (void)
+{
+  uint8_callee (svindex_u8 (1, 1));
+}
+
+void
+uint16_caller (void)
+{
+  uint16_callee (svindex_u16 (1, 2));
+}
+
+void
+uint32_caller (void)
+{
+  uint32_callee (svindex_u32 (1, 3));
+}
+
+void
+uint64_caller (void)
+{
+  uint64_callee (svindex_u64 (1, 4));
+}
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c	2019-12-12 15:07:43.972415287 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (svfloat16_t);
+void float32_callee (svfloat32_t);
+void float64_callee (svfloat64_t);
+void int8_callee (svint8_t);
+void int16_callee (svint16_t);
+void int32_callee (svint32_t);
+void int64_callee (svint64_t);
+void uint8_callee (svuint8_t);
+void uint16_callee (svuint16_t);
+void uint32_callee (svuint32_t);
+void uint64_callee (svuint64_t);
+
+void
+float16_caller (float16x16_t arg)
+{
+  float16_callee (arg);
+}
+
+void
+float32_caller (float32x8_t arg)
+{
+  float32_callee (arg);
+}
+
+void
+float64_caller (float64x4_t arg)
+{
+  float64_callee (arg);
+}
+
+void
+int8_caller (int8x32_t arg)
+{
+  int8_callee (arg);
+}
+
+void
+int16_caller (int16x16_t arg)
+{
+  int16_callee (arg);
+}
+
+void
+int32_caller (int32x8_t arg)
+{
+  int32_callee (arg);
+}
+
+void
+int64_caller (int64x4_t arg)
+{
+  int64_callee (arg);
+}
+
+void
+uint8_caller (uint8x32_t arg)
+{
+  uint8_callee (arg);
+}
+
+void
+uint16_caller (uint16x16_t arg)
+{
+  uint16_callee (arg);
+}
+
+void
+uint32_caller (uint32x8_t arg)
+{
+  uint32_callee (arg);
+}
+
+void
+uint64_caller (uint64x4_t arg)
+{
+  uint64_callee (arg);
+}
+
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */


* Re: Add a compatible_vector_types_p target hook
  2019-12-12 15:10 Add a compatible_vector_types_p target hook Richard Sandiford
@ 2019-12-12 16:04 ` Richard Biener
  2019-12-12 16:44   ` Richard Sandiford
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2019-12-12 16:04 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford

On December 12, 2019 4:10:33 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>One problem with adding an N-bit vector extension to an existing
>architecture is to decide how N-bit vectors should be passed to
>functions and returned from functions.  Allowing all N-bit vector
>types to be passed in registers breaks backwards compatibility,
>since N-bit vectors could be used (and emulated) before the vector
>extension was added.  But always passing N-bit vectors on the
>stack would be inefficient for things like vector libm functions.
>
>For SVE we took the compromise position of predefining new SVE vector
>types that are distinct from all existing vector types, including
>GNU-style vectors.  The new types are passed and returned in an
>efficient way while existing vector types are passed and returned
>in the traditional way.  In the right circumstances, the two types
>are inter-convertible.
>
>The SVE types are created using:
>
>      vectype = build_distinct_type_copy (vectype);
>      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
>      TYPE_ARTIFICIAL (vectype) = 1;
>
>The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
>to convert from one type to the other.  However, the distinction can
>be lost during gimple, which treats two vector types with the same
>mode, number of elements, and element type as equivalent.  And for
>most targets that's the right thing to do.

And why's that a problem? The difference appears only in the function call ABI, which is determined by the function signature rather than by the types or modes of the actual arguments?

Richard. 



* Re: Add a compatible_vector_types_p target hook
  2019-12-12 16:04 ` Richard Biener
@ 2019-12-12 16:44   ` Richard Sandiford
  2019-12-12 17:20     ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2019-12-12 16:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
> On December 12, 2019 4:10:33 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>One problem with adding an N-bit vector extension to an existing
>>architecture is to decide how N-bit vectors should be passed to
>>functions and returned from functions.  Allowing all N-bit vector
>>types to be passed in registers breaks backwards compatibility,
>>since N-bit vectors could be used (and emulated) before the vector
>>extension was added.  But always passing N-bit vectors on the
>>stack would be inefficient for things like vector libm functions.
>>
>>For SVE we took the compromise position of predefining new SVE vector
>>types that are distinct from all existing vector types, including
>>GNU-style vectors.  The new types are passed and returned in an
>>efficient way while existing vector types are passed and returned
>>in the traditional way.  In the right circumstances, the two types
>>are inter-convertible.
>>
>>The SVE types are created using:
>>
>>      vectype = build_distinct_type_copy (vectype);
>>      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
>>      TYPE_ARTIFICIAL (vectype) = 1;
>>
>>The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
>>to convert from one type to the other.  However, the distinction can
>>be lost during gimple, which treats two vector types with the same
>>mode, number of elements, and element type as equivalent.  And for
>>most targets that's the right thing to do.
>
> And why's that a problem? The difference appears only in the function call ABI, which is determined by the function signature rather than by the types or modes of the actual arguments?

We use the type of the actual arguments when deciding how arguments
should be passed to functions:

  /* I counts args in order (to be) pushed; ARGPOS counts in order written.  */
  for (argpos = 0; argpos < num_actuals; i--, argpos++)
    {
      tree type = TREE_TYPE (args[i].tree_value);
      [...]
      /* See if this argument should be passed by invisible reference.  */
      function_arg_info arg (type, argpos < n_named_args);

And it has to be that way for calls to unprototyped functions,
or for varargs.

The AArch64 port emits an error if calls pass values of SVE type to an
unprototyped function.  To do that we need to know whether the value
really is an SVE type rather than a plain vector.
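
A minimal sketch of that unprototyped case (the function names here
are hypothetical, and the exact diagnostic text isn't shown):

  #include <arm_sve.h>

  void unprototyped_callee ();  /* Declared without a prototype.  */

  void
  caller (svint32_t x)
  {
    /* aarch64 reports an error for this call: an SVE-type value is
       being passed, but no prototype says how to pass it.  */
    unprototyped_callee (x);
  }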

For varargs the ABI is the same for 256 bits+.  But we'll have the
same problem there once we support -msve-vector-bits=128, since the
layout of SVE and Advanced SIMD vectors differs for big-endian.

Thanks,
Richard



* Re: Add a compatible_vector_types_p target hook
  2019-12-12 16:44   ` Richard Sandiford
@ 2019-12-12 17:20     ` Richard Biener
  2019-12-12 18:16       ` Richard Sandiford
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2019-12-12 17:20 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

On December 12, 2019 5:44:25 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>Richard Biener <richard.guenther@gmail.com> writes:
>> On December 12, 2019 4:10:33 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>>One problem with adding an N-bit vector extension to an existing
>>>architecture is to decide how N-bit vectors should be passed to
>>>functions and returned from functions.  Allowing all N-bit vector
>>>types to be passed in registers breaks backwards compatibility,
>>>since N-bit vectors could be used (and emulated) before the vector
>>>extension was added.  But always passing N-bit vectors on the
>>>stack would be inefficient for things like vector libm functions.
>>>
>>>For SVE we took the compromise position of predefining new SVE vector
>>>types that are distinct from all existing vector types, including
>>>GNU-style vectors.  The new types are passed and returned in an
>>>efficient way while existing vector types are passed and returned
>>>in the traditional way.  In the right circumstances, the two types
>>>are inter-convertible.
>>>
>>>The SVE types are created using:
>>>
>>>      vectype = build_distinct_type_copy (vectype);
>>>      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
>>>      TYPE_ARTIFICIAL (vectype) = 1;
>>>
>>>The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
>>>to convert from one type to the other.  However, the distinction can
>>>be lost during gimple, which treats two vector types with the same
>>>mode, number of elements, and element type as equivalent.  And for
>>>most targets that's the right thing to do.
>>
>> And why's that a problem? The difference appears only in the function call ABI, which is determined by the function signature rather than by the types or modes of the actual arguments?
>
>We use the type of the actual arguments when deciding how arguments
>should be passed to functions:
>
>  /* I counts args in order (to be) pushed; ARGPOS counts in order written.  */
>  for (argpos = 0; argpos < num_actuals; i--, argpos++)
>    {
>      tree type = TREE_TYPE (args[i].tree_value);
>      [...]
>      /* See if this argument should be passed by invisible reference.  */
>      function_arg_info arg (type, argpos < n_named_args);
>
>And it has to be that way for calls to unprototyped functions,
>or for varargs.

So even for varargs the passing is different? Also we have CALL_EXPR_FNTYPE, which you could populate specially even for unprototyped or varargs functions.
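
(A rough sketch of that idea, not what the patch does: at the gimple
level the stored signature is gimple_call_fntype, so the passing
decision could be keyed off something like

  /* Use the call's static signature rather than the types of the
     argument values.  */
  tree fntype = gimple_call_fntype (call_stmt);
  tree parm_types = fntype ? TYPE_ARG_TYPES (fntype) : NULL_TREE;

where call_stmt stands for the call statement being expanded.)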

I realize we now look at the types of values, but differences that are not relevant for values are discarded.  Artificially preserving such non-real differences everywhere(!) when they only matter at call boundaries doesn't look correct.

>The AArch64 port emits an error if calls pass values of SVE type to an
>unprototyped function.  To do that we need to know whether the value
>really is an SVE type rather than a plain vector.
>
>For varargs the ABI is the same for 256 bits+.  But we'll have the
>same problem there once we support -msve-vector-bits=128, since the
>layout of SVE and Advanced SIMD vectors differs for big-endian.

But then why don't you have different modes?

Richard. 

>Thanks,
>Richard
>
>>
>> Richard. 
>>
>>>This patch therefore adds a hook that lets the target choose
>>>whether such vector types are indeed equivalent.
>>>
>>>Note that the new tests fail for -mabi=ilp32 in the same way as other
>>>ACLE-based tests.  I'm still planning to fix that as a follow-on.
>>>
>>>Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>>
>>>Richard
>>>
>>>
>>>2019-12-12  Richard Sandiford  <richard.sandiford@arm.com>
>>>
>>>gcc/
>>>	* target.def (compatible_vector_types_p): New target hook.
>>>	* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
>>>	* hooks.c (hook_bool_const_tree_const_tree_true): New function.
>>>	* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
>>>	* doc/tm.texi: Regenerate.
>>>	* gimple-expr.c: Include target.h.
>>>	(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
>>>	* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
>>>	function.
>>>	(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
>>>	* config/aarch64/aarch64-sve-builtins.cc
>>>	(gimple_folder::convert_pred):
>>>	Use the original predicate if it already has a suitable type.
>>>
>>>gcc/testsuite/
>>>	* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
>>>	* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.
>>>
>>>Index: gcc/target.def
>>>===================================================================
>>>--- gcc/target.def	2019-11-30 18:48:18.531984101 +0000
>>>+++ gcc/target.def	2019-12-12 15:07:43.960415368 +0000
>>>@@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
>>>  hook_bool_mode_false)
>>> 
>>> DEFHOOK
>>>+(compatible_vector_types_p,
>>>+ "Return true if there is no target-specific reason for treating\n\
>>>+vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
>>>+has already checked for target-independent reasons, meaning that the\n\
>>>+types are known to have the same mode, to have the same number of elements,\n\
>>>+and to have what the caller considers to be compatible element types.\n\
>>>+\n\
>>>+The main reason for defining this hook is to reject pairs of types\n\
>>>+that are handled differently by the target's calling convention.\n\
>>>+For example, when a new @var{N}-bit vector architecture is added\n\
>>>+to a target, the target may want to handle normal @var{N}-bit\n\
>>>+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
>>>+before, to maintain backwards compatibility.  However, it may also\n\
>>>+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
>>>+and returned in a more efficient way.  It is then important to maintain\n\
>>>+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
>>>+architecture-specific ones.\n\
>>>+\n\
>>>+The default implementation returns true, which is correct for most targets.",
>>>+ bool, (const_tree type1, const_tree type2),
>>>+ hook_bool_const_tree_const_tree_true)
>>>+
>>>+DEFHOOK
>>> (vector_alignment,
>>> "This hook can be used to define the alignment for a vector of type\n\
>>>@var{type}, in order to comply with a platform ABI.  The default is to\n\
>>>Index: gcc/hooks.h
>>>===================================================================
>>>--- gcc/hooks.h	2019-11-04 21:13:57.727755548 +0000
>>>+++ gcc/hooks.h	2019-12-12 15:07:43.960415368 +0000
>>>@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
>>> extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
>>> extern bool hook_bool_tree_false (tree);
>>> extern bool hook_bool_const_tree_false (const_tree);
>>>+extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
>>> extern bool hook_bool_tree_true (tree);
>>> extern bool hook_bool_const_tree_true (const_tree);
>>> extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
>>>Index: gcc/hooks.c
>>>===================================================================
>>>--- gcc/hooks.c	2019-11-04 21:13:57.727755548 +0000
>>>+++ gcc/hooks.c	2019-12-12 15:07:43.960415368 +0000
>>>@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
>>> }
>>> 
>>> bool
>>>+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
>>>+{
>>>+  return true;
>>>+}
>>>+
>>>+bool
>>> hook_bool_tree_true (tree)
>>> {
>>>   return true;
>>>Index: gcc/doc/tm.texi.in
>>>===================================================================
>>>--- gcc/doc/tm.texi.in	2019-11-30 18:48:18.523984157 +0000
>>>+++ gcc/doc/tm.texi.in	2019-12-12 15:07:43.956415393 +0000
>>>@@ -3365,6 +3365,8 @@ stack.
>>> 
>>> @hook TARGET_VECTOR_MODE_SUPPORTED_P
>>> 
>>>+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
>>>+
>>> @hook TARGET_ARRAY_MODE
>>> 
>>> @hook TARGET_ARRAY_MODE_SUPPORTED_P
>>>Index: gcc/doc/tm.texi
>>>===================================================================
>>>--- gcc/doc/tm.texi	2019-11-30 18:48:18.507984271 +0000
>>>+++ gcc/doc/tm.texi	2019-12-12 15:07:43.952415419 +0000
>>>@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
>>> must have move patterns for this mode.
>>> @end deftypefn
>>> 
>>>+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
>>>+Return true if there is no target-specific reason for treating
>>>+vector types @var{type1} and @var{type2} as distinct types.  The caller
>>>+has already checked for target-independent reasons, meaning that the
>>>+types are known to have the same mode, to have the same number of elements,
>>>+and to have what the caller considers to be compatible element types.
>>>+
>>>+The main reason for defining this hook is to reject pairs of types
>>>+that are handled differently by the target's calling convention.
>>>+For example, when a new @var{N}-bit vector architecture is added
>>>+to a target, the target may want to handle normal @var{N}-bit
>>>+@code{VECTOR_TYPE} arguments and return values in the same way as
>>>+before, to maintain backwards compatibility.  However, it may also
>>>+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
>>>+and returned in a more efficient way.  It is then important to maintain
>>>+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
>>>+architecture-specific ones.
>>>+
>>>+The default implementation returns true, which is correct for most targets.
>>>+@end deftypefn
>>>+
>>> @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
>>> Return the mode that GCC should use for an array that has
>>> @var{nelems} elements, with each element having mode @var{mode}.
>>>Index: gcc/gimple-expr.c
>>>===================================================================
>>>--- gcc/gimple-expr.c	2019-10-08 09:23:31.902529513 +0100
>>>+++ gcc/gimple-expr.c	2019-12-12 15:07:43.956415393 +0000
>>>@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
>>> #include "tree-pass.h"
>>> #include "stringpool.h"
>>> #include "attribs.h"
>>>+#include "target.h"
>>> 
>>> /* ----- Type related -----  */
>>> 
>>>@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
>>> 
>>>   /* Recurse for vector types with the same number of subparts.  */
>>>   else if (TREE_CODE (inner_type) == VECTOR_TYPE
>>>-	   && TREE_CODE (outer_type) == VECTOR_TYPE
>>>-	   && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
>>>-    return useless_type_conversion_p (TREE_TYPE (outer_type),
>>>-				      TREE_TYPE (inner_type));
>>>+	   && TREE_CODE (outer_type) == VECTOR_TYPE)
>>>+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
>>>+		      TYPE_VECTOR_SUBPARTS (outer_type))
>>>+	    && useless_type_conversion_p (TREE_TYPE (outer_type),
>>>+					  TREE_TYPE (inner_type))
>>>+	    && targetm.compatible_vector_types_p (inner_type, outer_type));
>>> 
>>>   else if (TREE_CODE (inner_type) == ARRAY_TYPE
>>> 	   && TREE_CODE (outer_type) == ARRAY_TYPE)
>>>Index: gcc/config/aarch64/aarch64.c
>>>===================================================================
>>>--- gcc/config/aarch64/aarch64.c	2019-12-10 16:45:56.338226712 +0000
>>>+++ gcc/config/aarch64/aarch64.c	2019-12-12 15:07:43.940415503 +0000
>>>@@ -2120,6 +2120,20 @@ aarch64_fntype_abi (const_tree fntype)
>>>   return default_function_abi;
>>> }
>>> 
>>>+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
>>>+
>>>+static bool
>>>+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
>>>+{
>>>+  unsigned int num_zr1 = 0, num_pr1 = 0, num_zr2 = 0, num_pr2 = 0;
>>>+  if (aarch64_sve_argument_p (type1, &num_zr1, &num_pr1)
>>>+      != aarch64_sve_argument_p (type2, &num_zr2, &num_pr2))
>>>+    return false;
>>>+
>>>+  gcc_assert (num_zr1 == num_zr2 && num_pr1 == num_pr2);
>>>+  return true;
>>>+}
>>>+
>>> /* Return true if we should emit CFI for register REGNO.  */
>>> 
>>> static bool
>>>@@ -22031,6 +22045,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
>>> #undef TARGET_VECTOR_MODE_SUPPORTED_P
>>> #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
>>> 
>>>+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
>>>+#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
>>>+
>>> #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
>>> #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
>>>   aarch64_builtin_support_vector_misalignment
>>>Index: gcc/config/aarch64/aarch64-sve-builtins.cc
>>>===================================================================
>>>--- gcc/config/aarch64/aarch64-sve-builtins.cc	2019-12-06 18:22:12.072859530 +0000
>>>+++ gcc/config/aarch64/aarch64-sve-builtins.cc	2019-12-12 15:07:43.936415528 +0000
>>>@@ -2251,9 +2251,13 @@ tree
>>> gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
>>> 			     unsigned int argno)
>>> {
>>>-  tree predtype = truth_type_for (vectype);
>>>   tree pred = gimple_call_arg (call, argno);
>>>-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
>>>+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
>>>+		TYPE_VECTOR_SUBPARTS (vectype)))
>>>+    return pred;
>>>+
>>>+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
>>>+		       truth_type_for (vectype), pred);
>>> }
>>> 
>>> /* Return a pointer to the address in a contiguous load or store,
>>>Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
>>>===================================================================
>>>--- /dev/null	2019-09-17 11:41:18.176664108 +0100
>>>+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c	2019-12-12 15:07:43.972415287 +0000
>>>@@ -0,0 +1,99 @@
>>>+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
>>>+
>>>+#include <arm_sve.h>
>>>+
>>>+typedef float16_t float16x16_t __attribute__((vector_size (32)));
>>>+typedef float32_t float32x8_t __attribute__((vector_size (32)));
>>>+typedef float64_t float64x4_t __attribute__((vector_size (32)));
>>>+typedef int8_t int8x32_t __attribute__((vector_size (32)));
>>>+typedef int16_t int16x16_t __attribute__((vector_size (32)));
>>>+typedef int32_t int32x8_t __attribute__((vector_size (32)));
>>>+typedef int64_t int64x4_t __attribute__((vector_size (32)));
>>>+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
>>>+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
>>>+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
>>>+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
>>>+
>>>+void float16_callee (float16x16_t);
>>>+void float32_callee (float32x8_t);
>>>+void float64_callee (float64x4_t);
>>>+void int8_callee (int8x32_t);
>>>+void int16_callee (int16x16_t);
>>>+void int32_callee (int32x8_t);
>>>+void int64_callee (int64x4_t);
>>>+void uint8_callee (uint8x32_t);
>>>+void uint16_callee (uint16x16_t);
>>>+void uint32_callee (uint32x8_t);
>>>+void uint64_callee (uint64x4_t);
>>>+
>>>+void
>>>+float16_caller (void)
>>>+{
>>>+  float16_callee (svdup_f16 (1.0));
>>>+}
>>>+
>>>+void
>>>+float32_caller (void)
>>>+{
>>>+  float32_callee (svdup_f32 (2.0));
>>>+}
>>>+
>>>+void
>>>+float64_caller (void)
>>>+{
>>>+  float64_callee (svdup_f64 (3.0));
>>>+}
>>>+
>>>+void
>>>+int8_caller (void)
>>>+{
>>>+  int8_callee (svindex_s8 (0, 1));
>>>+}
>>>+
>>>+void
>>>+int16_caller (void)
>>>+{
>>>+  int16_callee (svindex_s16 (0, 2));
>>>+}
>>>+
>>>+void
>>>+int32_caller (void)
>>>+{
>>>+  int32_callee (svindex_s32 (0, 3));
>>>+}
>>>+
>>>+void
>>>+int64_caller (void)
>>>+{
>>>+  int64_callee (svindex_s64 (0, 4));
>>>+}
>>>+
>>>+void
>>>+uint8_caller (void)
>>>+{
>>>+  uint8_callee (svindex_u8 (1, 1));
>>>+}
>>>+
>>>+void
>>>+uint16_caller (void)
>>>+{
>>>+  uint16_callee (svindex_u16 (1, 2));
>>>+}
>>>+
>>>+void
>>>+uint32_caller (void)
>>>+{
>>>+  uint32_callee (svindex_u32 (1, 3));
>>>+}
>>>+
>>>+void
>>>+uint64_caller (void)
>>>+{
>>>+  uint64_callee (svindex_u64 (1, 4));
>>>+}
>>>+
>>>+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
>>>+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
>>>+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
>>>+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
>>>+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
>>>Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
>>>===================================================================
>>>--- /dev/null	2019-09-17 11:41:18.176664108 +0100
>>>+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c	2019-12-12 15:07:43.972415287 +0000
>>>@@ -0,0 +1,99 @@
>>>+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
>>>+
>>>+#include <arm_sve.h>
>>>+
>>>+typedef float16_t float16x16_t __attribute__((vector_size (32)));
>>>+typedef float32_t float32x8_t __attribute__((vector_size (32)));
>>>+typedef float64_t float64x4_t __attribute__((vector_size (32)));
>>>+typedef int8_t int8x32_t __attribute__((vector_size (32)));
>>>+typedef int16_t int16x16_t __attribute__((vector_size (32)));
>>>+typedef int32_t int32x8_t __attribute__((vector_size (32)));
>>>+typedef int64_t int64x4_t __attribute__((vector_size (32)));
>>>+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
>>>+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
>>>+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
>>>+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
>>>+
>>>+void float16_callee (svfloat16_t);
>>>+void float32_callee (svfloat32_t);
>>>+void float64_callee (svfloat64_t);
>>>+void int8_callee (svint8_t);
>>>+void int16_callee (svint16_t);
>>>+void int32_callee (svint32_t);
>>>+void int64_callee (svint64_t);
>>>+void uint8_callee (svuint8_t);
>>>+void uint16_callee (svuint16_t);
>>>+void uint32_callee (svuint32_t);
>>>+void uint64_callee (svuint64_t);
>>>+
>>>+void
>>>+float16_caller (float16x16_t arg)
>>>+{
>>>+  float16_callee (arg);
>>>+}
>>>+
>>>+void
>>>+float32_caller (float32x8_t arg)
>>>+{
>>>+  float32_callee (arg);
>>>+}
>>>+
>>>+void
>>>+float64_caller (float64x4_t arg)
>>>+{
>>>+  float64_callee (arg);
>>>+}
>>>+
>>>+void
>>>+int8_caller (int8x32_t arg)
>>>+{
>>>+  int8_callee (arg);
>>>+}
>>>+
>>>+void
>>>+int16_caller (int16x16_t arg)
>>>+{
>>>+  int16_callee (arg);
>>>+}
>>>+
>>>+void
>>>+int32_caller (int32x8_t arg)
>>>+{
>>>+  int32_callee (arg);
>>>+}
>>>+
>>>+void
>>>+int64_caller (int64x4_t arg)
>>>+{
>>>+  int64_callee (arg);
>>>+}
>>>+
>>>+void
>>>+uint8_caller (uint8x32_t arg)
>>>+{
>>>+  uint8_callee (arg);
>>>+}
>>>+
>>>+void
>>>+uint16_caller (uint16x16_t arg)
>>>+{
>>>+  uint16_callee (arg);
>>>+}
>>>+
>>>+void
>>>+uint32_caller (uint32x8_t arg)
>>>+{
>>>+  uint32_callee (arg);
>>>+}
>>>+
>>>+void
>>>+uint64_caller (uint64x4_t arg)
>>>+{
>>>+  uint64_callee (arg);
>>>+}
>>>+
>>>+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
>>>+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
>>>+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
>>>+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
>>>+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */


* Re: Add a compatible_vector_types_p target hook
  2019-12-12 17:20     ` Richard Biener
@ 2019-12-12 18:16       ` Richard Sandiford
  2019-12-13  8:41         ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2019-12-12 18:16 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
> On December 12, 2019 5:44:25 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>Richard Biener <richard.guenther@gmail.com> writes:
>>> On December 12, 2019 4:10:33 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>>>One problem with adding an N-bit vector extension to an existing
>>>>architecture is to decide how N-bit vectors should be passed to
>>>>functions and returned from functions.  Allowing all N-bit vector
>>>>types to be passed in registers breaks backwards compatibility,
>>>>since N-bit vectors could be used (and emulated) before the vector
>>>>extension was added.  But always passing N-bit vectors on the
>>>>stack would be inefficient for things like vector libm functions.
>>>>
>>>>For SVE we took the compromise position of predefining new SVE vector
>>>>types that are distinct from all existing vector types, including
>>>>GNU-style vectors.  The new types are passed and returned in an
>>>>efficient way while existing vector types are passed and returned
>>>>in the traditional way.  In the right circumstances, the two types
>>>>are inter-convertible.
>>>>
>>>>The SVE types are created using:
>>>>
>>>>      vectype = build_distinct_type_copy (vectype);
>>>>      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
>>>>      TYPE_ARTIFICIAL (vectype) = 1;
>>>>
>>>>The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
>>>>to convert from one type to the other.  However, the distinction can
>>>>be lost during gimple, which treats two vector types with the same
>>>>mode, number of elements, and element type as equivalent.  And for
>>>>most targets that's the right thing to do.
>>>
>>> And why's that a problem? The difference appears only in the function
>>> call ABI which is determined by the function signature rather than
>>> types or modes of the actual arguments?
>>
>>We use the type of the actual arguments when deciding how arguments
>>should be passed to functions:
>>
>>/* I counts args in order (to be) pushed; ARGPOS counts in order
>>written.  */
>>  for (argpos = 0; argpos < num_actuals; i--, argpos++)
>>    {
>>      tree type = TREE_TYPE (args[i].tree_value);
>>      [...]
>>   /* See if this argument should be passed by invisible reference.  */
>>      function_arg_info arg (type, argpos < n_named_args);
>>
>>And it has to be that way for calls to unprototyped functions,
>>or for varargs.
>
> So even for varargs the passing is different? Also we have CALL_EXPR_FNTYPE which you could populate specially even for unprototyped or varargs functions.
>
> I realize we now look at the type of values but you have to realize that differences that are not relevant for values are discarded.  Artificially preserving such non-real differences everywhere(!) while it only matters at call boundaries doesn't look correct. 

But isn't this similar to the way that we preserve the difference
between:

  struct s1 { int i; };
  struct s2 { int i; };

?  They're the same value as far as the target machine is concerned,
but we preserve the difference for other reasons.

>>The AArch64 port emits an error if calls pass values of SVE type to an
>>unprototyped function.  To do that we need to know whether the value
>>really is an SVE type rather than a plain vector.
>>
>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>same problem there once we support -msve-vector-bits=128, since the
>>layout of SVE and Advanced SIMD vectors differ for big-endian.
>
> But then why don't you have different modes?

Yeah, true, modes will probably help for the Advanced SIMD/SVE
difference.  But from a vector value POV, a vector of 4 ints is a vector
of 4 ints, so even distinguishing based on the mode is artificial.

SVE is AFAIK the first target to have different modes for potentially
the "same" vector type, and I had to add new infrastructure to allow
targets to define multiple modes of the same size.  So the fact that
gimple distinguishes otherwise identical vectors based on mode is a
relatively recent thing.  AFAIK it just fell out in the wash rather
than being deliberately planned.  It happens to be convenient in this
context, but it hasn't been important until now.

The hook doesn't seem any worse than distinguishing based on the mode.
Another way to avoid this would have been to define separate SVE modes
for the predefined vectors.  The big downside of that is that we'd end
up doubling the number of SVE patterns.

Extra on-the-side metadata is going to be easy to drop accidentally,
and this is something we need for correctness rather than optimisation.

Thanks,
Richard


* Re: Add a compatible_vector_types_p target hook
  2019-12-12 18:16       ` Richard Sandiford
@ 2019-12-13  8:41         ` Richard Biener
  2019-12-13  9:12           ` Richard Sandiford
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2019-12-13  8:41 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

On December 12, 2019 7:15:36 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>Richard Biener <richard.guenther@gmail.com> writes:
>> On December 12, 2019 5:44:25 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>>Richard Biener <richard.guenther@gmail.com> writes:
>>>> On December 12, 2019 4:10:33 PM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>>>>One problem with adding an N-bit vector extension to an existing
>>>>>architecture is to decide how N-bit vectors should be passed to
>>>>>functions and returned from functions.  Allowing all N-bit vector
>>>>>types to be passed in registers breaks backwards compatibility,
>>>>>since N-bit vectors could be used (and emulated) before the vector
>>>>>extension was added.  But always passing N-bit vectors on the
>>>>>stack would be inefficient for things like vector libm functions.
>>>>>
>>>>>For SVE we took the compromise position of predefining new SVE vector
>>>>>types that are distinct from all existing vector types, including
>>>>>GNU-style vectors.  The new types are passed and returned in an
>>>>>efficient way while existing vector types are passed and returned
>>>>>in the traditional way.  In the right circumstances, the two types
>>>>>are inter-convertible.
>>>>>
>>>>>The SVE types are created using:
>>>>>
>>>>>      vectype = build_distinct_type_copy (vectype);
>>>>>      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
>>>>>      TYPE_ARTIFICIAL (vectype) = 1;
>>>>>
>>>>>The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
>>>>>to convert from one type to the other.  However, the distinction can
>>>>>be lost during gimple, which treats two vector types with the same
>>>>>mode, number of elements, and element type as equivalent.  And for
>>>>>most targets that's the right thing to do.
>>>>
>>>> And why's that a problem? The difference appears only in the function
>>>> call ABI which is determined by the function signature rather than
>>>> types or modes of the actual arguments?
>>>
>>>We use the type of the actual arguments when deciding how arguments
>>>should be passed to functions:
>>>
>>>/* I counts args in order (to be) pushed; ARGPOS counts in order
>>>written.  */
>>>  for (argpos = 0; argpos < num_actuals; i--, argpos++)
>>>    {
>>>      tree type = TREE_TYPE (args[i].tree_value);
>>>      [...]
>>>   /* See if this argument should be passed by invisible reference.  */
>>>      function_arg_info arg (type, argpos < n_named_args);
>>>
>>>And it has to be that way for calls to unprototyped functions,
>>>or for varargs.
>>
>> So even for varargs the passing is different? Also we have
>> CALL_EXPR_FNTYPE which you could populate specially even for
>> unprototyped or varargs functions.
>>
>> I realize we now look at the type of values but you have to realize
>> that differences that are not relevant for values are discarded.
>> Artificially preserving such non-real differences everywhere(!) while
>> it only matters at call boundaries doesn't look correct.
>
>But isn't this similar to the way that we preserve the difference
>between:
>
>  struct s1 { int i; };
>  struct s2 { int i; };
>
>?  They're the same value as far as the target machine is concerned,
>but we preserve the difference for other reasons.

With LTO we don't. Both get the same TYPE_CANONICAL. 
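
E.g. a minimal sketch (invented tags, not from the patch):

  /* tu1.c */
  struct s1 { int i; };
  int get1 (struct s1 *p) { return p->i; }

  /* tu2.c */
  struct s2 { int i; };
  int get2 (struct s2 *p) { return p->i; }

Canonical type computation is structural and ignores the tag names, so
under LTO s1 and s2 share one TYPE_CANONICAL and TBAA treats accesses
through the two types as potentially aliasing.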

>>>The AArch64 port emits an error if calls pass values of SVE type to an
>>>unprototyped function.  To do that we need to know whether the value
>>>really is an SVE type rather than a plain vector.
>>>
>>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>>same problem there once we support -msve-vector-bits=128, since the
>>>layout of SVE and Advanced SIMD vectors differ for big-endian.
>>
>> But then why don't you have different modes?
>
>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>difference.  But from a vector value POV, a vector of 4 ints is a
>vector
>of 4 ints, so even distinguishing based on the mode is artificial.

True. 

>SVE is AFAIK the first target to have different modes for potentially
>the "same" vector type, and I had to add new infrastructure to allow
>targets to define multiple modes of the same size.  So the fact that
>gimple distinguishes otherwise identical vectors based on mode is a
>relatively recent thing.  AFAIK it just fell out in the wash rather
>than being deliberately planned.  It happens to be convenient in this
>context, but it hasn't been important until now.
>
>The hook doesn't seem any worse than distinguishing based on the mode.
>Another way to avoid this would have been to define separate SVE modes
>for the predefined vectors.  The big downside of that is that we'd end
>up doubling the number of SVE patterns.
>
>Extra on-the-side metadata is going to be easy to drop accidentally,
>and this is something we need for correctness rather than optimisation.

Still selecting the ABI during call expansion only and based on value types at that point is fragile. The frontends are in charge of specifying the actual argument type and at that point the target may fix the ABI. The ABI can be recorded in the call's fntype, either via its TYPE_ARG_TYPES or in more awkward ways for varargs functions (in full generality that would mean attaching varargs ABI meta to each call).

The alternative is to have an actual argument type vector associated with each call.
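
(A rough sketch of that direction, details elided -- take the
ABI-relevant type from the call's recorded function type and fall back
to the value's type only for the varargs tail or unprototyped calls:

  tree fntype = gimple_call_fntype (call);
  tree targs = fntype ? TYPE_ARG_TYPES (fntype) : NULL_TREE;
  for (unsigned i = 0; i < gimple_call_num_args (call); i++)
    {
      tree type;
      if (targs && TREE_VALUE (targs) != void_type_node)
        {
          type = TREE_VALUE (targs);
          targs = TREE_CHAIN (targs);
        }
      else
        type = TREE_TYPE (gimple_call_arg (call, i));
      /* ... let TYPE drive the passing decision ... */
    }

The varargs fallback is exactly the part that would need the extra
per-call metadata mentioned above.)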

Btw, how does STRIP_NOPS preserve the argument type if a conversion happens in the call and generic folding applies? IIRC that puns down to modes as well (which was the reason to make useless_type_conversion_p do the same). 

Sorry for the vague and late answers, I'm already in Christmas mode ;)

Richard. 

>Thanks,
>Richard


* Re: Add a compatible_vector_types_p target hook
  2019-12-13  8:41         ` Richard Biener
@ 2019-12-13  9:12           ` Richard Sandiford
  2019-12-13 12:25             ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2019-12-13  9:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
>>>>The AArch64 port emits an error if calls pass values of SVE type to an
>>>>unprototyped function.  To do that we need to know whether the value
>>>>really is an SVE type rather than a plain vector.
>>>>
>>>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>>>same problem there once we support -msve-vector-bits=128, since the
>>>>layout of SVE and Advanced SIMD vectors differ for big-endian.
>>>
>>> But then why don't you have different modes?
>>
>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>difference.  But from a vector value POV, a vector of 4 ints is a vector
>>of 4 ints, so even distinguishing based on the mode is artificial.
>
> True. 
>
>>SVE is AFAIK the first target to have different modes for potentially
>>the "same" vector type, and I had to add new infrastructure to allow
>>targets to define multiple modes of the same size.  So the fact that
>>gimple distinguishes otherwise identical vectors based on mode is a
>>relatively recent thing.  AFAIK it just fell out in the wash rather
>>than being deliberately planned.  It happens to be convenient in this
>>context, but it hasn't been important until now.
>>
>>The hook doesn't seem any worse than distinguishing based on the mode.
>>Another way to avoid this would have been to define separate SVE modes
>>for the predefined vectors.  The big downside of that is that we'd end
>>up doubling the number of SVE patterns.
>>
>>Extra on-the-side metadata is going to be easy to drop accidentally,
>>and this is something we need for correctness rather than optimisation.
>
> Still selecting the ABI during call expansion only and based on value types at that point is fragile.

Agreed.  But it's fragile in general, not just for this case.  Changing
something as fundamental as that would be a lot of work and seems likely
to introduce accidental ABI breakage.

> The frontends are in charge of specifying the actual argument type and
> at that point the target may fix the ABI. The ABI can be recorded in
> the call's fntype, either via its TYPE_ARG_TYPES or in more awkward
> ways for varargs functions (in full generality that would mean
> attaching varargs ABI meta to each call).
>
> The alternative is to have an actual argument type vector associated
> with each call.

I think multiple pieces of gimple code would then have to cope with that
as a special case.  E.g. if:

   void foo (int, ...);

   type1 a;
   b = VIEW_CONVERT_EXPR<type2> (a);
   if (a)
     foo (1, a);
   else
     foo (1, b);

gets converted to:

   if (a)
     foo (1, a);
   else
     foo (1, a);

on the basis that type1 and type2 are "the same" despite having
different calling conventions, we have to be sure that the calls
are not treated as equivalent:

   foo (1, a);

Things like IPA clones would also need to handle this specially.
Anything that generates new calls based on old ones will need
to copy this information too.

This also sounds like it would be fragile and seems a bit too
invasive for stage 3.

> Btw, how does STRIP_NOPS preserve the argument type if a conversion
> happens in the call and generic folding applies? IIRC that puns down
> to modes as well (which was the reason to make
> useless_type_conversion_p do the same).

These are VIEW_CONVERT_EXPRs, which don't get stripped.

> Sorry for the vague and late answers, I'm already in Christmas mode ;)

Wish I was too :-)

Thanks,
Richard


* Re: Add a compatible_vector_types_p target hook
  2019-12-13  9:12           ` Richard Sandiford
@ 2019-12-13 12:25             ` Richard Biener
  2019-12-13 13:10               ` Richard Sandiford
  2019-12-14 11:13               ` Richard Sandiford
  0 siblings, 2 replies; 14+ messages in thread
From: Richard Biener @ 2019-12-13 12:25 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>Richard Biener <richard.guenther@gmail.com> writes:
>>>>>The AArch64 port emits an error if calls pass values of SVE type to an
>>>>>unprototyped function.  To do that we need to know whether the value
>>>>>really is an SVE type rather than a plain vector.
>>>>>
>>>>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>>>>same problem there once we support -msve-vector-bits=128, since the
>>>>>layout of SVE and Advanced SIMD vectors differ for big-endian.
>>>>
>>>> But then why don't you have different modes?
>>>
>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>>difference.  But from a vector value POV, a vector of 4 ints is a vector
>>>of 4 ints, so even distinguishing based on the mode is artificial.
>>
>> True. 
>>
>>>SVE is AFAIK the first target to have different modes for potentially
>>>the "same" vector type, and I had to add new infrastructure to allow
>>>targets to define multiple modes of the same size.  So the fact that
>>>gimple distinguishes otherwise identical vectors based on mode is a
>>>relatively recent thing.  AFAIK it just fell out in the wash rather
>>>than being deliberately planned.  It happens to be convenient in this
>>>context, but it hasn't been important until now.
>>>
>>>The hook doesn't seem any worse than distinguishing based on the mode.
>>>Another way to avoid this would have been to define separate SVE modes
>>>for the predefined vectors.  The big downside of that is that we'd end
>>>up doubling the number of SVE patterns.
>>>
>>>Extra on-the-side metadata is going to be easy to drop accidentally,
>>>and this is something we need for correctness rather than optimisation.
>>
>> Still selecting the ABI during call expansion only and based on
>> value types at that point is fragile.
>
>Agreed.  But it's fragile in general, not just for this case.  Changing
>something as fundamental as that would be a lot of work and seems likely
>to introduce accidental ABI breakage.
>
>> The frontends are in charge of specifying the actual argument type and
>> at that point the target may fix the ABI. The ABI can be recorded in
>> the call's fntype, either via its TYPE_ARG_TYPES or in more awkward
>> ways for varargs functions (in full generality that would mean
>> attaching varargs ABI meta to each call).
>>
>> The alternative is to have an actual argument type vector associated
>> with each call.
>
>I think multiple pieces of gimple code would then have to cope with that
>as a special case.  E.g. if:
>
>   void foo (int, ...);
>
>   type1 a;
>   b = VIEW_CONVERT_EXPR<type2> (a);
>   if (a)
>     foo (1, a);
>   else
>     foo (1, b);
>
>gets converted to:
>
>   if (a)
>     foo (1, a);
>   else
>     foo (1, a);
>
>on the basis that type1 and type2 are "the same" despite having
>different calling conventions, we have to be sure that the calls
>are not treated as equivalent:
>
>   foo (1, a);
>
>Things like IPA clones would also need to handle this specially.
>Anything that generates new calls based on old ones will need
>to copy this information too.
>
>This also sounds like it would be fragile and seems a bit too
>invasive for stage 3.

But we are already relying on this to work (fntype non-propagation) because function pointer conversions are dropped on the floor. 

The real change would be introducing (per call) fntype for calls to unprototyped functions and somehow dealing with varargs. 
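
(For reference, the classic case that relies on this, in plain C:

  void take_double (double);

  void
  call_via_wrong_type (void)
  {
    void (*fp) (int) = (void (*) (int)) take_double;
    fp (1);
  }

Gimple drops the pointer conversion, so the callee decl's prototype no
longer describes how the call site passes the argument; only the
call's fntype still records that.)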

>> Btw, how does STRIP_NOPS preserve the argument type if a conversion
>> happens in the call and generic folding applies? IIRC that puns down
>> to modes as well (which was the reason to make
>> useless_type_conversion_p do the same).
>
>These are VIEW_CONVERT_EXPRs, which don't get stripped.

Ah, OK. Guess you're lucky there then. 

>> Sorry for the vague and late answers, I'm already in Christmas mode ;)
>
>Wish I was too :-)

Only a few days left ;) 

Richard. 

>Thanks,
>Richard


* Re: Add a compatible_vector_types_p target hook
  2019-12-13 12:25             ` Richard Biener
@ 2019-12-13 13:10               ` Richard Sandiford
  2019-12-14 11:13               ` Richard Sandiford
  1 sibling, 0 replies; 14+ messages in thread
From: Richard Sandiford @ 2019-12-13 13:10 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
>>> The frontends are in charge of specifying the actual argument type and
>>> at that point the target may fix the ABI. The ABI can be recorded in
>>> the call's fntype, either via its TYPE_ARG_TYPES or in more awkward
>>> ways for varargs functions (in full generality that would mean
>>> attaching varargs ABI meta to each call).
>>>
>>> The alternative is to have an actual argument type vector associated
>>> with each call.
>>
>>I think multiple pieces of gimple code would then have to cope with that
>>as a special case.  E.g. if:
>>
>>   void foo (int, ...);
>>
>>   type1 a;
>>   b = VIEW_CONVERT_EXPR<type2> (a);
>>   if (a)
>>     foo (1, a);
>>   else
>>     foo (1, b);
>>
>>gets converted to:
>>
>>   if (a)
>>     foo (1, a);
>>   else
>>     foo (1, a);
>>
>>on the basis that type1 and type2 are "the same" despite having
>>different calling conventions, we have to be sure that the calls
>>are not treated as equivalent:
>>
>>   foo (1, a);
>>
>>Things like IPA clones would also need to handle this specially.
>>Anything that generates new calls based on old ones will need
>>to copy this information too.
>>
>>This also sounds like it would be fragile and seems a bit too
>>invasive for stage 3.
>
> But we are already relying on this to work (fntype non-propagation) because function pointer conversions are dropped on the floor. 
>
> The real change would be introducing (per call) fntype for calls to unprototyped functions and somehow dealing with varargs. 

Hmm, OK.  Any suggestions for how the varargs type should be
represented?  We can't just change (int, ...) to (int, foo, bar),
since varargs can be passed differently from non-varargs.  We'd need
something like "(int, ...) used as (int, foo, bar)" instead.

Currently TYPE_ARG_TYPES ends with void_list_node for non-varargs
and null for varargs.  Perhaps we could instead add a "..." marker, so:

<ellipsis_node>:
  unprototyped function

<type1, ellipsis_node>:
  (type1, ...) prototype

<type1, type2>:
  (type1, type2) prototype

<ellipsis_node, type1>:
  unprototyped function called with type1

<type1, ellipsis_node, type2>:
  (type1, ...) prototype called with (type1, type2)
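
For concreteness, building the last of those might look like this
(purely hypothetical -- ellipsis_node doesn't exist today; tree_cons
is the usual TREE_LIST constructor):

  /* "(type1, ...) prototype called with (type1, type2)".  */
  tree args = tree_cons (NULL_TREE, type1,
                         tree_cons (NULL_TREE, ellipsis_node,
                                    tree_cons (NULL_TREE, type2,
                                               NULL_TREE)));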

If so, would something like that be OK during stage 3?

Richard


* Re: Add a compatible_vector_types_p target hook
  2019-12-13 12:25             ` Richard Biener
  2019-12-13 13:10               ` Richard Sandiford
@ 2019-12-14 11:13               ` Richard Sandiford
  2019-12-14 14:34                 ` Richard Biener
  1 sibling, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2019-12-14 11:13 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>Richard Biener <richard.guenther@gmail.com> writes:
>>>>>>The AArch64 port emits an error if calls pass values of SVE type to an
>>>>>>unprototyped function.  To do that we need to know whether the value
>>>>>>really is an SVE type rather than a plain vector.
>>>>>>
>>>>>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>>>>>same problem there once we support -msve-vector-bits=128, since the
>>>>>>layout of SVE and Advanced SIMD vectors differ for big-endian.
>>>>>
>>>>> But then why don't you have different modes?
>>>>
>>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>>>difference.  But from a vector value POV, a vector of 4 ints is a vector
>>>>of 4 ints, so even distinguishing based on the mode is artificial.
>>>
>>> True. 
>>>
>>>>SVE is AFAIK the first target to have different modes for potentially
>>>>the "same" vector type, and I had to add new infrastructure to allow
>>>>targets to define multiple modes of the same size.  So the fact that
>>>>gimple distinguishes otherwise identical vectors based on mode is a
>>>>relatively recent thing.  AFAIK it just fell out in the wash rather
>>>>than being deliberately planned.  It happens to be convenient in this
>>>>context, but it hasn't been important until now.
>>>>
>>>>The hook doesn't seem any worse than distinguishing based on the mode.
>>>>Another way to avoid this would have been to define separate SVE modes
>>>>for the predefined vectors.  The big downside of that is that we'd end
>>>>up doubling the number of SVE patterns.
>>>>
>>>>Extra on-the-side metadata is going to be easy to drop accidentally,
>>>>and this is something we need for correctness rather than optimisation.
>>>
>>> Still selecting the ABI during call expansion only and based on
>>> value types at that point is fragile.
>>
>>Agreed.  But it's fragile in general, not just for this case.  Changing
>>something as fundamental as that would be a lot of work and seems likely
>>to introduce accidental ABI breakage.
>>
>>> The frontends are in charge of specifying the actual argument type and
>>> at that point the target may fix the ABI. The ABI can be recorded in
>>> the call's fntype, either via its TYPE_ARG_TYPES or in more awkward
>>> ways for varargs functions (in full generality that would mean
>>> attaching varargs ABI meta to each call).
>>>
>>> The alternative is to have an actual argument type vector associated
>>> with each call.
>>I think multiple pieces of gimple code would then have to cope with that
>>as a special case.  E.g. if:
>>
>>   void foo (int, ...);
>>
>>   type1 a;
>>   b = VIEW_CONVERT_EXPR<type2> (a);
>>   if (a)
>>     foo (1, a);
>>   else
>>     foo (1, b);
>>
>>gets converted to:
>>
>>   if (a)
>>     foo (1, a);
>>   else
>>     foo (1, a);
>>
>>on the basis that type1 and type2 are "the same" despite having
>>different calling conventions, we have to be sure that the calls
>>are not treated as equivalent:
>>
>>   foo (1, a);
>>
>>Things like IPA clones would also need to handle this specially.
>>Anything that generates new calls based on old ones will need
>>to copy this information too.
>>
>>This also sounds like it would be fragile and seems a bit too
>>invasive for stage 3.
>
> But we are already relying on this to work (fntype non-propagation) because function pointer conversions are dropped on the floor. 
>
> The real change would be introducing (per call) fntype for calls to unprototyped functions and somehow dealing with varargs. 

It looks like this itself relies on useless_type_conversion_p,
is that right?  E.g. we have things like:

bool
func_checker::compare_gimple_call (gcall *s1, gcall *s2)
{
  ...
  tree fntype1 = gimple_call_fntype (s1);
  tree fntype2 = gimple_call_fntype (s2);
  if ((fntype1 && !fntype2)
      || (!fntype1 && fntype2)
      || (fntype1 && !types_compatible_p (fntype1, fntype2)))
    return return_false_with_msg ("call function types are not compatible");

and useless_type_conversion_p has:

  else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
	    || TREE_CODE (inner_type) == METHOD_TYPE)
	   && TREE_CODE (inner_type) == TREE_CODE (outer_type))
    {
      tree outer_parm, inner_parm;

      /* If the return types are not compatible bail out.  */
      if (!useless_type_conversion_p (TREE_TYPE (outer_type),
				      TREE_TYPE (inner_type)))
	return false;

      /* Method types should belong to a compatible base class.  */
      if (TREE_CODE (inner_type) == METHOD_TYPE
	  && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
					 TYPE_METHOD_BASETYPE (inner_type)))
	return false;

      /* A conversion to an unprototyped argument list is ok.  */
      if (!prototype_p (outer_type))
	return true;

      /* If the unqualified argument types are compatible the conversion
	 is useless.  */
      if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
	return true;

      for (outer_parm = TYPE_ARG_TYPES (outer_type),
	   inner_parm = TYPE_ARG_TYPES (inner_type);
	   outer_parm && inner_parm;
	   outer_parm = TREE_CHAIN (outer_parm),
	   inner_parm = TREE_CHAIN (inner_parm))
	if (!useless_type_conversion_p
	       (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
		TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
	  return false;

So it looks like we'd still need to distinguish the vector types in
useless_type_conversion_p even if we went the fntype route.  The difference
is that the fntype route would give us the option of only distinguishing
the vectors for return and argument types and not in general.

But if we are going to have to distinguish the vectors here anyway
in some form, could we go with the patch as-is for stage 3 and leave
restricting this to just return and argument types as a follow-on
optimisation?

Thanks,
Richard


* Re: Add a compatible_vector_types_p target hook
  2019-12-14 11:13               ` Richard Sandiford
@ 2019-12-14 14:34                 ` Richard Biener
  2019-12-16 16:02                   ` Richard Sandiford
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2019-12-14 14:34 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches

On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>Richard Biener <richard.guenther@gmail.com> writes:
>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>>Richard Biener <richard.guenther@gmail.com> writes:
>>>>>>>The AArch64 port emits an error if calls pass values of SVE type to an
>>>>>>>unprototyped function.  To do that we need to know whether the value
>>>>>>>really is an SVE type rather than a plain vector.
>>>>>>>
>>>>>>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>>>>>>same problem there once we support -msve-vector-bits=128, since the
>>>>>>>layout of SVE and Advanced SIMD vectors differ for big-endian.
>>>>>>
>>>>>> But then why don't you have different modes?
>>>>>
>>>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>>>>difference.  But from a vector value POV, a vector of 4 ints is a vector
>>>>>of 4 ints, so even distinguishing based on the mode is artificial.
>>>>
>>>> True. 
>>>>
>>>>>SVE is AFAIK the first target to have different modes for potentially
>>>>>the "same" vector type, and I had to add new infrastructure to allow
>>>>>targets to define multiple modes of the same size.  So the fact that
>>>>>gimple distinguishes otherwise identical vectors based on mode is a
>>>>>relatively recent thing.  AFAIK it just fell out in the wash rather
>>>>>than being deliberately planned.  It happens to be convenient in this
>>>>>context, but it hasn't been important until now.
>>>>>
>>>>>The hook doesn't seem any worse than distinguishing based on the mode.
>>>>>Another way to avoid this would have been to define separate SVE modes
>>>>>for the predefined vectors.  The big downside of that is that we'd end
>>>>>up doubling the number of SVE patterns.
>>>>>
>>>>>Extra on-the-side metadata is going to be easy to drop accidentally,
>>>>>and this is something we need for correctness rather than optimisation.
>>>>
>>>> Still selecting the ABI during call expansion only and based on
>>>> value types at that point is fragile.
>>>
>>>Agreed.  But it's fragile in general, not just for this case.  Changing
>>>something as fundamental as that would be a lot of work and seems likely
>>>to introduce accidental ABI breakage.
>>>
>>>> The frontends are in charge of specifying the actual argument type and
>>>> at that point the target may fix the ABI. The ABI can be recorded in
>>>> the call's fntype, either via its TYPE_ARG_TYPES or in more awkward
>>>> ways for varargs functions (in full generality that would mean
>>>> attaching varargs ABI meta to each call).
>>>>
>>>> The alternative is to have an actual argument type vector associated
>>>> with each call.
>>>
>>>I think multiple pieces of gimple code would then have to cope with that
>>>as a special case.  E.g. if:
>>>
>>>   void foo (int, ...);
>>>
>>>   type1 a;
>>>   b = VIEW_CONVERT_EXPR<type2> (a);
>>>   if (a)
>>>     foo (1, a);
>>>   else
>>>     foo (1, b);
>>>
>>>gets converted to:
>>>
>>>   if (a)
>>>     foo (1, a);
>>>   else
>>>     foo (1, a);
>>>
>>>on the basis that type1 and type2 are "the same" despite having
>>>different calling conventions, we have to be sure that the calls
>>>are not treated as equivalent:
>>>
>>>   foo (1, a);
>>>
>>>Things like IPA clones would also need to handle this specially.
>>>Anything that generates new calls based on old ones will need
>>>to copy this information too.
>>>
>>>This also sounds like it would be fragile and seems a bit too
>>>invasive for stage 3.
>>
>> But we are already relying on this to work (fntype non-propagation)
>> because function pointer conversions are dropped on the floor.
>>
>> The real change would be introducing (per call) fntype for calls to
>> unprototyped functions and somehow dealing with varargs.
>
>It looks like this itself relies on useless_type_conversion_p,
>is that right?  E.g. we have things like:
>
>bool
>func_checker::compare_gimple_call (gcall *s1, gcall *s2)
>{
>  ...
>  tree fntype1 = gimple_call_fntype (s1);
>  tree fntype2 = gimple_call_fntype (s2);
>  if ((fntype1 && !fntype2)
>      || (!fntype1 && fntype2)
>      || (fntype1 && !types_compatible_p (fntype1, fntype2)))
>return return_false_with_msg ("call function types are not compatible");
>
>and useless_type_conversion_p has:
>
>  else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
>	    || TREE_CODE (inner_type) == METHOD_TYPE)
>	   && TREE_CODE (inner_type) == TREE_CODE (outer_type))
>    {
>      tree outer_parm, inner_parm;
>
>      /* If the return types are not compatible bail out.  */
>      if (!useless_type_conversion_p (TREE_TYPE (outer_type),
>				      TREE_TYPE (inner_type)))
>	return false;
>
>      /* Method types should belong to a compatible base class.  */
>      if (TREE_CODE (inner_type) == METHOD_TYPE
>	  && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
>					 TYPE_METHOD_BASETYPE (inner_type)))
>	return false;
>
>      /* A conversion to an unprototyped argument list is ok.  */
>      if (!prototype_p (outer_type))
>	return true;
>
>     /* If the unqualified argument types are compatible the conversion
>	 is useless.  */
>      if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
>	return true;
>
>      for (outer_parm = TYPE_ARG_TYPES (outer_type),
>	   inner_parm = TYPE_ARG_TYPES (inner_type);
>	   outer_parm && inner_parm;
>	   outer_parm = TREE_CHAIN (outer_parm),
>	   inner_parm = TREE_CHAIN (inner_parm))
>	if (!useless_type_conversion_p
>	       (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
>		TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
>	  return false;
>
>So it looks like we'd still need to distinguish the vector types in
>useless_type_conversion_p even if we went the fntype route.  The difference
>is that the fntype route would give us the option of only distinguishing
>the vectors for return and argument types and not in general.
>
>But if we are going to have to distinguish the vectors here anyway
>in some form, could we go with the patch as-is for stage 3 and leave
>restricting this to just return and argument types as a follow-on
>optimisation?

How does this get around the LTO canonical type merging machinery? That is, how are those types streamed and how are they identified by the backend? Just by means of being pointer equal to some statically built type in the backend? 
Or does the type have some attribute on it or on the component? How does the middle end build a related type with the same ABI, like a vector with half the number of elements? 

Richard. 

>Thanks,
>Richard


* Re: Add a compatible_vector_types_p target hook
  2019-12-14 14:34                 ` Richard Biener
@ 2019-12-16 16:02                   ` Richard Sandiford
  2020-01-07 10:33                     ` Richard Sandiford
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2019-12-16 16:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
> On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>Richard Biener <richard.guenther@gmail.com> writes:
>>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>>>Richard Biener <richard.guenther@gmail.com> writes:
>>>>>>>>The AArch64 port emits an error if calls pass values of SVE type to an
>>>>>>>>unprototyped function.  To do that we need to know whether the value
>>>>>>>>really is an SVE type rather than a plain vector.
>>>>>>>>
>>>>>>>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>>>>>>>same problem there once we support -msve-vector-bits=128, since the
>>>>>>>>layout of SVE and Advanced SIMD vectors differ for big-endian.
>>>>>>>
>>>>>>> But then why don't you have different modes?
>>>>>>
>>>>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>>>>>difference.  But from a vector value POV, a vector of 4 ints is a vector
>>>>>>of 4 ints, so even distinguishing based on the mode is artificial.
>>>>>
>>>>> True. 
>>>>>
>>>>>>SVE is AFAIK the first target to have different modes for potentially
>>>>>>the "same" vector type, and I had to add new infrastructure to allow
>>>>>>targets to define multiple modes of the same size.  So the fact that
>>>>>>gimple distinguishes otherwise identical vectors based on mode is a
>>>>>>relatively recent thing.  AFAIK it just fell out in the wash rather
>>>>>>than being deliberately planned.  It happens to be convenient in this
>>>>>>context, but it hasn't been important until now.
>>>>>>
>>>>>>The hook doesn't seem any worse than distinguishing based on the mode.
>>>>>>Another way to avoid this would have been to define separate SVE modes
>>>>>>for the predefined vectors.  The big downside of that is that we'd end
>>>>>>up doubling the number of SVE patterns.
>>>>>>
>>>>>>Extra on-the-side metadata is going to be easy to drop accidentally,
>>>>>>and this is something we need for correctness rather than optimisation.
>>>>>
>>>>> Still selecting the ABI during call expansion only and based on
>>>>> value types at that point is fragile.
>>>>
>>>>Agreed.  But it's fragile in general, not just for this case.  Changing
>>>>something as fundamental as that would be a lot of work and seems likely
>>>>to introduce accidental ABI breakage.
>>>>
>>>>> The frontends are in charge of specifying the actual argument type and
>>>>> at that point the target may fix the ABI. The ABI can be recorded in
>>>>> the call's fntype, either via its TYPE_ARG_TYPES or in more awkward
>>>>> ways for varargs functions (in full generality that would mean
>>>>> attaching varargs ABI meta to each call).
>>>>>
>>>>> The alternative is to have an actual argument type vector associated
>>>>> with each call.
>>>>
>>>>I think multiple pieces of gimple code would then have to cope with that
>>>>as a special case.  E.g. if:
>>>>
>>>>   void foo (int, ...);
>>>>
>>>>   type1 a;
>>>>   b = VIEW_CONVERT_EXPR<type2> (a);
>>>>   if (a)
>>>>     foo (1, a);
>>>>   else
>>>>     foo (1, b);
>>>>
>>>>gets converted to:
>>>>
>>>>   if (a)
>>>>     foo (1, a);
>>>>   else
>>>>     foo (1, a);
>>>>
>>>>on the basis that type1 and type2 are "the same" despite having
>>>>different calling conventions, we have to be sure that the calls
>>>>are not treated as equivalent:
>>>>
>>>>   foo (1, a);
>>>>
>>>>Things like IPA clones would also need to handle this specially.
>>>>Anything that generates new calls based on old ones will need
>>>>to copy this information too.
>>>>
>>>>This also sounds like it would be fragile and seems a bit too
>>>>invasive for stage 3.
>>>
>>> But we are already relying on this to work (fntype non-propagation)
>>> because function pointer conversions are dropped on the floor.
>>>
>>> The real change would be introducing (per call) fntype for calls to
>>> unprototyped functions and somehow dealing with varargs.
>>
>>It looks like this itself relies on useless_type_conversion_p,
>>is that right?  E.g. we have things like:
>>
>>bool
>>func_checker::compare_gimple_call (gcall *s1, gcall *s2)
>>{
>>  ...
>>  tree fntype1 = gimple_call_fntype (s1);
>>  tree fntype2 = gimple_call_fntype (s2);
>>  if ((fntype1 && !fntype2)
>>      || (!fntype1 && fntype2)
>>      || (fntype1 && !types_compatible_p (fntype1, fntype2)))
>>return return_false_with_msg ("call function types are not compatible");
>>
>>and useless_type_conversion_p has:
>>
>>  else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
>>	    || TREE_CODE (inner_type) == METHOD_TYPE)
>>	   && TREE_CODE (inner_type) == TREE_CODE (outer_type))
>>    {
>>      tree outer_parm, inner_parm;
>>
>>      /* If the return types are not compatible bail out.  */
>>      if (!useless_type_conversion_p (TREE_TYPE (outer_type),
>>				      TREE_TYPE (inner_type)))
>>	return false;
>>
>>      /* Method types should belong to a compatible base class.  */
>>      if (TREE_CODE (inner_type) == METHOD_TYPE
>>	  && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
>>					 TYPE_METHOD_BASETYPE (inner_type)))
>>	return false;
>>
>>      /* A conversion to an unprototyped argument list is ok.  */
>>      if (!prototype_p (outer_type))
>>	return true;
>>
>>     /* If the unqualified argument types are compatible the conversion
>>	 is useless.  */
>>      if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
>>	return true;
>>
>>      for (outer_parm = TYPE_ARG_TYPES (outer_type),
>>	   inner_parm = TYPE_ARG_TYPES (inner_type);
>>	   outer_parm && inner_parm;
>>	   outer_parm = TREE_CHAIN (outer_parm),
>>	   inner_parm = TREE_CHAIN (inner_parm))
>>	if (!useless_type_conversion_p
>>	       (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
>>		TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
>>	  return false;
>>
>>So it looks like we'd still need to distinguish the vector types in
>>useless_type_conversion_p even if we went the fntype route.  The
>>difference is that the fntype route would give us the option of only
>>distinguishing the vectors for return and argument types and not in
>>general.
>>
>>But if we are going to have to distinguish the vectors here anyway
>>in some form, could we go with the patch as-is for stage 3 and leave
>>restricting this to just return and argument types as a follow-on
>>optimisation?
>
> How does this get around the LTO canonical type merging machinery? That is, how are those types streamed and how are they identified by the backend? Just by means of being pointer equal to some statically built type in the backend? 
> Or does the type have some attribute on it or on the component? How does the middle end build a related type with the same ABI, like a vector with half the number of elements?

Hmm...

At the moment it's based on pointer equality between the TYPE_MAIN_VARIANT
and statically-built types.  We predefine the only available SVE "ABI types"
and there's no way to create "new" ones.

But you're right that that doesn't work for LTO -- in general, not just
for this conversion patch -- because no streamed types end up as ABI types.
So we'll need an attribute after all, with the ABI decisions keyed off that
rather than TYPE_MAIN_VARIANT pointer equality.  Will fix...
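
For reference, a minimal sketch of the kind of attribute-based check
I have in mind -- the attribute name and the helper are illustrative
only, not the final patch:

/* Illustrative sketch: assume the predefined SVE ABI types carry an
   "SVE type" attribute, so the test no longer depends on pointer
   equality with statically-built types and survives LTO streaming.  */
static bool
sve_abi_type_p (tree type)
{
  return lookup_attribute ("SVE type", TYPE_ATTRIBUTES (type)) != NULL_TREE;
}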

Once that's fixed, the fact that we use SET_TYPE_STRUCTURAL_EQUALITY
for the ABI types means that the types remain distinct from "normal"
vector types even for TYPE_CANONICAL purposes, since:

     As a special case, if TYPE_CANONICAL is NULL_TREE, and thus
     TYPE_STRUCTURAL_EQUALITY_P is true, then it cannot
     be used for comparison against other types.  Instead, the type is
     said to require structural equality checks, described in
     TYPE_STRUCTURAL_EQUALITY_P.
     [...]
  #define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type_common.canonical)
  /* Indicates that the type node requires structural equality
     checks.  The compiler will need to look at the composition of the
     type to determine whether it is equal to another type, rather than
     just comparing canonical type pointers.  For instance, we would need
     to look at the return and parameter types of a FUNCTION_TYPE
     node.  */
  #define TYPE_STRUCTURAL_EQUALITY_P(NODE) (TYPE_CANONICAL (NODE) == NULL_TREE)

We also have:

/* Return true if get_alias_set cares about TYPE_CANONICAL of given type.
   We don't define the types for pointers, arrays and vectors.  The reason is
   that pointers are handled specially: ptr_type_node accesses conflict with
   accesses to all other pointers.  This is done by alias.c.
   Because alias sets of arrays and vectors are the same as types of their
   elements, we can't compute canonical type either.  Otherwise we could go
   from void *[10] to int *[10] (because they are equivalent for canonical type
   machinery) and get wrong TBAA.  */

inline bool
canonical_type_used_p (const_tree t)
{
  return !(POINTER_TYPE_P (t)
	   || TREE_CODE (t) == ARRAY_TYPE
	   || TREE_CODE (t) == VECTOR_TYPE);
}

So with the attribute added (needed anyway), the patch does seem to
work for LTO too.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Add a compatible_vector_types_p target hook
  2019-12-16 16:02                   ` Richard Sandiford
@ 2020-01-07 10:33                     ` Richard Sandiford
  2020-01-09 13:26                       ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2020-01-07 10:33 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Sandiford <richard.sandiford@arm.com> writes:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
>>>Richard Biener <richard.guenther@gmail.com> writes:
>>>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford
>>><richard.sandiford@arm.com> wrote:
>>>>>Richard Biener <richard.guenther@gmail.com> writes:
>>>>>>>>>The AArch64 port emits an error if calls pass values of SVE
>>>>>>>>>type to an unprototyped function.  To do that we need to know
>>>>>>>>>whether the value really is an SVE type rather than a plain
>>>>>>>>>vector.
>>>>>>>>>
>>>>>>>>>For varargs the ABI is the same for 256 bits+.  But we'll have
>>>>>>>>>the same problem there once we support -msve-vector-bits=128,
>>>>>>>>>since the layout of SVE and Advanced SIMD vectors differs for
>>>>>>>>>big-endian.
>>>>>>>>
>>>>>>>> But then why don't you have different modes?
>>>>>>>
>>>>>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>>>>>>difference.  But from a vector value POV, a vector of 4 ints is a
>>>>>>>vector
>>>>>>>of 4 ints, so even distinguishing based on the mode is artificial.
>>>>>>
>>>>>> True. 
>>>>>>
>>>>>>>SVE is AFAIK the first target to have different modes for
>>>>>>>potentially the "same" vector type, and I had to add new
>>>>>>>infrastructure to allow targets to define multiple modes of the
>>>>>>>same size.  So the fact that gimple distinguishes otherwise
>>>>>>>identical vectors based on mode is a relatively recent thing.
>>>>>>>AFAIK it just fell out in the wash rather than being deliberately
>>>>>>>planned.  It happens to be convenient in this context, but it
>>>>>>>hasn't been important until now.
>>>>>>>
>>>>>>>The hook doesn't seem any worse than distinguishing based on
>>>>>>>the mode.  Another way to avoid this would have been to define
>>>>>>>separate SVE modes for the predefined vectors.  The big downside
>>>>>>>of that is that we'd end up doubling the number of SVE patterns.
>>>>>>>
>>>>>>>Extra on-the-side metadata is going to be easy to drop
>>>>>>>accidentally, and this is something we need for correctness
>>>>>>>rather than optimisation.
>>>>>>
>>>>>> Still selecting the ABI during call expansion only and based on
>>>>>> value types at that point is fragile.
>>>>>
>>>>>Agreed.  But it's fragile in general, not just for this case.
>>>>>Changing something as fundamental as that would be a lot of work
>>>>>and seems likely to introduce accidental ABI breakage.
>>>>>
>>>>>> The frontends are in charge of specifying the actual argument type
>>>>>> and at that point the target may fix the ABI.  The ABI can be
>>>>>> recorded in the call's fntype, either via its TYPE_ARG_TYPES or in
>>>>>> more awkward ways for varargs functions (in full generality that
>>>>>> would mean attaching varargs ABI meta to each call).
>>>>>>
>>>>>> The alternative is to have an actual argument type vector
>>>>>> associated with each call.
>>>>>
>>>>>I think multiple pieces of gimple code would then have to cope with
>>>>>that
>>>>>as a special case.  E.g. if:
>>>>>
>>>>>   void foo (int, ...);
>>>>>
>>>>>   type1 a;
>>>>>   b = VIEW_CONVERT_EXPR<type2> (a);
>>>>>   if (a)
>>>>>     foo (1, a);
>>>>>   else
>>>>>     foo (1, b);
>>>>>
>>>>>gets converted to:
>>>>>
>>>>>   if (a)
>>>>>     foo (1, a);
>>>>>   else
>>>>>     foo (1, a);
>>>>>
>>>>>on the basis that type1 and type2 are "the same" despite having
>>>>>different calling conventions, we have to be sure that the calls
>>>>>are not treated as equivalent:
>>>>>
>>>>>   foo (1, a);
>>>>>
>>>>>Things like IPA clones would also need to handle this specially.
>>>>>Anything that generates new calls based on old ones will need
>>>>>to copy this information too.
>>>>>
>>>>>This also sounds like it would be fragile and seems a bit too
>>>>>invasive for stage 3.
>>>>
>>>> But we are already relying on this to work (fntype non-propagation)
>>>> because function pointer conversions are dropped on the floor.
>>>>
>>>> The real change would be introducing (per call) fntype for calls to
>>>> unprototyped functions and somehow dealing with varargs.
>>>
>>>It looks like this itself relies on useless_type_conversion_p,
>>>is that right?  E.g. we have things like:
>>>
>>>bool
>>>func_checker::compare_gimple_call (gcall *s1, gcall *s2)
>>>{
>>>  ...
>>>  tree fntype1 = gimple_call_fntype (s1);
>>>  tree fntype2 = gimple_call_fntype (s2);
>>>  if ((fntype1 && !fntype2)
>>>      || (!fntype1 && fntype2)
>>>      || (fntype1 && !types_compatible_p (fntype1, fntype2)))
>>>    return return_false_with_msg ("call function types are not compatible");
>>>
>>>and useless_type_conversion_p has:
>>>
>>>  else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
>>>	    || TREE_CODE (inner_type) == METHOD_TYPE)
>>>	   && TREE_CODE (inner_type) == TREE_CODE (outer_type))
>>>    {
>>>      tree outer_parm, inner_parm;
>>>
>>>      /* If the return types are not compatible bail out.  */
>>>      if (!useless_type_conversion_p (TREE_TYPE (outer_type),
>>>				      TREE_TYPE (inner_type)))
>>>	return false;
>>>
>>>      /* Method types should belong to a compatible base class.  */
>>>      if (TREE_CODE (inner_type) == METHOD_TYPE
>>>	  && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
>>>					 TYPE_METHOD_BASETYPE (inner_type)))
>>>	return false;
>>>
>>>      /* A conversion to an unprototyped argument list is ok.  */
>>>      if (!prototype_p (outer_type))
>>>	return true;
>>>
>>>     /* If the unqualified argument types are compatible the conversion
>>>	 is useless.  */
>>>      if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
>>>	return true;
>>>
>>>      for (outer_parm = TYPE_ARG_TYPES (outer_type),
>>>	   inner_parm = TYPE_ARG_TYPES (inner_type);
>>>	   outer_parm && inner_parm;
>>>	   outer_parm = TREE_CHAIN (outer_parm),
>>>	   inner_parm = TREE_CHAIN (inner_parm))
>>>	if (!useless_type_conversion_p
>>>	       (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
>>>		TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
>>>	  return false;
>>>
>>>So it looks like we'd still need to distinguish the vector types in
>>>useless_type_conversion_p even if we went the fntype route.  The
>>>difference is that the fntype route would give us the option of only
>>>distinguishing the vectors for return and argument types and not in
>>>general.
>>>
>>>But if we are going to have to distinguish the vectors here anyway
>>>in some form, could we go with the patch as-is for stage 3 and leave
>>>restricting this to just return and argument types as a follow-on
>>>optimisation?
>>
>> How does this get around the LTO canonical type merging machinery? That is, how are those types streamed and how are they identified by the backend? Just by means of being pointer equal to some statically built type in the backend? 
>> Or does the type have some attribute on it or on the component? How does the middle end build a related type with the same ABI, like a vector with half the number of elements?
>
> Hmm...
>
> At the moment it's based on pointer equality between the TYPE_MAIN_VARIANT
> and statically-built types.  We predefine the only available SVE "ABI types"
> and there's no way to create "new" ones.
>
> But you're right that that doesn't work for LTO -- in general, not just
> for this conversion patch -- because no streamed types end up as ABI types.
> So we'll need an attribute after all, with the ABI decisions keyed off that
> rather than TYPE_MAIN_VARIANT pointer equality.  Will fix...

Now fixed :-)
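
For concreteness, the shape of the fix is to attach a marker attribute
when the ACLE types are registered, roughly as below.  This is an
illustrative sketch rather than the committed change; "acle_vectype"
is a stand-in for wherever the types are created:

/* Illustrative sketch: tag each predefined SVE ABI type with a marker
   attribute at creation time, so that the target can later test
   lookup_attribute ("SVE type", TYPE_ATTRIBUTES (type)), even on
   types that were streamed in by LTO.  */
TYPE_ATTRIBUTES (acle_vectype)
  = tree_cons (get_identifier ("SVE type"), NULL_TREE,
	       TYPE_ATTRIBUTES (acle_vectype));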

> Once that's fixed, the fact that we use SET_TYPE_STRUCTURAL_EQUALITY
> for the ABI types means that the types remain distinct from "normal"
> vector types even for TYPE_CANONICAL purposes, since:
>
>      As a special case, if TYPE_CANONICAL is NULL_TREE, and thus
>      TYPE_STRUCTURAL_EQUALITY_P is true, then it cannot
>      be used for comparison against other types.  Instead, the type is
>      said to require structural equality checks, described in
>      TYPE_STRUCTURAL_EQUALITY_P.
>      [...]
>   #define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type_common.canonical)
>   /* Indicates that the type node requires structural equality
>      checks.  The compiler will need to look at the composition of the
>      type to determine whether it is equal to another type, rather than
>      just comparing canonical type pointers.  For instance, we would need
>      to look at the return and parameter types of a FUNCTION_TYPE
>      node.  */
>   #define TYPE_STRUCTURAL_EQUALITY_P(NODE) (TYPE_CANONICAL (NODE) == NULL_TREE)
>
> We also have:
>
> /* Return true if get_alias_set cares about TYPE_CANONICAL of given type.
>    We don't define the types for pointers, arrays and vectors.  The reason is
>    that pointers are handled specially: ptr_type_node accesses conflict with
>    accesses to all other pointers.  This is done by alias.c.
>    Because alias sets of arrays and vectors are the same as types of their
>    elements, we can't compute canonical type either.  Otherwise we could go
>    from void *[10] to int *[10] (because they are equivalent for canonical type
>    machinery) and get wrong TBAA.  */
>
> inline bool
> canonical_type_used_p (const_tree t)
> {
>   return !(POINTER_TYPE_P (t)
> 	   || TREE_CODE (t) == ARRAY_TYPE
> 	   || TREE_CODE (t) == VECTOR_TYPE);
> }
>
> So with the attribute added (needed anyway), the patch does seem to
> work for LTO too.

Given the above, is the patch OK?  I agree it isn't very elegant,
but at the moment we have no choice but to distinguish the vector
types at some point during gimple.

Thanks,
Richard


2020-01-07  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (compatible_vector_types_p): New target hook.
	* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
	* hooks.c (hook_bool_const_tree_const_tree_true): New function.
	* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
	* doc/tm.texi: Regenerate.
	* gimple-expr.c: Include target.h.
	(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
	* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
	function.
	(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
	* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
	Use the original predicate if it already has a suitable type.

gcc/testsuite/
	* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
	* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2020-01-06 12:57:55.753930730 +0000
+++ gcc/target.def	2020-01-07 10:24:01.546344751 +0000
@@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
  hook_bool_mode_false)
 
 DEFHOOK
+(compatible_vector_types_p,
+ "Return true if there is no target-specific reason for treating\n\
+vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
+has already checked for target-independent reasons, meaning that the\n\
+types are known to have the same mode, to have the same number of elements,\n\
+and to have what the caller considers to be compatible element types.\n\
+\n\
+The main reason for defining this hook is to reject pairs of types\n\
+that are handled differently by the target's calling convention.\n\
+For example, when a new @var{N}-bit vector architecture is added\n\
+to a target, the target may want to handle normal @var{N}-bit\n\
+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
+before, to maintain backwards compatibility.  However, it may also\n\
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
+and returned in a more efficient way.  It is then important to maintain\n\
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
+architecture-specific ones.\n\
+\n\
+The default implementation returns true, which is correct for most targets.",
+ bool, (const_tree type1, const_tree type2),
+ hook_bool_const_tree_const_tree_true)
+
+DEFHOOK
 (vector_alignment,
  "This hook can be used to define the alignment for a vector of type\n\
 @var{type}, in order to comply with a platform ABI.  The default is to\n\
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h	2020-01-06 12:57:54.749937335 +0000
+++ gcc/hooks.h	2020-01-07 10:24:01.542344777 +0000
@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
 extern bool hook_bool_const_tree_false (const_tree);
+extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
 extern bool hook_bool_tree_true (tree);
 extern bool hook_bool_const_tree_true (const_tree);
 extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c	2020-01-06 12:57:54.745937361 +0000
+++ gcc/hooks.c	2020-01-07 10:24:01.542344777 +0000
@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
 }
 
 bool
+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
+{
+  return true;
+}
+
+bool
 hook_bool_tree_true (tree)
 {
   return true;
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2020-01-06 12:57:53.657944518 +0000
+++ gcc/doc/tm.texi.in	2020-01-07 10:24:01.542344777 +0000
@@ -3365,6 +3365,8 @@ stack.
 
 @hook TARGET_VECTOR_MODE_SUPPORTED_P
 
+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
+
 @hook TARGET_ARRAY_MODE
 
 @hook TARGET_ARRAY_MODE_SUPPORTED_P
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2020-01-06 12:57:53.649944570 +0000
+++ gcc/doc/tm.texi	2020-01-07 10:24:01.542344777 +0000
@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
 must have move patterns for this mode.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
+Return true if there is no target-specific reason for treating
+vector types @var{type1} and @var{type2} as distinct types.  The caller
+has already checked for target-independent reasons, meaning that the
+types are known to have the same mode, to have the same number of elements,
+and to have what the caller considers to be compatible element types.
+
+The main reason for defining this hook is to reject pairs of types
+that are handled differently by the target's calling convention.
+For example, when a new @var{N}-bit vector architecture is added
+to a target, the target may want to handle normal @var{N}-bit
+@code{VECTOR_TYPE} arguments and return values in the same way as
+before, to maintain backwards compatibility.  However, it may also
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
+and returned in a more efficient way.  It is then important to maintain
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
+architecture-specific ones.
+
+The default implementation returns true, which is correct for most targets.
+@end deftypefn
+
 @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
 Return the mode that GCC should use for an array that has
 @var{nelems} elements, with each element having mode @var{mode}.
Index: gcc/gimple-expr.c
===================================================================
--- gcc/gimple-expr.c	2020-01-06 12:58:10.545833431 +0000
+++ gcc/gimple-expr.c	2020-01-07 10:24:01.542344777 +0000
@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "target.h"
 
 /* ----- Type related -----  */
 
@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
 
   /* Recurse for vector types with the same number of subparts.  */
   else if (TREE_CODE (inner_type) == VECTOR_TYPE
-	   && TREE_CODE (outer_type) == VECTOR_TYPE
-	   && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
-    return useless_type_conversion_p (TREE_TYPE (outer_type),
-				      TREE_TYPE (inner_type));
+	   && TREE_CODE (outer_type) == VECTOR_TYPE)
+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
+		      TYPE_VECTOR_SUBPARTS (outer_type))
+	    && useless_type_conversion_p (TREE_TYPE (outer_type),
+					  TREE_TYPE (inner_type))
+	    && targetm.compatible_vector_types_p (inner_type, outer_type));
 
   else if (TREE_CODE (inner_type) == ARRAY_TYPE
 	   && TREE_CODE (outer_type) == ARRAY_TYPE)
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2020-01-07 10:18:06.572651552 +0000
+++ gcc/config/aarch64/aarch64.c	2020-01-07 10:24:01.538344801 +0000
@@ -2098,6 +2098,15 @@ aarch64_fntype_abi (const_tree fntype)
   return default_function_abi;
 }
 
+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
+
+static bool
+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
+{
+  return (aarch64_sve::builtin_type_p (type1)
+	  == aarch64_sve::builtin_type_p (type2));
+}
+
 /* Return true if we should emit CFI for register REGNO.  */
 
 static bool
@@ -22099,6 +22108,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
 
+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
+#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
+
 #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   aarch64_builtin_support_vector_misalignment
Index: gcc/config/aarch64/aarch64-sve-builtins.cc
===================================================================
--- gcc/config/aarch64/aarch64-sve-builtins.cc	2020-01-07 10:21:17.575410530 +0000
+++ gcc/config/aarch64/aarch64-sve-builtins.cc	2020-01-07 10:24:01.534344828 +0000
@@ -2265,9 +2265,13 @@ tree
 gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
 			     unsigned int argno)
 {
-  tree predtype = truth_type_for (vectype);
   tree pred = gimple_call_arg (call, argno);
-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
+		TYPE_VECTOR_SUBPARTS (vectype)))
+    return pred;
+
+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
+		       truth_type_for (vectype), pred);
 }
 
 /* Return a pointer to the address in a contiguous load or store,
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c	2020-01-07 10:24:01.546344751 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (float16x16_t);
+void float32_callee (float32x8_t);
+void float64_callee (float64x4_t);
+void int8_callee (int8x32_t);
+void int16_callee (int16x16_t);
+void int32_callee (int32x8_t);
+void int64_callee (int64x4_t);
+void uint8_callee (uint8x32_t);
+void uint16_callee (uint16x16_t);
+void uint32_callee (uint32x8_t);
+void uint64_callee (uint64x4_t);
+
+void
+float16_caller (void)
+{
+  float16_callee (svdup_f16 (1.0));
+}
+
+void
+float32_caller (void)
+{
+  float32_callee (svdup_f32 (2.0));
+}
+
+void
+float64_caller (void)
+{
+  float64_callee (svdup_f64 (3.0));
+}
+
+void
+int8_caller (void)
+{
+  int8_callee (svindex_s8 (0, 1));
+}
+
+void
+int16_caller (void)
+{
+  int16_callee (svindex_s16 (0, 2));
+}
+
+void
+int32_caller (void)
+{
+  int32_callee (svindex_s32 (0, 3));
+}
+
+void
+int64_caller (void)
+{
+  int64_callee (svindex_s64 (0, 4));
+}
+
+void
+uint8_caller (void)
+{
+  uint8_callee (svindex_u8 (1, 1));
+}
+
+void
+uint16_caller (void)
+{
+  uint16_callee (svindex_u16 (1, 2));
+}
+
+void
+uint32_caller (void)
+{
+  uint32_callee (svindex_u32 (1, 3));
+}
+
+void
+uint64_caller (void)
+{
+  uint64_callee (svindex_u64 (1, 4));
+}
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c	2020-01-07 10:24:01.546344751 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (svfloat16_t);
+void float32_callee (svfloat32_t);
+void float64_callee (svfloat64_t);
+void int8_callee (svint8_t);
+void int16_callee (svint16_t);
+void int32_callee (svint32_t);
+void int64_callee (svint64_t);
+void uint8_callee (svuint8_t);
+void uint16_callee (svuint16_t);
+void uint32_callee (svuint32_t);
+void uint64_callee (svuint64_t);
+
+void
+float16_caller (float16x16_t arg)
+{
+  float16_callee (arg);
+}
+
+void
+float32_caller (float32x8_t arg)
+{
+  float32_callee (arg);
+}
+
+void
+float64_caller (float64x4_t arg)
+{
+  float64_callee (arg);
+}
+
+void
+int8_caller (int8x32_t arg)
+{
+  int8_callee (arg);
+}
+
+void
+int16_caller (int16x16_t arg)
+{
+  int16_callee (arg);
+}
+
+void
+int32_caller (int32x8_t arg)
+{
+  int32_callee (arg);
+}
+
+void
+int64_caller (int64x4_t arg)
+{
+  int64_callee (arg);
+}
+
+void
+uint8_caller (uint8x32_t arg)
+{
+  uint8_callee (arg);
+}
+
+void
+uint16_caller (uint16x16_t arg)
+{
+  uint16_callee (arg);
+}
+
+void
+uint32_caller (uint32x8_t arg)
+{
+  uint32_callee (arg);
+}
+
+void
+uint64_caller (uint64x4_t arg)
+{
+  uint64_callee (arg);
+}
+
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Add a compatible_vector_types_p target hook
  2020-01-07 10:33                     ` Richard Sandiford
@ 2020-01-09 13:26                       ` Richard Biener
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Biener @ 2020-01-09 13:26 UTC (permalink / raw)
  To: Richard Biener, GCC Patches, Richard Sandiford

On Tue, Jan 7, 2020 at 11:33 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Sandiford <richard.sandiford@arm.com> writes:
> > Richard Biener <richard.guenther@gmail.com> writes:
> >> On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford <richard.sandiford@arm.com> wrote:
> >>>Richard Biener <richard.guenther@gmail.com> writes:
> >>>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford
> >>><richard.sandiford@arm.com> wrote:
> >>>>>Richard Biener <richard.guenther@gmail.com> writes:
> >>>>>>>>>The AArch64 port emits an error if calls pass values of SVE
> >>>>>>>>>type to an unprototyped function.  To do that we need to know
> >>>>>>>>>whether the value really is an SVE type rather than a plain
> >>>>>>>>>vector.
> >>>>>>>>>
> >>>>>>>>>For varargs the ABI is the same for 256 bits+.  But we'll have
> >>>>>>>>>the same problem there once we support -msve-vector-bits=128,
> >>>>>>>>>since the layout of SVE and Advanced SIMD vectors differs for
> >>>>>>>>>big-endian.
> >>>>>>>>
> >>>>>>>> But then why don't you have different modes?
> >>>>>>>
> >>>>>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
> >>>>>>>difference.  But from a vector value POV, a vector of 4 ints is a
> >>>>>>>vector
> >>>>>>>of 4 ints, so even distinguishing based on the mode is artificial.
> >>>>>>
> >>>>>> True.
> >>>>>>
> >>>>>>>SVE is AFAIK the first target to have different modes for
> >>>>>>>potentially the "same" vector type, and I had to add new
> >>>>>>>infrastructure to allow targets to define multiple modes of the
> >>>>>>>same size.  So the fact that gimple distinguishes otherwise
> >>>>>>>identical vectors based on mode is a relatively recent thing.
> >>>>>>>AFAIK it just fell out in the wash rather than being deliberately
> >>>>>>>planned.  It happens to be convenient in this context, but it
> >>>>>>>hasn't been important until now.
> >>>>>>>
> >>>>>>>The hook doesn't seem any worse than distinguishing based on
> >>>>>>>the mode.  Another way to avoid this would have been to define
> >>>>>>>separate SVE modes for the predefined vectors.  The big downside
> >>>>>>>of that is that we'd end up doubling the number of SVE patterns.
> >>>>>>>
> >>>>>>>Extra on-the-side metadata is going to be easy to drop
> >>>>>>>accidentally, and this is something we need for correctness
> >>>>>>>rather than optimisation.
> >>>>>>
> >>>>>> Still selecting the ABI during call expansion only and based on
> >>>>>> value types at that point is fragile.
> >>>>>
> >>>>>Agreed.  But it's fragile in general, not just for this case.
> >>>>>Changing something as fundamental as that would be a lot of work
> >>>>>and seems likely to introduce accidental ABI breakage.
> >>>>>
> >>>>>> The frontends are in charge of specifying the actual argument type
> >>>>>> and at that point the target may fix the ABI.  The ABI can be
> >>>>>> recorded in the call's fntype, either via its TYPE_ARG_TYPES or in
> >>>>>> more awkward ways for varargs functions (in full generality that
> >>>>>> would mean attaching varargs ABI meta to each call).
> >>>>>>
> >>>>>> The alternative is to have an actual argument type vector
> >>>>>> associated with each call.
> >>>>>
> >>>>>I think multiple pieces of gimple code would then have to cope with
> >>>>>that
> >>>>>as a special case.  E.g. if:
> >>>>>
> >>>>>   void foo (int, ...);
> >>>>>
> >>>>>   type1 a;
> >>>>>   b = VIEW_CONVERT_EXPR<type2> (a);
> >>>>>   if (a)
> >>>>>     foo (1, a);
> >>>>>   else
> >>>>>     foo (1, b);
> >>>>>
> >>>>>gets converted to:
> >>>>>
> >>>>>   if (a)
> >>>>>     foo (1, a);
> >>>>>   else
> >>>>>     foo (1, a);
> >>>>>
> >>>>>on the basis that type1 and type2 are "the same" despite having
> >>>>>different calling conventions, we have to be sure that the calls
> >>>>>are not treated as equivalent:
> >>>>>
> >>>>>   foo (1, a);
> >>>>>
> >>>>>Things like IPA clones would also need to handle this specially.
> >>>>>Anything that generates new calls based on old ones will need
> >>>>>to copy this information too.
> >>>>>
> >>>>>This also sounds like it would be fragile and seems a bit too
> >>>>>invasive for stage 3.
> >>>>
> >>>> But we are already relying on this to work (fntype non-propagation)
> >>>> because function pointer conversions are dropped on the floor.
> >>>>
> >>>> The real change would be introducing (per call) fntype for calls to
> >>>> unprototyped functions and somehow dealing with varargs.
> >>>
> >>>It looks like this itself relies on useless_type_conversion_p,
> >>>is that right?  E.g. we have things like:
> >>>
> >>>bool
> >>>func_checker::compare_gimple_call (gcall *s1, gcall *s2)
> >>>{
> >>>  ...
> >>>  tree fntype1 = gimple_call_fntype (s1);
> >>>  tree fntype2 = gimple_call_fntype (s2);
> >>>  if ((fntype1 && !fntype2)
> >>>      || (!fntype1 && fntype2)
> >>>      || (fntype1 && !types_compatible_p (fntype1, fntype2)))
> >>>    return return_false_with_msg ("call function types are not compatible");
> >>>
> >>>and useless_type_conversion_p has:
> >>>
> >>>  else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
> >>>         || TREE_CODE (inner_type) == METHOD_TYPE)
> >>>        && TREE_CODE (inner_type) == TREE_CODE (outer_type))
> >>>    {
> >>>      tree outer_parm, inner_parm;
> >>>
> >>>      /* If the return types are not compatible bail out.  */
> >>>      if (!useless_type_conversion_p (TREE_TYPE (outer_type),
> >>>                                   TREE_TYPE (inner_type)))
> >>>     return false;
> >>>
> >>>      /* Method types should belong to a compatible base class.  */
> >>>      if (TREE_CODE (inner_type) == METHOD_TYPE
> >>>       && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
> >>>                                      TYPE_METHOD_BASETYPE (inner_type)))
> >>>     return false;
> >>>
> >>>      /* A conversion to an unprototyped argument list is ok.  */
> >>>      if (!prototype_p (outer_type))
> >>>     return true;
> >>>
> >>>     /* If the unqualified argument types are compatible the conversion
> >>>      is useless.  */
> >>>      if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
> >>>     return true;
> >>>
> >>>      for (outer_parm = TYPE_ARG_TYPES (outer_type),
> >>>        inner_parm = TYPE_ARG_TYPES (inner_type);
> >>>        outer_parm && inner_parm;
> >>>        outer_parm = TREE_CHAIN (outer_parm),
> >>>        inner_parm = TREE_CHAIN (inner_parm))
> >>>     if (!useless_type_conversion_p
> >>>            (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
> >>>             TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
> >>>       return false;
> >>>
> >>>So it looks like we'd still need to distinguish the vector types in
> >>>useless_type_conversion_p even if we went the fntype route.  The
> >>>difference is that the fntype route would give us the option of only
> >>>distinguishing the vectors for return and argument types and not in
> >>>general.
> >>>
> >>>But if we are going to have to distinguish the vectors here anyway
> >>>in some form, could we go with the patch as-is for stage 3 and leave
> >>>restricting this to just return and argument types as a follow-on
> >>>optimisation?
> >>
> >> How does this get around the LTO canonical type merging machinery? That is, how are those types streamed and how are they identified by the backend? Just by means of being pointer equal to some statically built type in the backend?
> >> Or does the type have some attribute on it or on the component? How does the middle end build a related type with the same ABI, like a vector with half the number of elements?
> >
> > Hmm...
> >
> > At the moment it's based on pointer equality between the TYPE_MAIN_VARIANT
> > and statically-built types.  We predefine the only available SVE "ABI types"
> > and there's no way to create "new" ones.
> >
> > But you're right that that doesn't work for LTO -- in general, not just
> > for this conversion patch -- because no streamed types end up as ABI types.
> > So we'll need an attribute after all, with the ABI decisions keyed off that
> > rather than TYPE_MAIN_VARIANT pointer equality.  Will fix...
>
> Now fixed :-)
>
> > Once that's fixed, the fact that we use SET_TYPE_STRUCTURAL_EQUALITY
> > for the ABI types means that the types remain distinct from "normal"
> > vector types even for TYPE_CANONICAL purposes, since:
> >
> >      As a special case, if TYPE_CANONICAL is NULL_TREE, and thus
> >      TYPE_STRUCTURAL_EQUALITY_P is true, then it cannot
> >      be used for comparison against other types.  Instead, the type is
> >      said to require structural equality checks, described in
> >      TYPE_STRUCTURAL_EQUALITY_P.
> >      [...]
> >   #define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type_common.canonical)
> >   /* Indicates that the type node requires structural equality
> >      checks.  The compiler will need to look at the composition of the
> >      type to determine whether it is equal to another type, rather than
> >      just comparing canonical type pointers.  For instance, we would need
> >      to look at the return and parameter types of a FUNCTION_TYPE
> >      node.  */
> >   #define TYPE_STRUCTURAL_EQUALITY_P(NODE) (TYPE_CANONICAL (NODE) == NULL_TREE)
> >
> > We also have:
> >
> > /* Return true if get_alias_set cares about TYPE_CANONICAL of given type.
> >    We don't define the types for pointers, arrays and vectors.  The reason is
> >    that pointers are handled specially: ptr_type_node accesses conflict with
> >    accesses to all other pointers.  This is done by alias.c.
> >    Because alias sets of arrays and vectors are the same as types of their
> >    elements, we can't compute canonical type either.  Otherwise we could go
> >    from void *[10] to int *[10] (because they are equivalent for canonical type
> >    machinery) and get wrong TBAA.  */
> >
> > inline bool
> > canonical_type_used_p (const_tree t)
> > {
> >   return !(POINTER_TYPE_P (t)
> >          || TREE_CODE (t) == ARRAY_TYPE
> >          || TREE_CODE (t) == VECTOR_TYPE);
> > }
> >
> > So with the attribute added (needed anyway), the patch does seem to
> > work for LTO too.
>
> Given the above, is the patch OK?  I agree it isn't very elegant,
> but at the moment we have no choice but to distinguish the vector
> types at some point during gimple.

OK.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> 2020-01-07  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * target.def (compatible_vector_types_p): New target hook.
>         * hooks.h (hook_bool_const_tree_const_tree_true): Declare.
>         * hooks.c (hook_bool_const_tree_const_tree_true): New function.
>         * doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
>         * doc/tm.texi: Regenerate.
>         * gimple-expr.c: Include target.h.
>         (useless_type_conversion_p): Use targetm.compatible_vector_types_p.
>         * config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
>         function.
>         (TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
>         * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
>         Use the original predicate if it already has a suitable type.
>
> gcc/testsuite/
>         * gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
>         * gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.
>
> Index: gcc/target.def
> ===================================================================
> --- gcc/target.def      2020-01-06 12:57:55.753930730 +0000
> +++ gcc/target.def      2020-01-07 10:24:01.546344751 +0000
> @@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
>   hook_bool_mode_false)
>
>  DEFHOOK
> +(compatible_vector_types_p,
> + "Return true if there is no target-specific reason for treating\n\
> +vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
> +has already checked for target-independent reasons, meaning that the\n\
> +types are known to have the same mode, to have the same number of elements,\n\
> +and to have what the caller considers to be compatible element types.\n\
> +\n\
> +The main reason for defining this hook is to reject pairs of types\n\
> +that are handled differently by the target's calling convention.\n\
> +For example, when a new @var{N}-bit vector architecture is added\n\
> +to a target, the target may want to handle normal @var{N}-bit\n\
> +@code{VECTOR_TYPE} arguments and return values in the same way as\n\
> +before, to maintain backwards compatibility.  However, it may also\n\
> +provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
> +and returned in a more efficient way.  It is then important to maintain\n\
> +a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
> +architecture-specific ones.\n\
> +\n\
> +The default implementation returns true, which is correct for most targets.",
> + bool, (const_tree type1, const_tree type2),
> + hook_bool_const_tree_const_tree_true)
> +
> +DEFHOOK
>  (vector_alignment,
>   "This hook can be used to define the alignment for a vector of type\n\
>  @var{type}, in order to comply with a platform ABI.  The default is to\n\
> Index: gcc/hooks.h
> ===================================================================
> --- gcc/hooks.h 2020-01-06 12:57:54.749937335 +0000
> +++ gcc/hooks.h 2020-01-07 10:24:01.542344777 +0000
> @@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
>  extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
>  extern bool hook_bool_tree_false (tree);
>  extern bool hook_bool_const_tree_false (const_tree);
> +extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
>  extern bool hook_bool_tree_true (tree);
>  extern bool hook_bool_const_tree_true (const_tree);
>  extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
> Index: gcc/hooks.c
> ===================================================================
> --- gcc/hooks.c 2020-01-06 12:57:54.745937361 +0000
> +++ gcc/hooks.c 2020-01-07 10:24:01.542344777 +0000
> @@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
>  }
>
>  bool
> +hook_bool_const_tree_const_tree_true (const_tree, const_tree)
> +{
> +  return true;
> +}
> +
> +bool
>  hook_bool_tree_true (tree)
>  {
>    return true;
> Index: gcc/doc/tm.texi.in
> ===================================================================
> --- gcc/doc/tm.texi.in  2020-01-06 12:57:53.657944518 +0000
> +++ gcc/doc/tm.texi.in  2020-01-07 10:24:01.542344777 +0000
> @@ -3365,6 +3365,8 @@ stack.
>
>  @hook TARGET_VECTOR_MODE_SUPPORTED_P
>
> +@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
> +
>  @hook TARGET_ARRAY_MODE
>
>  @hook TARGET_ARRAY_MODE_SUPPORTED_P
> Index: gcc/doc/tm.texi
> ===================================================================
> --- gcc/doc/tm.texi     2020-01-06 12:57:53.649944570 +0000
> +++ gcc/doc/tm.texi     2020-01-07 10:24:01.542344777 +0000
> @@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
>  must have move patterns for this mode.
>  @end deftypefn
>
> +@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
> +Return true if there is no target-specific reason for treating
> +vector types @var{type1} and @var{type2} as distinct types.  The caller
> +has already checked for target-independent reasons, meaning that the
> +types are known to have the same mode, to have the same number of elements,
> +and to have what the caller considers to be compatible element types.
> +
> +The main reason for defining this hook is to reject pairs of types
> +that are handled differently by the target's calling convention.
> +For example, when a new @var{N}-bit vector architecture is added
> +to a target, the target may want to handle normal @var{N}-bit
> +@code{VECTOR_TYPE} arguments and return values in the same way as
> +before, to maintain backwards compatibility.  However, it may also
> +provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
> +and returned in a more efficient way.  It is then important to maintain
> +a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
> +architecture-specific ones.
> +
> +The default implementation returns true, which is correct for most targets.
> +@end deftypefn
> +
>  @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
>  Return the mode that GCC should use for an array that has
>  @var{nelems} elements, with each element having mode @var{mode}.
> Index: gcc/gimple-expr.c
> ===================================================================
> --- gcc/gimple-expr.c   2020-01-06 12:58:10.545833431 +0000
> +++ gcc/gimple-expr.c   2020-01-07 10:24:01.542344777 +0000
> @@ -37,6 +37,7 @@ Software Foundation; either version 3, o
>  #include "tree-pass.h"
>  #include "stringpool.h"
>  #include "attribs.h"
> +#include "target.h"
>
>  /* ----- Type related -----  */
>
> @@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
>
>    /* Recurse for vector types with the same number of subparts.  */
>    else if (TREE_CODE (inner_type) == VECTOR_TYPE
> -          && TREE_CODE (outer_type) == VECTOR_TYPE
> -          && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
> -    return useless_type_conversion_p (TREE_TYPE (outer_type),
> -                                     TREE_TYPE (inner_type));
> +          && TREE_CODE (outer_type) == VECTOR_TYPE)
> +    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
> +                     TYPE_VECTOR_SUBPARTS (outer_type))
> +           && useless_type_conversion_p (TREE_TYPE (outer_type),
> +                                         TREE_TYPE (inner_type))
> +           && targetm.compatible_vector_types_p (inner_type, outer_type));
>
>    else if (TREE_CODE (inner_type) == ARRAY_TYPE
>            && TREE_CODE (outer_type) == ARRAY_TYPE)
> Index: gcc/config/aarch64/aarch64.c
> ===================================================================
> --- gcc/config/aarch64/aarch64.c        2020-01-07 10:18:06.572651552 +0000
> +++ gcc/config/aarch64/aarch64.c        2020-01-07 10:24:01.538344801 +0000
> @@ -2098,6 +2098,15 @@ aarch64_fntype_abi (const_tree fntype)
>    return default_function_abi;
>  }
>
> +/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
> +
> +static bool
> +aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
> +{
> +  return (aarch64_sve::builtin_type_p (type1)
> +         == aarch64_sve::builtin_type_p (type2));
> +}
> +
>  /* Return true if we should emit CFI for register REGNO.  */
>
>  static bool
> @@ -22099,6 +22108,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
>  #undef TARGET_VECTOR_MODE_SUPPORTED_P
>  #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
>
> +#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
> +#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
> +
>  #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
>  #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
>    aarch64_builtin_support_vector_misalignment
> Index: gcc/config/aarch64/aarch64-sve-builtins.cc
> ===================================================================
> --- gcc/config/aarch64/aarch64-sve-builtins.cc  2020-01-07 10:21:17.575410530 +0000
> +++ gcc/config/aarch64/aarch64-sve-builtins.cc  2020-01-07 10:24:01.534344828 +0000
> @@ -2265,9 +2265,13 @@ tree
>  gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
>                              unsigned int argno)
>  {
> -  tree predtype = truth_type_for (vectype);
>    tree pred = gimple_call_arg (call, argno);
> -  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
> +  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
> +               TYPE_VECTOR_SUBPARTS (vectype)))
> +    return pred;
> +
> +  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
> +                      truth_type_for (vectype), pred);
>  }
>
>  /* Return a pointer to the address in a contiguous load or store,
> Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
> ===================================================================
> --- /dev/null   2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c    2020-01-07 10:24:01.546344751 +0000
> @@ -0,0 +1,99 @@
> +/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
> +
> +#include <arm_sve.h>
> +
> +typedef float16_t float16x16_t __attribute__((vector_size (32)));
> +typedef float32_t float32x8_t __attribute__((vector_size (32)));
> +typedef float64_t float64x4_t __attribute__((vector_size (32)));
> +typedef int8_t int8x32_t __attribute__((vector_size (32)));
> +typedef int16_t int16x16_t __attribute__((vector_size (32)));
> +typedef int32_t int32x8_t __attribute__((vector_size (32)));
> +typedef int64_t int64x4_t __attribute__((vector_size (32)));
> +typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
> +typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
> +typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
> +typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
> +
> +void float16_callee (float16x16_t);
> +void float32_callee (float32x8_t);
> +void float64_callee (float64x4_t);
> +void int8_callee (int8x32_t);
> +void int16_callee (int16x16_t);
> +void int32_callee (int32x8_t);
> +void int64_callee (int64x4_t);
> +void uint8_callee (uint8x32_t);
> +void uint16_callee (uint16x16_t);
> +void uint32_callee (uint32x8_t);
> +void uint64_callee (uint64x4_t);
> +
> +void
> +float16_caller (void)
> +{
> +  float16_callee (svdup_f16 (1.0));
> +}
> +
> +void
> +float32_caller (void)
> +{
> +  float32_callee (svdup_f32 (2.0));
> +}
> +
> +void
> +float64_caller (void)
> +{
> +  float64_callee (svdup_f64 (3.0));
> +}
> +
> +void
> +int8_caller (void)
> +{
> +  int8_callee (svindex_s8 (0, 1));
> +}
> +
> +void
> +int16_caller (void)
> +{
> +  int16_callee (svindex_s16 (0, 2));
> +}
> +
> +void
> +int32_caller (void)
> +{
> +  int32_callee (svindex_s32 (0, 3));
> +}
> +
> +void
> +int64_caller (void)
> +{
> +  int64_callee (svindex_s64 (0, 4));
> +}
> +
> +void
> +uint8_caller (void)
> +{
> +  uint8_callee (svindex_u8 (1, 1));
> +}
> +
> +void
> +uint16_caller (void)
> +{
> +  uint16_callee (svindex_u16 (1, 2));
> +}
> +
> +void
> +uint32_caller (void)
> +{
> +  uint32_callee (svindex_u32 (1, 3));
> +}
> +
> +void
> +uint64_caller (void)
> +{
> +  uint64_callee (svindex_u64 (1, 4));
> +}
> +
> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
> ===================================================================
> --- /dev/null   2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c    2020-01-07 10:24:01.546344751 +0000
> @@ -0,0 +1,99 @@
> +/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
> +
> +#include <arm_sve.h>
> +
> +typedef float16_t float16x16_t __attribute__((vector_size (32)));
> +typedef float32_t float32x8_t __attribute__((vector_size (32)));
> +typedef float64_t float64x4_t __attribute__((vector_size (32)));
> +typedef int8_t int8x32_t __attribute__((vector_size (32)));
> +typedef int16_t int16x16_t __attribute__((vector_size (32)));
> +typedef int32_t int32x8_t __attribute__((vector_size (32)));
> +typedef int64_t int64x4_t __attribute__((vector_size (32)));
> +typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
> +typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
> +typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
> +typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
> +
> +void float16_callee (svfloat16_t);
> +void float32_callee (svfloat32_t);
> +void float64_callee (svfloat64_t);
> +void int8_callee (svint8_t);
> +void int16_callee (svint16_t);
> +void int32_callee (svint32_t);
> +void int64_callee (svint64_t);
> +void uint8_callee (svuint8_t);
> +void uint16_callee (svuint16_t);
> +void uint32_callee (svuint32_t);
> +void uint64_callee (svuint64_t);
> +
> +void
> +float16_caller (float16x16_t arg)
> +{
> +  float16_callee (arg);
> +}
> +
> +void
> +float32_caller (float32x8_t arg)
> +{
> +  float32_callee (arg);
> +}
> +
> +void
> +float64_caller (float64x4_t arg)
> +{
> +  float64_callee (arg);
> +}
> +
> +void
> +int8_caller (int8x32_t arg)
> +{
> +  int8_callee (arg);
> +}
> +
> +void
> +int16_caller (int16x16_t arg)
> +{
> +  int16_callee (arg);
> +}
> +
> +void
> +int32_caller (int32x8_t arg)
> +{
> +  int32_callee (arg);
> +}
> +
> +void
> +int64_caller (int64x4_t arg)
> +{
> +  int64_callee (arg);
> +}
> +
> +void
> +uint8_caller (uint8x32_t arg)
> +{
> +  uint8_callee (arg);
> +}
> +
> +void
> +uint16_caller (uint16x16_t arg)
> +{
> +  uint16_callee (arg);
> +}
> +
> +void
> +uint32_caller (uint32x8_t arg)
> +{
> +  uint32_callee (arg);
> +}
> +
> +void
> +uint64_caller (uint64x4_t arg)
> +{
> +  uint64_callee (arg);
> +}
> +
> +/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
> +/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-01-09 13:21 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-12 15:10 Add a compatible_vector_types_p target hook Richard Sandiford
2019-12-12 16:04 ` Richard Biener
2019-12-12 16:44   ` Richard Sandiford
2019-12-12 17:20     ` Richard Biener
2019-12-12 18:16       ` Richard Sandiford
2019-12-13  8:41         ` Richard Biener
2019-12-13  9:12           ` Richard Sandiford
2019-12-13 12:25             ` Richard Biener
2019-12-13 13:10               ` Richard Sandiford
2019-12-14 11:13               ` Richard Sandiford
2019-12-14 14:34                 ` Richard Biener
2019-12-16 16:02                   ` Richard Sandiford
2020-01-07 10:33                     ` Richard Sandiford
2020-01-09 13:26                       ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).