From: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: "Andre Vieira (lists) via Gcc-patches" <gcc-patches@gcc.gnu.org>,
richard.sandiford@arm.com
Subject: Re: [AArch64] Enable generation of FRINTNZ instructions
Date: Mon, 10 Jan 2022 14:09:04 +0000 [thread overview]
Message-ID: <5d7bb7af-b09e-cb91-b457-c6148f65028e@arm.com> (raw)
In-Reply-To: <231396s0-2756-q51s-q55-o8npqo91on32@fhfr.qr>
[-- Attachment #1: Type: text/plain, Size: 5328 bytes --]
Yeah, it seems I forgot to send the latest version; my bad.
Bootstrapped on aarch64-none-linux.
OK for trunk?
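For context, a minimal sketch of the scalar round-trips the patch targets (this mirrors the f1/f4 testcases below; the function names are mine, and this is the unfused source form, not the generated code):

```cpp
#include <cassert>

// What the new ftrunc<mode><frintnz_mode>2 pattern computes for the
// SF/SI case: truncate towards zero through an intermediate 32-bit
// signed integer.  With the patch, GCC can emit a single frint32z on
// Armv8.5-A for this instead of an fcvtzs/scvtf pair.
float
trunc_via_int32 (float x)
{
  int y = x;            // truncation towards zero
  return (float) y;
}

// The 64-bit variant maps to frint64z in the same way.
double
trunc_via_int64 (double x)
{
  long long y = x;
  return (double) y;
}
```

The unsigned and narrow-type variants (the `*_dont` functions in the test) must keep the conversion pair, since the fold only applies when the intermediate integer type is signed 32-bit or 64-bit.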
gcc/ChangeLog:

	* config/aarch64/aarch64.md (ftrunc<mode><frintnz_mode>2): New
	pattern.
	* config/aarch64/iterators.md (FRINTNZ): New iterator.
	(frintnz_mode): New int attribute.
	(VSFDF): Make iterator conditional.
	* internal-fn.def (FTRUNC_INT): New IFN.
	* internal-fn.c (ftrunc_int_direct): New define.
	(expand_ftrunc_int_optab_fn): New custom expander.
	(direct_ftrunc_int_optab_supported_p): New supported_p.
	* match.pd: Add to the existing TRUNC pattern match.
	* optabs.def (ftrunc_int): New entry.
	* stor-layout.h (element_precision): Moved from here...
	* tree.h (element_precision): ... to here.
	(element_type): New declaration.
	* tree.c (element_type): New function.
	(element_precision): Changed to use element_type.
	* tree-vect-stmts.c (vectorizable_internal_function): Add
	support for IFNs with different input types.
	(vectorizable_call): Teach to handle IFN_FTRUNC_INT.
	* doc/md.texi: New entry for ftrunc pattern name.
	* doc/sourcebuild.texi (aarch64_frintnzx_ok): New target.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/merge_trunc1.c: Adapted to skip if frintNz
	instructions are available.
	* lib/target-supports.exp: Added aarch64_frintnzx_ok target.
	* gcc.target/aarch64/frintnz.c: New test.
	* gcc.target/aarch64/frintnz_vec.c: New test.
On 03/01/2022 12:18, Richard Biener wrote:
> On Wed, 29 Dec 2021, Andre Vieira (lists) wrote:
>
>> Hi Richard,
>>
>> Thank you for the review; I've adopted all of the above suggestions
>> downstream. I'm still surprised how many style issues I still miss after
>> years of gcc development :(
>>
>> On 17/12/2021 12:44, Richard Sandiford wrote:
>>>> @@ -3252,16 +3257,31 @@ vectorizable_call (vec_info *vinfo,
>>>> rhs_type = unsigned_type_node;
>>>> }
>>>> - int mask_opno = -1;
>>>> + /* The argument that is not of the same type as the others. */
>>>> + int diff_opno = -1;
>>>> + bool masked = false;
>>>> if (internal_fn_p (cfn))
>>>> - mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
>>>> + {
>>>> + if (cfn == CFN_FTRUNC_INT)
>>>> + /* For FTRUNC this represents the argument that carries the type of
>>>> the
>>>> + intermediate signed integer. */
>>>> + diff_opno = 1;
>>>> + else
>>>> + {
>>>> + /* For masked operations this represents the argument that carries
>>>> the
>>>> + mask. */
>>>> + diff_opno = internal_fn_mask_index (as_internal_fn (cfn));
>>>> + masked = diff_opno >= 0;
>>>> + }
>>>> + }
>>> I think it would be cleaner not to process argument 1 at all for
>>> CFN_FTRUNC_INT. There's no particular need to vectorise it.
>> I agree with this, will change the loop to continue for argument 1 when
>> dealing with non-masked CFN's.
>>
>>>> }
>>>> […]
>>>> diff --git a/gcc/tree.c b/gcc/tree.c
>>>> index
>>>> 845228a055b2cfac0c9ca8c0cda1b9df4b0095c6..f1e9a1eb48769cb11aa69730e2480ed5522f78c1
>>>> 100644
>>>> --- a/gcc/tree.c
>>>> +++ b/gcc/tree.c
>>>> @@ -6645,11 +6645,11 @@ valid_constant_size_p (const_tree size,
>>>> cst_size_error *perr /* = NULL */)
>>>> return true;
>>>> }
>>>>
>>>> -/* Return the precision of the type, or for a complex or vector type the
>>>> - precision of the type of its elements. */
>>>> +/* Return the type, or for a complex or vector type the type of its
>>>> + elements. */
>>>> -unsigned int
>>>> -element_precision (const_tree type)
>>>> +tree
>>>> +element_type (const_tree type)
>>>> {
>>>> if (!TYPE_P (type))
>>>> type = TREE_TYPE (type);
>>>> @@ -6657,7 +6657,16 @@ element_precision (const_tree type)
>>>> if (code == COMPLEX_TYPE || code == VECTOR_TYPE)
>>>> type = TREE_TYPE (type);
>>>> - return TYPE_PRECISION (type);
>>>> + return (tree) type;
>>> I think we should stick a const_cast in element_precision and make
>>> element_type take a plain “type”. As it stands element_type is an
>>> implicit const_cast for many cases.
>>>
>>> Thanks,
>> I was just curious about something here: I thought the purpose of having
>> element_precision (before) and element_type (now) take a const_tree
>> argument was to make it clear that we aren't changing the input type. I
>> understand that, as it stands, element_type can act as an implicit
>> const_cast (and I should be using const_cast rather than the '(tree)'
>> cast), but only when 'type' is not a complex/vector type. Either way, we
>> keep the promise that we aren't changing the incoming type; what the
>> caller then does with the result is up to them, no?
>>
>> I don't mind making the changes, just trying to understand the reasoning
>> behind it.
>>
>> I'll send in a new patch with all the changes after the review on the match.pd
>> stuff.
> I'm missing an updated patch after my initial review of the match.pd
> stuff, so I'm not sure what to review. Can you re-post an updated patch?
>
> Thanks,
> Richard.
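As an aside, the const-correctness point in the exchange above can be reduced to a small standalone sketch ('tree' modelled as a bare pointer typedef; element_type_v1/element_type_v2 are my names for the two shapes under discussion, not GCC code):

```cpp
#include <cassert>

// Reduced model of the tree API: precision as a plain field.
struct tree_node { unsigned precision; };
typedef tree_node *tree;
typedef const tree_node *const_tree;

// Shape in the posted patch: takes const_tree but returns a mutable
// tree, so every call is an implicit const_cast for the caller.
tree
element_type_v1 (const_tree type)
{
  return const_cast<tree> (type);
}

// Shape Richard suggests: element_type takes a plain tree, and the one
// explicit const_cast is confined to element_precision, which only
// reads the result.
tree
element_type_v2 (tree type)
{
  return type;
}

unsigned
element_precision (const_tree type)
{
  return element_type_v2 (const_cast<tree> (type))->precision;
}
```

The difference is where the const is laundered: in v1 every const_tree caller silently gets back a mutable pointer, while in v2 the cast is localised to a function that provably does not mutate.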
[-- Attachment #2: frintnz4.patch --]
[-- Type: text/plain, Size: 23433 bytes --]
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 3c72bdad01bfab49ee4ae6fb7b139202e89c8d34..9d04a2e088cd7d03963b58ed3708c339b841801c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -7423,12 +7423,18 @@ (define_insn "despeculate_simpleti"
(set_attr "speculation_barrier" "true")]
)
+(define_expand "ftrunc<mode><frintnz_mode>2"
+ [(set (match_operand:VSFDF 0 "register_operand" "=w")
+ (unspec:VSFDF [(match_operand:VSFDF 1 "register_operand" "w")]
+ FRINTNZ))]
+ "TARGET_FRINT"
+)
+
(define_insn "aarch64_<frintnzs_op><mode>"
[(set (match_operand:VSFDF 0 "register_operand" "=w")
(unspec:VSFDF [(match_operand:VSFDF 1 "register_operand" "w")]
FRINTNZX))]
- "TARGET_FRINT && TARGET_FLOAT
- && !(VECTOR_MODE_P (<MODE>mode) && !TARGET_SIMD)"
+ "TARGET_FRINT"
"<frintnzs_op>\\t%<v>0<Vmtype>, %<v>1<Vmtype>"
[(set_attr "type" "f_rint<stype>")]
)
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 9160ce3e69c2c6b1b75e46f7aabd27e7949a269a..7962b15a87db2f1ede3836efbb827b8fb95266da 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -163,7 +163,11 @@ (define_mode_iterator VHSDF_HSDF [(V4HF "TARGET_SIMD_F16INST")
SF DF])
;; Scalar and vetor modes for SF, DF.
-(define_mode_iterator VSFDF [V2SF V4SF V2DF DF SF])
+(define_mode_iterator VSFDF [(V2SF "TARGET_SIMD")
+ (V4SF "TARGET_SIMD")
+ (V2DF "TARGET_SIMD")
+ (DF "TARGET_FLOAT")
+ (SF "TARGET_FLOAT")])
;; Advanced SIMD single Float modes.
(define_mode_iterator VDQSF [V2SF V4SF])
@@ -3078,6 +3082,8 @@ (define_int_iterator FCMLA [UNSPEC_FCMLA
(define_int_iterator FRINTNZX [UNSPEC_FRINT32Z UNSPEC_FRINT32X
UNSPEC_FRINT64Z UNSPEC_FRINT64X])
+(define_int_iterator FRINTNZ [UNSPEC_FRINT32Z UNSPEC_FRINT64Z])
+
(define_int_iterator SVE_BRK_UNARY [UNSPEC_BRKA UNSPEC_BRKB])
(define_int_iterator SVE_BRK_BINARY [UNSPEC_BRKN UNSPEC_BRKPA UNSPEC_BRKPB])
@@ -3485,6 +3491,8 @@ (define_int_attr f16mac1 [(UNSPEC_FMLAL "a") (UNSPEC_FMLSL "s")
(define_int_attr frintnzs_op [(UNSPEC_FRINT32Z "frint32z") (UNSPEC_FRINT32X "frint32x")
(UNSPEC_FRINT64Z "frint64z") (UNSPEC_FRINT64X "frint64x")])
+(define_int_attr frintnz_mode [(UNSPEC_FRINT32Z "si") (UNSPEC_FRINT64Z "di")])
+
;; The condition associated with an UNSPEC_COND_<xx>.
(define_int_attr cmp_op [(UNSPEC_COND_CMPEQ_WIDE "eq")
(UNSPEC_COND_CMPGE_WIDE "ge")
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 19e89ae502bc2f51db64667b236c1cb669718b02..3b0e4e0875b4392ab6833568b207580ef597a98f 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6191,6 +6191,15 @@ operands; otherwise, it may not.
This pattern is not allowed to @code{FAIL}.
+@cindex @code{ftrunc@var{m}@var{n}2} instruction pattern
+@item @samp{ftrunc@var{m}@var{n}2}
+Truncate operand 1 towards zero to an @var{n}-mode signed integer and store
+the result in operand 0.  Both operands have mode @var{m}, which is a scalar
+or vector floating-point mode.  An exception must be raised if operand 1 does
+not fit in an @var{n}-mode signed integer, just as it would be if the
+truncation happened through a separate floating-point to integer conversion.
+
+
@cindex @code{round@var{m}2} instruction pattern
@item @samp{round@var{m}2}
Round operand 1 to the nearest integer, rounding away from zero in the
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 6095a35cd4565fdb7d758104e80fe6411230f758..a56bbb775572fa72379854f90a01ad543557e29a 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2286,6 +2286,10 @@ Like @code{aarch64_sve_hw}, but also test for an exact hardware vector length.
@item aarch64_fjcvtzs_hw
AArch64 target that is able to generate and execute armv8.3-a FJCVTZS
instruction.
+
+@item aarch64_frintnzx_ok
+AArch64 target that is able to generate the Armv8.5-a FRINT32Z, FRINT64Z,
+FRINT32X and FRINT64X instructions.
@end table
@subsubsection MIPS-specific attributes
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index b24102a5990bea4cbb102069f7a6df497fc81ebf..9047b486f41948059a7a7f1ccc4032410a369139 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -130,6 +130,7 @@ init_internal_fns ()
#define fold_left_direct { 1, 1, false }
#define mask_fold_left_direct { 1, 1, false }
#define check_ptrs_direct { 0, 0, false }
+#define ftrunc_int_direct { 0, 1, true }
const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) not_direct,
@@ -156,6 +157,29 @@ get_multi_vector_move (tree array_type, convert_optab optab)
return convert_optab_handler (optab, imode, vmode);
}
+/* Expand FTRUNC_INT call STMT using optab OPTAB. */
+
+static void
+expand_ftrunc_int_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+ class expand_operand ops[2];
+ tree lhs, float_type, int_type;
+ rtx target, op;
+
+ lhs = gimple_call_lhs (stmt);
+ target = expand_normal (lhs);
+ op = expand_normal (gimple_call_arg (stmt, 0));
+
+ float_type = TREE_TYPE (lhs);
+ int_type = element_type (gimple_call_arg (stmt, 1));
+
+ create_output_operand (&ops[0], target, TYPE_MODE (float_type));
+ create_input_operand (&ops[1], op, TYPE_MODE (float_type));
+
+ expand_insn (convert_optab_handler (optab, TYPE_MODE (float_type),
+ TYPE_MODE (int_type)), 2, ops);
+}
+
/* Expand LOAD_LANES call STMT using optab OPTAB. */
static void
@@ -3747,6 +3771,15 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
!= CODE_FOR_nothing);
}
+static bool
+direct_ftrunc_int_optab_supported_p (convert_optab optab, tree_pair types,
+ optimization_type opt_type)
+{
+ return (convert_optab_handler (optab, TYPE_MODE (types.first),
+ TYPE_MODE (element_type (types.second)),
+ opt_type) != CODE_FOR_nothing);
+}
+
#define direct_unary_optab_supported_p direct_optab_supported_p
#define direct_binary_optab_supported_p direct_optab_supported_p
#define direct_ternary_optab_supported_p direct_optab_supported_p
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 8891071a6a360961643731094379b607f317af17..a0fd75829e942f529c879c669e58c098b62b26ba 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -66,6 +66,9 @@ along with GCC; see the file COPYING3. If not see
- fold_left: for scalar = FN (scalar, vector), keyed off the vector mode
- check_ptrs: used for check_{raw,war}_ptrs
+ - ftrunc_int: a unary conversion optab that takes and returns values of the
+ same mode, but internally converts via another mode. This second mode is
+ specified using a dummy final function argument.
DEF_INTERNAL_SIGNED_OPTAB_FN defines an internal function that
maps to one of two optabs, depending on the signedness of an input.
@@ -275,6 +278,7 @@ DEF_INTERNAL_FLT_FLOATN_FN (RINT, ECF_CONST, rint, unary)
DEF_INTERNAL_FLT_FLOATN_FN (ROUND, ECF_CONST, round, unary)
DEF_INTERNAL_FLT_FLOATN_FN (ROUNDEVEN, ECF_CONST, roundeven, unary)
DEF_INTERNAL_FLT_FLOATN_FN (TRUNC, ECF_CONST, btrunc, unary)
+DEF_INTERNAL_OPTAB_FN (FTRUNC_INT, ECF_CONST, ftruncint, ftrunc_int)
/* Binary math functions. */
DEF_INTERNAL_FLT_FN (ATAN2, ECF_CONST, atan2, binary)
diff --git a/gcc/match.pd b/gcc/match.pd
index 84c9b918041eef3409bdb0fbe04565b90b25d6e9..a5d892ac1ebfaa7b5d5fa970baa04c8e5b8acb28 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3751,12 +3751,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
trapping behaviour, so require !flag_trapping_math. */
#if GIMPLE
(simplify
- (float (fix_trunc @0))
- (if (!flag_trapping_math
- && types_match (type, TREE_TYPE (@0))
- && direct_internal_fn_supported_p (IFN_TRUNC, type,
- OPTIMIZE_FOR_BOTH))
- (IFN_TRUNC @0)))
+ (float (fix_trunc@1 @0))
+ (if (types_match (type, TREE_TYPE (@0)))
+ (with {
+ tree int_type = element_type (@1);
+ }
+ (if (TYPE_SIGN (TREE_TYPE (@1)) == SIGNED
+ && direct_internal_fn_supported_p (IFN_FTRUNC_INT, type, int_type,
+ OPTIMIZE_FOR_BOTH))
+ (IFN_FTRUNC_INT @0 {
+ wide_int_to_tree (int_type, wi::max_value (TYPE_PRECISION (int_type),
+ SIGNED)); })
+ (if (!flag_trapping_math
+ && direct_internal_fn_supported_p (IFN_TRUNC, type,
+ OPTIMIZE_FOR_BOTH))
+ (IFN_TRUNC @0))))))
#endif
/* If we have a narrowing conversion to an integral type that is fed by a
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 5fcf5386a0b3112ef9004055c82e15fe47668970..04a4ee82e15fe7b52e726f2ee0bf704c30ac450d 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -63,6 +63,7 @@ OPTAB_CX(fractuns_optab, "fractuns$Q$b$I$a2")
OPTAB_CL(satfract_optab, "satfract$b$Q$a2", SAT_FRACT, "satfract", gen_satfract_conv_libfunc)
OPTAB_CL(satfractuns_optab, "satfractuns$I$b$Q$a2", UNSIGNED_SAT_FRACT, "satfractuns", gen_satfractuns_conv_libfunc)
+OPTAB_CD(ftruncint_optab, "ftrunc$a$b2")
OPTAB_CD(sfixtrunc_optab, "fix_trunc$F$b$I$a2")
OPTAB_CD(ufixtrunc_optab, "fixuns_trunc$F$b$I$a2")
diff --git a/gcc/stor-layout.h b/gcc/stor-layout.h
index b67abebc0096113272bfb1221eabaabd08657a58..e0219c8af4846ea0f947586b1915d9d06cb6c107 100644
--- a/gcc/stor-layout.h
+++ b/gcc/stor-layout.h
@@ -36,7 +36,6 @@ extern void place_field (record_layout_info, tree);
extern void compute_record_mode (tree);
extern void finish_bitfield_layout (tree);
extern void finish_record_layout (record_layout_info, int);
-extern unsigned int element_precision (const_tree);
extern void finalize_size_functions (void);
extern void fixup_unsigned_type (tree);
extern void initialize_sizetypes (void);
diff --git a/gcc/testsuite/gcc.target/aarch64/frintnz.c b/gcc/testsuite/gcc.target/aarch64/frintnz.c
new file mode 100644
index 0000000000000000000000000000000000000000..008e1cf9f4a1b0148128c65c9ea0d1bb111467b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/frintnz.c
@@ -0,0 +1,91 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv8.5-a" } */
+/* { dg-require-effective-target aarch64_frintnzx_ok } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** f1:
+** frint32z s0, s0
+** ret
+*/
+float
+f1 (float x)
+{
+ int y = x;
+ return (float) y;
+}
+
+/*
+** f2:
+** frint64z s0, s0
+** ret
+*/
+float
+f2 (float x)
+{
+ long long int y = x;
+ return (float) y;
+}
+
+/*
+** f3:
+** frint32z d0, d0
+** ret
+*/
+double
+f3 (double x)
+{
+ int y = x;
+ return (double) y;
+}
+
+/*
+** f4:
+** frint64z d0, d0
+** ret
+*/
+double
+f4 (double x)
+{
+ long long int y = x;
+ return (double) y;
+}
+
+float
+f1_dont (float x)
+{
+ unsigned int y = x;
+ return (float) y;
+}
+
+float
+f2_dont (float x)
+{
+ unsigned long long int y = x;
+ return (float) y;
+}
+
+double
+f3_dont (double x)
+{
+ unsigned int y = x;
+ return (double) y;
+}
+
+double
+f4_dont (double x)
+{
+ unsigned long long int y = x;
+ return (double) y;
+}
+
+double
+f5_dont (double x)
+{
+ signed short y = x;
+ return (double) y;
+}
+
+/* Make sure the 'dont's don't generate any frintNz. */
+/* { dg-final { scan-assembler-times {frint32z} 2 } } */
+/* { dg-final { scan-assembler-times {frint64z} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
new file mode 100644
index 0000000000000000000000000000000000000000..801d65ea8325cb680691286aab42747f43b90687
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.5-a" } */
+/* { dg-require-effective-target aarch64_frintnzx_ok } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define TEST(name,float_type,int_type) \
+void \
+name (float_type * __restrict__ x, float_type * __restrict__ y, int n) \
+{ \
+ for (int i = 0; i < n; ++i) \
+ { \
+ int_type x_i = x[i]; \
+ y[i] = (float_type) x_i; \
+ } \
+}
+
+/*
+** f1:
+** ...
+** frint32z v[0-9]+\.4s, v[0-9]+\.4s
+** ...
+*/
+TEST(f1, float, int)
+
+/*
+** f2:
+** ...
+** frint64z v[0-9]+\.4s, v[0-9]+\.4s
+** ...
+*/
+TEST(f2, float, long long)
+
+/*
+** f3:
+** ...
+** frint32z v[0-9]+\.2d, v[0-9]+\.2d
+** ...
+*/
+TEST(f3, double, int)
+
+/*
+** f4:
+** ...
+** frint64z v[0-9]+\.2d, v[0-9]+\.2d
+** ...
+*/
+TEST(f4, double, long long)
diff --git a/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c b/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c
index 07217064e2ba54fcf4f5edc440e6ec19ddae66e1..3d80871c4cebd5fb5cac0714b3feee27038f05fd 100644
--- a/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c
+++ b/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c
@@ -1,5 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ffast-math" } */
+/* { dg-skip-if "" { aarch64_frintnzx_ok } } */
float
f1 (float x)
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index c1ad97c6bd20d6e970edb24a125451580f014d55..5758e9cee4416b60b6766ecb37cbf3b37ac98522 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -11399,6 +11399,32 @@ proc check_effective_target_arm_v8_3a_bkey_directive { } {
}]
}
+# Return 1 if the target supports Armv8.5-A scalar and Advanced SIMD
+# FRINT32[ZX] and FRINT64[ZX] instructions, 0 otherwise.  The test is valid
+# for AArch64.
+proc check_effective_target_aarch64_frintnzx_ok_nocache { } {
+
+ if { ![istarget aarch64*-*-*] } {
+ return 0;
+ }
+
+ if { [check_no_compiler_messages_nocache \
+ aarch64_frintnzx_ok assembly {
+ #if !defined (__ARM_FEATURE_FRINT)
+ #error "__ARM_FEATURE_FRINT not defined"
+ #endif
+ } [current_compiler_flags]] } {
+ return 1;
+ }
+
+ return 0;
+}
+
+proc check_effective_target_aarch64_frintnzx_ok { } {
+ return [check_cached_effective_target aarch64_frintnzx_ok \
+ check_effective_target_aarch64_frintnzx_ok_nocache]
+}
+
# Return 1 if the target supports executing the Armv8.1-M Mainline Low
# Overhead Loop, 0 otherwise. The test is valid for ARM.
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f2625a2ff4089739326ce11785f1b68678c07f0e..435f2f4f5aeb2ed4c503c7b6a97d375634ae4514 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1625,7 +1625,8 @@ vect_finish_stmt_generation (vec_info *vinfo,
static internal_fn
vectorizable_internal_function (combined_fn cfn, tree fndecl,
- tree vectype_out, tree vectype_in)
+ tree vectype_out, tree vectype_in,
+ tree *vectypes)
{
internal_fn ifn;
if (internal_fn_p (cfn))
@@ -1637,8 +1638,12 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
const direct_internal_fn_info &info = direct_internal_fn (ifn);
if (info.vectorizable)
{
- tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
- tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
+ tree type0 = (info.type0 < 0 ? vectype_out : vectypes[info.type0]);
+ if (!type0)
+ type0 = vectype_in;
+ tree type1 = (info.type1 < 0 ? vectype_out : vectypes[info.type1]);
+ if (!type1)
+ type1 = vectype_in;
if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
OPTIMIZE_FOR_SPEED))
return ifn;
@@ -3263,18 +3268,40 @@ vectorizable_call (vec_info *vinfo,
rhs_type = unsigned_type_node;
}
- int mask_opno = -1;
+ /* The argument that is not of the same type as the others. */
+ int diff_opno = -1;
+ bool masked = false;
if (internal_fn_p (cfn))
- mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
+ {
+ if (cfn == CFN_FTRUNC_INT)
+ /* For FTRUNC this represents the argument that carries the type of the
+ intermediate signed integer. */
+ diff_opno = 1;
+ else
+ {
+ /* For masked operations this represents the argument that carries the
+ mask. */
+ diff_opno = internal_fn_mask_index (as_internal_fn (cfn));
+ masked = diff_opno >= 0;
+ }
+ }
for (i = 0; i < nargs; i++)
{
- if ((int) i == mask_opno)
+ if ((int) i == diff_opno)
{
- if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_opno,
- &op, &slp_op[i], &dt[i], &vectypes[i]))
- return false;
- continue;
+ if (masked)
+ {
+ if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node,
+ diff_opno, &op, &slp_op[i], &dt[i],
+ &vectypes[i]))
+ return false;
+ }
+ else
+ {
+ vectypes[i] = TREE_TYPE (gimple_call_arg (stmt, i));
+ continue;
+ }
}
if (!vect_is_simple_use (vinfo, stmt_info, slp_node,
@@ -3286,27 +3313,30 @@ vectorizable_call (vec_info *vinfo,
return false;
}
- /* We can only handle calls with arguments of the same type. */
- if (rhs_type
- && !types_compatible_p (rhs_type, TREE_TYPE (op)))
+ if ((int) i != diff_opno)
{
- if (dump_enabled_p ())
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
- "argument types differ.\n");
- return false;
- }
- if (!rhs_type)
- rhs_type = TREE_TYPE (op);
+ /* We can only handle calls with arguments of the same type. */
+ if (rhs_type
+ && !types_compatible_p (rhs_type, TREE_TYPE (op)))
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+ "argument types differ.\n");
+ return false;
+ }
+ if (!rhs_type)
+ rhs_type = TREE_TYPE (op);
- if (!vectype_in)
- vectype_in = vectypes[i];
- else if (vectypes[i]
- && !types_compatible_p (vectypes[i], vectype_in))
- {
- if (dump_enabled_p ())
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
- "argument vector types differ.\n");
- return false;
+ if (!vectype_in)
+ vectype_in = vectypes[i];
+ else if (vectypes[i]
+ && !types_compatible_p (vectypes[i], vectype_in))
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+ "argument vector types differ.\n");
+ return false;
+ }
}
}
/* If all arguments are external or constant defs, infer the vector type
@@ -3382,8 +3412,8 @@ vectorizable_call (vec_info *vinfo,
|| (modifier == NARROW
&& simple_integer_narrowing (vectype_out, vectype_in,
&convert_code))))
- ifn = vectorizable_internal_function (cfn, callee, vectype_out,
- vectype_in);
+ ifn = vectorizable_internal_function (cfn, callee, vectype_out, vectype_in,
+ &vectypes[0]);
/* If that fails, try asking for a target-specific built-in function. */
if (ifn == IFN_LAST)
@@ -3461,7 +3491,7 @@ vectorizable_call (vec_info *vinfo,
if (loop_vinfo
&& LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
- && (reduc_idx >= 0 || mask_opno >= 0))
+ && (reduc_idx >= 0 || masked))
{
if (reduc_idx >= 0
&& (cond_fn == IFN_LAST
@@ -3481,8 +3511,8 @@ vectorizable_call (vec_info *vinfo,
? SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node)
: ncopies);
tree scalar_mask = NULL_TREE;
- if (mask_opno >= 0)
- scalar_mask = gimple_call_arg (stmt_info->stmt, mask_opno);
+ if (masked)
+ scalar_mask = gimple_call_arg (stmt_info->stmt, diff_opno);
vect_record_loop_mask (loop_vinfo, masks, nvectors,
vectype_out, scalar_mask);
}
@@ -3547,7 +3577,7 @@ vectorizable_call (vec_info *vinfo,
{
/* We don't define any narrowing conditional functions
at present. */
- gcc_assert (mask_opno < 0);
+ gcc_assert (!masked);
tree half_res = make_ssa_name (vectype_in);
gcall *call
= gimple_build_call_internal_vec (ifn, vargs);
@@ -3567,16 +3597,16 @@ vectorizable_call (vec_info *vinfo,
}
else
{
- if (mask_opno >= 0 && masked_loop_p)
+ if (masked && masked_loop_p)
{
unsigned int vec_num = vec_oprnds0.length ();
/* Always true for SLP. */
gcc_assert (ncopies == 1);
tree mask = vect_get_loop_mask (gsi, masks, vec_num,
vectype_out, i);
- vargs[mask_opno] = prepare_vec_mask
+ vargs[diff_opno] = prepare_vec_mask
(loop_vinfo, TREE_TYPE (mask), mask,
- vargs[mask_opno], gsi);
+ vargs[diff_opno], gsi);
}
gcall *call;
@@ -3614,13 +3644,13 @@ vectorizable_call (vec_info *vinfo,
if (masked_loop_p && reduc_idx >= 0)
vargs[varg++] = vargs[reduc_idx + 1];
- if (mask_opno >= 0 && masked_loop_p)
+ if (masked && masked_loop_p)
{
tree mask = vect_get_loop_mask (gsi, masks, ncopies,
vectype_out, j);
- vargs[mask_opno]
+ vargs[diff_opno]
= prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask,
- vargs[mask_opno], gsi);
+ vargs[diff_opno], gsi);
}
gimple *new_stmt;
@@ -3639,7 +3669,7 @@ vectorizable_call (vec_info *vinfo,
{
/* We don't define any narrowing conditional functions at
present. */
- gcc_assert (mask_opno < 0);
+ gcc_assert (!masked);
tree half_res = make_ssa_name (vectype_in);
gcall *call = gimple_build_call_internal_vec (ifn, vargs);
gimple_call_set_lhs (call, half_res);
@@ -3683,7 +3713,7 @@ vectorizable_call (vec_info *vinfo,
{
auto_vec<vec<tree> > vec_defs (nargs);
/* We don't define any narrowing conditional functions at present. */
- gcc_assert (mask_opno < 0);
+ gcc_assert (!masked);
for (j = 0; j < ncopies; ++j)
{
/* Build argument list for the vectorized call. */
diff --git a/gcc/tree.h b/gcc/tree.h
index 318019c4dc5373271551f5d9a48dadb57a29d4a7..770d0ddfcc9a7acda01ed2fafa61eab0f1ba4cfa 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -6558,4 +6558,12 @@ extern unsigned fndecl_dealloc_argno (tree);
object or pointer. Otherwise return null. */
extern tree get_attr_nonstring_decl (tree, tree * = NULL);
+/* Return the type, or for a complex or vector type the type of its
+ elements. */
+extern tree element_type (tree);
+
+/* Return the precision of the type, or for a complex or vector type the
+ precision of the type of its elements. */
+extern unsigned int element_precision (const_tree);
+
#endif /* GCC_TREE_H */
diff --git a/gcc/tree.c b/gcc/tree.c
index d98b77db50b29b22dc9af1f98cd86044f62af019..81e66dd710ce6bc237f508655cfb437b40ec0bfa 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -6646,11 +6646,11 @@ valid_constant_size_p (const_tree size, cst_size_error *perr /* = NULL */)
return true;
}
-/* Return the precision of the type, or for a complex or vector type the
- precision of the type of its elements. */
+/* Return the type, or for a complex or vector type the type of its
+ elements. */
-unsigned int
-element_precision (const_tree type)
+tree
+element_type (tree type)
{
if (!TYPE_P (type))
type = TREE_TYPE (type);
@@ -6658,7 +6658,16 @@ element_precision (const_tree type)
if (code == COMPLEX_TYPE || code == VECTOR_TYPE)
type = TREE_TYPE (type);
- return TYPE_PRECISION (type);
+ return const_cast<tree> (type);
+}
+
+/* Return the precision of the type, or for a complex or vector type the
+ precision of the type of its elements. */
+
+unsigned int
+element_precision (const_tree type)
+{
+ return TYPE_PRECISION (element_type (const_cast<tree> (type)));
}
/* Return true if CODE represents an associative tree code. Otherwise