From: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: "Andre Vieira (lists) via Gcc-patches" <gcc-patches@gcc.gnu.org>,
richard.sandiford@arm.com
Subject: Re: [AArch64] Enable generation of FRINTNZ instructions
Date: Mon, 10 Jan 2022 14:09:04 +0000 [thread overview]
Message-ID: <5d7bb7af-b09e-cb91-b457-c6148f65028e@arm.com> (raw)
In-Reply-To: <231396s0-2756-q51s-q55-o8npqo91on32@fhfr.qr>
[-- Attachment #1: Type: text/plain, Size: 5328 bytes --]
Yeah, it seems I forgot to send the latest version; my bad.
Bootstrapped on aarch64-none-linux.
OK for trunk?
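For context, a minimal sketch of the scalar round-trips the patch targets (this mirrors the f1/f4 testcases below; the function names are mine, and this is the unfused source form, not the generated code):

```cpp
#include <cassert>

// What the new ftrunc<mode><frintnz_mode>2 pattern computes for the
// SF/SI case: truncate towards zero through an intermediate 32-bit
// signed integer.  With the patch, GCC can emit a single frint32z on
// Armv8.5-A for this instead of an fcvtzs/scvtf pair.
float
trunc_via_int32 (float x)
{
  int y = x;            // truncation towards zero
  return (float) y;
}

// The 64-bit variant maps to frint64z in the same way.
double
trunc_via_int64 (double x)
{
  long long y = x;
  return (double) y;
}
```

The unsigned and narrow-type variants (the `*_dont` functions in the test) must keep the conversion pair, since the fold only applies when the intermediate integer type is signed 32-bit or 64-bit.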
gcc/ChangeLog:

	* config/aarch64/aarch64.md (ftrunc<mode><frintnz_mode>2): New
	pattern.
	* config/aarch64/iterators.md (FRINTNZ): New iterator.
	(frintnz_mode): New int attribute.
	(VSFDF): Make iterator conditional.
	* internal-fn.def (FTRUNC_INT): New IFN.
	* internal-fn.c (ftrunc_int_direct): New define.
	(expand_ftrunc_int_optab_fn): New custom expander.
	(direct_ftrunc_int_optab_supported_p): New supported_p.
	* match.pd: Add to the existing TRUNC pattern match.
	* optabs.def (ftrunc_int): New entry.
	* stor-layout.h (element_precision): Moved from here...
	* tree.h (element_precision): ... to here.
	(element_type): New declaration.
	* tree.c (element_type): New function.
	(element_precision): Changed to use element_type.
	* tree-vect-stmts.c (vectorizable_internal_function): Add
	support for IFNs with different input types.
	(vectorizable_call): Teach to handle IFN_FTRUNC_INT.
	* doc/md.texi: New entry for ftrunc pattern name.
	* doc/sourcebuild.texi (aarch64_frintnzx_ok): New target.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/merge_trunc1.c: Adapted to skip if frintNz
	instructions are available.
	* lib/target-supports.exp: Added aarch64_frintnzx_ok target.
	* gcc.target/aarch64/frintnz.c: New test.
	* gcc.target/aarch64/frintnz_vec.c: New test.
On 03/01/2022 12:18, Richard Biener wrote:
> On Wed, 29 Dec 2021, Andre Vieira (lists) wrote:
>
>> Hi Richard,
>>
>> Thank you for the review; I've adopted all of the above suggestions
>> downstream. I'm still surprised how many style issues I still miss after
>> years of gcc development :(
>>
>> On 17/12/2021 12:44, Richard Sandiford wrote:
>>>> @@ -3252,16 +3257,31 @@ vectorizable_call (vec_info *vinfo,
>>>> rhs_type = unsigned_type_node;
>>>> }
>>>> - int mask_opno = -1;
>>>> + /* The argument that is not of the same type as the others. */
>>>> + int diff_opno = -1;
>>>> + bool masked = false;
>>>> if (internal_fn_p (cfn))
>>>> - mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
>>>> + {
>>>> + if (cfn == CFN_FTRUNC_INT)
>>>> + /* For FTRUNC this represents the argument that carries the type of
>>>> the
>>>> + intermediate signed integer. */
>>>> + diff_opno = 1;
>>>> + else
>>>> + {
>>>> + /* For masked operations this represents the argument that carries
>>>> the
>>>> + mask. */
>>>> + diff_opno = internal_fn_mask_index (as_internal_fn (cfn));
>>>> + masked = diff_opno >= 0;
>>>> + }
>>>> + }
>>> I think it would be cleaner not to process argument 1 at all for
>>> CFN_FTRUNC_INT. There's no particular need to vectorise it.
>> I agree with this, will change the loop to continue for argument 1 when
>> dealing with non-masked CFN's.
>>
>>>> }
>>>> […]
>>>> diff --git a/gcc/tree.c b/gcc/tree.c
>>>> index
>>>> 845228a055b2cfac0c9ca8c0cda1b9df4b0095c6..f1e9a1eb48769cb11aa69730e2480ed5522f78c1
>>>> 100644
>>>> --- a/gcc/tree.c
>>>> +++ b/gcc/tree.c
>>>> @@ -6645,11 +6645,11 @@ valid_constant_size_p (const_tree size,
>>>> cst_size_error *perr /* = NULL */)
>>>> return true;
>>>> }
>>>>
>>>> -/* Return the precision of the type, or for a complex or vector type the
>>>> - precision of the type of its elements. */
>>>> +/* Return the type, or for a complex or vector type the type of its
>>>> + elements. */
>>>> -unsigned int
>>>> -element_precision (const_tree type)
>>>> +tree
>>>> +element_type (const_tree type)
>>>> {
>>>> if (!TYPE_P (type))
>>>> type = TREE_TYPE (type);
>>>> @@ -6657,7 +6657,16 @@ element_precision (const_tree type)
>>>> if (code == COMPLEX_TYPE || code == VECTOR_TYPE)
>>>> type = TREE_TYPE (type);
>>>> - return TYPE_PRECISION (type);
>>>> + return (tree) type;
>>> I think we should stick a const_cast in element_precision and make
>>> element_type take a plain “type”. As it stands element_type is an
>>> implicit const_cast for many cases.
>>>
>>> Thanks,
>> I was just curious about something here: I thought the purpose of having
>> element_precision (before) and element_type (now) take a const_tree
>> argument was to make it clear that we aren't changing the input type. I
>> understand that, as it stands, element_type can act as an implicit
>> const_cast (and I should be using const_cast rather than the '(tree)'
>> cast), but only when 'type' is not a complex/vector type. Either way, we
>> keep the promise that we aren't changing the incoming type; what the
>> caller then does with the result is up to them, no?
>>
>> I don't mind making the changes, just trying to understand the reasoning
>> behind it.
>>
>> I'll send in a new patch with all the changes after the review on the match.pd
>> stuff.
> I'm missing an updated patch after my initial review of the match.pd
> stuff, so I'm not sure what to review. Can you re-post an updated patch?
>
> Thanks,
> Richard.
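As an aside, the const-correctness point in the exchange above can be reduced to a small standalone sketch ('tree' modelled as a bare pointer typedef; element_type_v1/element_type_v2 are my names for the two shapes under discussion, not GCC code):

```cpp
#include <cassert>

// Reduced model of the tree API: precision as a plain field.
struct tree_node { unsigned precision; };
typedef tree_node *tree;
typedef const tree_node *const_tree;

// Shape in the posted patch: takes const_tree but returns a mutable
// tree, so every call is an implicit const_cast for the caller.
tree
element_type_v1 (const_tree type)
{
  return const_cast<tree> (type);
}

// Shape Richard suggests: element_type takes a plain tree, and the one
// explicit const_cast is confined to element_precision, which only
// reads the result.
tree
element_type_v2 (tree type)
{
  return type;
}

unsigned
element_precision (const_tree type)
{
  return element_type_v2 (const_cast<tree> (type))->precision;
}
```

The difference is where the const is laundered: in v1 every const_tree caller silently gets back a mutable pointer, while in v2 the cast is localised to a function that provably does not mutate.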
[-- Attachment #2: frintnz4.patch --]
[-- Type: text/plain, Size: 23433 bytes --]
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 3c72bdad01bfab49ee4ae6fb7b139202e89c8d34..9d04a2e088cd7d03963b58ed3708c339b841801c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -7423,12 +7423,18 @@ (define_insn "despeculate_simpleti"
(set_attr "speculation_barrier" "true")]
)
+(define_expand "ftrunc<mode><frintnz_mode>2"
+ [(set (match_operand:VSFDF 0 "register_operand" "=w")
+ (unspec:VSFDF [(match_operand:VSFDF 1 "register_operand" "w")]
+ FRINTNZ))]
+ "TARGET_FRINT"
+)
+
(define_insn "aarch64_<frintnzs_op><mode>"
[(set (match_operand:VSFDF 0 "register_operand" "=w")
(unspec:VSFDF [(match_operand:VSFDF 1 "register_operand" "w")]
FRINTNZX))]
- "TARGET_FRINT && TARGET_FLOAT
- && !(VECTOR_MODE_P (<MODE>mode) && !TARGET_SIMD)"
+ "TARGET_FRINT"
"<frintnzs_op>\\t%<v>0<Vmtype>, %<v>1<Vmtype>"
[(set_attr "type" "f_rint<stype>")]
)
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 9160ce3e69c2c6b1b75e46f7aabd27e7949a269a..7962b15a87db2f1ede3836efbb827b8fb95266da 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -163,7 +163,11 @@ (define_mode_iterator VHSDF_HSDF [(V4HF "TARGET_SIMD_F16INST")
SF DF])
;; Scalar and vetor modes for SF, DF.
-(define_mode_iterator VSFDF [V2SF V4SF V2DF DF SF])
+(define_mode_iterator VSFDF [(V2SF "TARGET_SIMD")
+ (V4SF "TARGET_SIMD")
+ (V2DF "TARGET_SIMD")
+ (DF "TARGET_FLOAT")
+ (SF "TARGET_FLOAT")])
;; Advanced SIMD single Float modes.
(define_mode_iterator VDQSF [V2SF V4SF])
@@ -3078,6 +3082,8 @@ (define_int_iterator FCMLA [UNSPEC_FCMLA
(define_int_iterator FRINTNZX [UNSPEC_FRINT32Z UNSPEC_FRINT32X
UNSPEC_FRINT64Z UNSPEC_FRINT64X])
+(define_int_iterator FRINTNZ [UNSPEC_FRINT32Z UNSPEC_FRINT64Z])
+
(define_int_iterator SVE_BRK_UNARY [UNSPEC_BRKA UNSPEC_BRKB])
(define_int_iterator SVE_BRK_BINARY [UNSPEC_BRKN UNSPEC_BRKPA UNSPEC_BRKPB])
@@ -3485,6 +3491,8 @@ (define_int_attr f16mac1 [(UNSPEC_FMLAL "a") (UNSPEC_FMLSL "s")
(define_int_attr frintnzs_op [(UNSPEC_FRINT32Z "frint32z") (UNSPEC_FRINT32X "frint32x")
(UNSPEC_FRINT64Z "frint64z") (UNSPEC_FRINT64X "frint64x")])
+(define_int_attr frintnz_mode [(UNSPEC_FRINT32Z "si") (UNSPEC_FRINT64Z "di")])
+
;; The condition associated with an UNSPEC_COND_<xx>.
(define_int_attr cmp_op [(UNSPEC_COND_CMPEQ_WIDE "eq")
(UNSPEC_COND_CMPGE_WIDE "ge")
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 19e89ae502bc2f51db64667b236c1cb669718b02..3b0e4e0875b4392ab6833568b207580ef597a98f 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6191,6 +6191,15 @@ operands; otherwise, it may not.
This pattern is not allowed to @code{FAIL}.
+@cindex @code{ftrunc@var{m}@var{n}2} instruction pattern
+@item @samp{ftrunc@var{m}@var{n}2}
+Truncate operand 1 towards zero to an @var{n}-mode signed integer and store
+the result in operand 0.  Both operands have mode @var{m}, which is a scalar
+or vector floating-point mode.  An exception must be raised if operand 1 does
+not fit in an @var{n}-mode signed integer, just as it would be if the
+truncation happened through a separate floating-point to integer conversion.
+
+
@cindex @code{round@var{m}2} instruction pattern
@item @samp{round@var{m}2}
Round operand 1 to the nearest integer, rounding away from zero in the
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 6095a35cd4565fdb7d758104e80fe6411230f758..a56bbb775572fa72379854f90a01ad543557e29a 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2286,6 +2286,10 @@ Like @code{aarch64_sve_hw}, but also test for an exact hardware vector length.
@item aarch64_fjcvtzs_hw
AArch64 target that is able to generate and execute armv8.3-a FJCVTZS
instruction.
+
+@item aarch64_frintnzx_ok
+AArch64 target that is able to generate the Armv8.5-a FRINT32Z, FRINT64Z,
+FRINT32X and FRINT64X instructions.
@end table
@subsubsection MIPS-specific attributes
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index b24102a5990bea4cbb102069f7a6df497fc81ebf..9047b486f41948059a7a7f1ccc4032410a369139 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -130,6 +130,7 @@ init_internal_fns ()
#define fold_left_direct { 1, 1, false }
#define mask_fold_left_direct { 1, 1, false }
#define check_ptrs_direct { 0, 0, false }
+#define ftrunc_int_direct { 0, 1, true }
const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) not_direct,
@@ -156,6 +157,29 @@ get_multi_vector_move (tree array_type, convert_optab optab)
return convert_optab_handler (optab, imode, vmode);
}
+/* Expand FTRUNC_INT call STMT using optab OPTAB. */
+
+static void
+expand_ftrunc_int_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+ class expand_operand ops[2];
+ tree lhs, float_type, int_type;
+ rtx target, op;
+
+ lhs = gimple_call_lhs (stmt);
+ target = expand_normal (lhs);
+ op = expand_normal (gimple_call_arg (stmt, 0));
+
+ float_type = TREE_TYPE (lhs);
+ int_type = element_type (gimple_call_arg (stmt, 1));
+
+ create_output_operand (&ops[0], target, TYPE_MODE (float_type));
+ create_input_operand (&ops[1], op, TYPE_MODE (float_type));
+
+ expand_insn (convert_optab_handler (optab, TYPE_MODE (float_type),
+ TYPE_MODE (int_type)), 2, ops);
+}
+
/* Expand LOAD_LANES call STMT using optab OPTAB. */
static void
@@ -3747,6 +3771,15 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
!= CODE_FOR_nothing);
}
+static bool
+direct_ftrunc_int_optab_supported_p (convert_optab optab, tree_pair types,
+ optimization_type opt_type)
+{
+ return (convert_optab_handler (optab, TYPE_MODE (types.first),
+ TYPE_MODE (element_type (types.second)),
+ opt_type) != CODE_FOR_nothing);
+}
+
#define direct_unary_optab_supported_p direct_optab_supported_p
#define direct_binary_optab_supported_p direct_optab_supported_p
#define direct_ternary_optab_supported_p direct_optab_supported_p
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 8891071a6a360961643731094379b607f317af17..a0fd75829e942f529c879c669e58c098b62b26ba 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -66,6 +66,9 @@ along with GCC; see the file COPYING3. If not see
- fold_left: for scalar = FN (scalar, vector), keyed off the vector mode
- check_ptrs: used for check_{raw,war}_ptrs
+ - ftrunc_int: a unary conversion optab that takes and returns values of the
+ same mode, but internally converts via another mode. This second mode is
+ specified using a dummy final function argument.
DEF_INTERNAL_SIGNED_OPTAB_FN defines an internal function that
maps to one of two optabs, depending on the signedness of an input.
@@ -275,6 +278,7 @@ DEF_INTERNAL_FLT_FLOATN_FN (RINT, ECF_CONST, rint, unary)
DEF_INTERNAL_FLT_FLOATN_FN (ROUND, ECF_CONST, round, unary)
DEF_INTERNAL_FLT_FLOATN_FN (ROUNDEVEN, ECF_CONST, roundeven, unary)
DEF_INTERNAL_FLT_FLOATN_FN (TRUNC, ECF_CONST, btrunc, unary)
+DEF_INTERNAL_OPTAB_FN (FTRUNC_INT, ECF_CONST, ftruncint, ftrunc_int)
/* Binary math functions. */
DEF_INTERNAL_FLT_FN (ATAN2, ECF_CONST, atan2, binary)
diff --git a/gcc/match.pd b/gcc/match.pd
index 84c9b918041eef3409bdb0fbe04565b90b25d6e9..a5d892ac1ebfaa7b5d5fa970baa04c8e5b8acb28 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3751,12 +3751,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
trapping behaviour, so require !flag_trapping_math. */
#if GIMPLE
(simplify
- (float (fix_trunc @0))
- (if (!flag_trapping_math
- && types_match (type, TREE_TYPE (@0))
- && direct_internal_fn_supported_p (IFN_TRUNC, type,
- OPTIMIZE_FOR_BOTH))
- (IFN_TRUNC @0)))
+ (float (fix_trunc@1 @0))
+ (if (types_match (type, TREE_TYPE (@0)))
+ (with {
+ tree int_type = element_type (@1);
+ }
+ (if (TYPE_SIGN (TREE_TYPE (@1)) == SIGNED
+ && direct_internal_fn_supported_p (IFN_FTRUNC_INT, type, int_type,
+ OPTIMIZE_FOR_BOTH))
+ (IFN_FTRUNC_INT @0 {
+ wide_int_to_tree (int_type, wi::max_value (TYPE_PRECISION (int_type),
+ SIGNED)); })
+ (if (!flag_trapping_math
+ && direct_internal_fn_supported_p (IFN_TRUNC, type,
+ OPTIMIZE_FOR_BOTH))
+ (IFN_TRUNC @0))))))
#endif
/* If we have a narrowing conversion to an integral type that is fed by a
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 5fcf5386a0b3112ef9004055c82e15fe47668970..04a4ee82e15fe7b52e726f2ee0bf704c30ac450d 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -63,6 +63,7 @@ OPTAB_CX(fractuns_optab, "fractuns$Q$b$I$a2")
OPTAB_CL(satfract_optab, "satfract$b$Q$a2", SAT_FRACT, "satfract", gen_satfract_conv_libfunc)
OPTAB_CL(satfractuns_optab, "satfractuns$I$b$Q$a2", UNSIGNED_SAT_FRACT, "satfractuns", gen_satfractuns_conv_libfunc)
+OPTAB_CD(ftruncint_optab, "ftrunc$a$b2")
OPTAB_CD(sfixtrunc_optab, "fix_trunc$F$b$I$a2")
OPTAB_CD(ufixtrunc_optab, "fixuns_trunc$F$b$I$a2")
diff --git a/gcc/stor-layout.h b/gcc/stor-layout.h
index b67abebc0096113272bfb1221eabaabd08657a58..e0219c8af4846ea0f947586b1915d9d06cb6c107 100644
--- a/gcc/stor-layout.h
+++ b/gcc/stor-layout.h
@@ -36,7 +36,6 @@ extern void place_field (record_layout_info, tree);
extern void compute_record_mode (tree);
extern void finish_bitfield_layout (tree);
extern void finish_record_layout (record_layout_info, int);
-extern unsigned int element_precision (const_tree);
extern void finalize_size_functions (void);
extern void fixup_unsigned_type (tree);
extern void initialize_sizetypes (void);
diff --git a/gcc/testsuite/gcc.target/aarch64/frintnz.c b/gcc/testsuite/gcc.target/aarch64/frintnz.c
new file mode 100644
index 0000000000000000000000000000000000000000..008e1cf9f4a1b0148128c65c9ea0d1bb111467b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/frintnz.c
@@ -0,0 +1,91 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv8.5-a" } */
+/* { dg-require-effective-target aarch64_frintnzx_ok } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** f1:
+** frint32z s0, s0
+** ret
+*/
+float
+f1 (float x)
+{
+ int y = x;
+ return (float) y;
+}
+
+/*
+** f2:
+** frint64z s0, s0
+** ret
+*/
+float
+f2 (float x)
+{
+ long long int y = x;
+ return (float) y;
+}
+
+/*
+** f3:
+** frint32z d0, d0
+** ret
+*/
+double
+f3 (double x)
+{
+ int y = x;
+ return (double) y;
+}
+
+/*
+** f4:
+** frint64z d0, d0
+** ret
+*/
+double
+f4 (double x)
+{
+ long long int y = x;
+ return (double) y;
+}
+
+float
+f1_dont (float x)
+{
+ unsigned int y = x;
+ return (float) y;
+}
+
+float
+f2_dont (float x)
+{
+ unsigned long long int y = x;
+ return (float) y;
+}
+
+double
+f3_dont (double x)
+{
+ unsigned int y = x;
+ return (double) y;
+}
+
+double
+f4_dont (double x)
+{
+ unsigned long long int y = x;
+ return (double) y;
+}
+
+double
+f5_dont (double x)
+{
+ signed short y = x;
+ return (double) y;
+}
+
+/* Make sure the 'dont's don't generate any frintNz. */
+/* { dg-final { scan-assembler-times {frint32z} 2 } } */
+/* { dg-final { scan-assembler-times {frint64z} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
new file mode 100644
index 0000000000000000000000000000000000000000..801d65ea8325cb680691286aab42747f43b90687
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.5-a" } */
+/* { dg-require-effective-target aarch64_frintnzx_ok } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define TEST(name,float_type,int_type) \
+void \
+name (float_type * __restrict__ x, float_type * __restrict__ y, int n) \
+{ \
+ for (int i = 0; i < n; ++i) \
+ { \
+ int_type x_i = x[i]; \
+ y[i] = (float_type) x_i; \
+ } \
+}
+
+/*
+** f1:
+** ...
+** frint32z v[0-9]+\.4s, v[0-9]+\.4s
+** ...
+*/
+TEST(f1, float, int)
+
+/*
+** f2:
+** ...
+** frint64z v[0-9]+\.4s, v[0-9]+\.4s
+** ...
+*/
+TEST(f2, float, long long)
+
+/*
+** f3:
+** ...
+** frint32z v[0-9]+\.2d, v[0-9]+\.2d
+** ...
+*/
+TEST(f3, double, int)
+
+/*
+** f4:
+** ...
+** frint64z v[0-9]+\.2d, v[0-9]+\.2d
+** ...
+*/
+TEST(f4, double, long long)
diff --git a/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c b/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c
index 07217064e2ba54fcf4f5edc440e6ec19ddae66e1..3d80871c4cebd5fb5cac0714b3feee27038f05fd 100644
--- a/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c
+++ b/gcc/testsuite/gcc.target/aarch64/merge_trunc1.c
@@ -1,5 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ffast-math" } */
+/* { dg-skip-if "" { aarch64_frintnzx_ok } } */
float
f1 (float x)
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index c1ad97c6bd20d6e970edb24a125451580f014d55..5758e9cee4416b60b6766ecb37cbf3b37ac98522 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -11399,6 +11399,32 @@ proc check_effective_target_arm_v8_3a_bkey_directive { } {
}]
}
+# Return 1 if the target supports Armv8.5-A scalar and Advanced SIMD
+# FRINT32[ZX] and FRINT64[ZX] instructions, 0 otherwise.  The test is valid
+# for AArch64.
+proc check_effective_target_aarch64_frintnzx_ok_nocache { } {
+
+ if { ![istarget aarch64*-*-*] } {
+ return 0;
+ }
+
+ if { [check_no_compiler_messages_nocache \
+ aarch64_frintnzx_ok assembly {
+ #if !defined (__ARM_FEATURE_FRINT)
+ #error "__ARM_FEATURE_FRINT not defined"
+ #endif
+ } [current_compiler_flags]] } {
+ return 1;
+ }
+
+ return 0;
+}
+
+proc check_effective_target_aarch64_frintnzx_ok { } {
+ return [check_cached_effective_target aarch64_frintnzx_ok \
+ check_effective_target_aarch64_frintnzx_ok_nocache]
+}
+
# Return 1 if the target supports executing the Armv8.1-M Mainline Low
# Overhead Loop, 0 otherwise. The test is valid for ARM.
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f2625a2ff4089739326ce11785f1b68678c07f0e..435f2f4f5aeb2ed4c503c7b6a97d375634ae4514 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1625,7 +1625,8 @@ vect_finish_stmt_generation (vec_info *vinfo,
static internal_fn
vectorizable_internal_function (combined_fn cfn, tree fndecl,
- tree vectype_out, tree vectype_in)
+ tree vectype_out, tree vectype_in,
+ tree *vectypes)
{
internal_fn ifn;
if (internal_fn_p (cfn))
@@ -1637,8 +1638,12 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
const direct_internal_fn_info &info = direct_internal_fn (ifn);
if (info.vectorizable)
{
- tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
- tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
+ tree type0 = (info.type0 < 0 ? vectype_out : vectypes[info.type0]);
+ if (!type0)
+ type0 = vectype_in;
+ tree type1 = (info.type1 < 0 ? vectype_out : vectypes[info.type1]);
+ if (!type1)
+ type1 = vectype_in;
if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
OPTIMIZE_FOR_SPEED))
return ifn;
@@ -3263,18 +3268,40 @@ vectorizable_call (vec_info *vinfo,
rhs_type = unsigned_type_node;
}
- int mask_opno = -1;
+ /* The argument that is not of the same type as the others. */
+ int diff_opno = -1;
+ bool masked = false;
if (internal_fn_p (cfn))
- mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
+ {
+ if (cfn == CFN_FTRUNC_INT)
+ /* For FTRUNC this represents the argument that carries the type of the
+ intermediate signed integer. */
+ diff_opno = 1;
+ else
+ {
+ /* For masked operations this represents the argument that carries the
+ mask. */
+ diff_opno = internal_fn_mask_index (as_internal_fn (cfn));
+ masked = diff_opno >= 0;
+ }
+ }
for (i = 0; i < nargs; i++)
{
- if ((int) i == mask_opno)
+ if ((int) i == diff_opno)
{
- if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_opno,
- &op, &slp_op[i], &dt[i], &vectypes[i]))
- return false;
- continue;
+ if (masked)
+ {
+ if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node,
+ diff_opno, &op, &slp_op[i], &dt[i],
+ &vectypes[i]))
+ return false;
+ }
+ else
+ {
+ vectypes[i] = TREE_TYPE (gimple_call_arg (stmt, i));
+ continue;
+ }
}
if (!vect_is_simple_use (vinfo, stmt_info, slp_node,
@@ -3286,27 +3313,30 @@ vectorizable_call (vec_info *vinfo,
return false;
}
- /* We can only handle calls with arguments of the same type. */
- if (rhs_type
- && !types_compatible_p (rhs_type, TREE_TYPE (op)))
+ if ((int) i != diff_opno)
{
- if (dump_enabled_p ())
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
- "argument types differ.\n");
- return false;
- }
- if (!rhs_type)
- rhs_type = TREE_TYPE (op);
+ /* We can only handle calls with arguments of the same type. */
+ if (rhs_type
+ && !types_compatible_p (rhs_type, TREE_TYPE (op)))
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+ "argument types differ.\n");
+ return false;
+ }
+ if (!rhs_type)
+ rhs_type = TREE_TYPE (op);
- if (!vectype_in)
- vectype_in = vectypes[i];
- else if (vectypes[i]
- && !types_compatible_p (vectypes[i], vectype_in))
- {
- if (dump_enabled_p ())
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
- "argument vector types differ.\n");
- return false;
+ if (!vectype_in)
+ vectype_in = vectypes[i];
+ else if (vectypes[i]
+ && !types_compatible_p (vectypes[i], vectype_in))
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+ "argument vector types differ.\n");
+ return false;
+ }
}
}
/* If all arguments are external or constant defs, infer the vector type
@@ -3382,8 +3412,8 @@ vectorizable_call (vec_info *vinfo,
|| (modifier == NARROW
&& simple_integer_narrowing (vectype_out, vectype_in,
&convert_code))))
- ifn = vectorizable_internal_function (cfn, callee, vectype_out,
- vectype_in);
+ ifn = vectorizable_internal_function (cfn, callee, vectype_out, vectype_in,
+ &vectypes[0]);
/* If that fails, try asking for a target-specific built-in function. */
if (ifn == IFN_LAST)
@@ -3461,7 +3491,7 @@ vectorizable_call (vec_info *vinfo,
if (loop_vinfo
&& LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
- && (reduc_idx >= 0 || mask_opno >= 0))
+ && (reduc_idx >= 0 || masked))
{
if (reduc_idx >= 0
&& (cond_fn == IFN_LAST
@@ -3481,8 +3511,8 @@ vectorizable_call (vec_info *vinfo,
? SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node)
: ncopies);
tree scalar_mask = NULL_TREE;
- if (mask_opno >= 0)
- scalar_mask = gimple_call_arg (stmt_info->stmt, mask_opno);
+ if (masked)
+ scalar_mask = gimple_call_arg (stmt_info->stmt, diff_opno);
vect_record_loop_mask (loop_vinfo, masks, nvectors,
vectype_out, scalar_mask);
}
@@ -3547,7 +3577,7 @@ vectorizable_call (vec_info *vinfo,
{
/* We don't define any narrowing conditional functions
at present. */
- gcc_assert (mask_opno < 0);
+ gcc_assert (!masked);
tree half_res = make_ssa_name (vectype_in);
gcall *call
= gimple_build_call_internal_vec (ifn, vargs);
@@ -3567,16 +3597,16 @@ vectorizable_call (vec_info *vinfo,
}
else
{
- if (mask_opno >= 0 && masked_loop_p)
+ if (masked && masked_loop_p)
{
unsigned int vec_num = vec_oprnds0.length ();
/* Always true for SLP. */
gcc_assert (ncopies == 1);
tree mask = vect_get_loop_mask (gsi, masks, vec_num,
vectype_out, i);
- vargs[mask_opno] = prepare_vec_mask
+ vargs[diff_opno] = prepare_vec_mask
(loop_vinfo, TREE_TYPE (mask), mask,
- vargs[mask_opno], gsi);
+ vargs[diff_opno], gsi);
}
gcall *call;
@@ -3614,13 +3644,13 @@ vectorizable_call (vec_info *vinfo,
if (masked_loop_p && reduc_idx >= 0)
vargs[varg++] = vargs[reduc_idx + 1];
- if (mask_opno >= 0 && masked_loop_p)
+ if (masked && masked_loop_p)
{
tree mask = vect_get_loop_mask (gsi, masks, ncopies,
vectype_out, j);
- vargs[mask_opno]
+ vargs[diff_opno]
= prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask,
- vargs[mask_opno], gsi);
+ vargs[diff_opno], gsi);
}
gimple *new_stmt;
@@ -3639,7 +3669,7 @@ vectorizable_call (vec_info *vinfo,
{
/* We don't define any narrowing conditional functions at
present. */
- gcc_assert (mask_opno < 0);
+ gcc_assert (!masked);
tree half_res = make_ssa_name (vectype_in);
gcall *call = gimple_build_call_internal_vec (ifn, vargs);
gimple_call_set_lhs (call, half_res);
@@ -3683,7 +3713,7 @@ vectorizable_call (vec_info *vinfo,
{
auto_vec<vec<tree> > vec_defs (nargs);
/* We don't define any narrowing conditional functions at present. */
- gcc_assert (mask_opno < 0);
+ gcc_assert (!masked);
for (j = 0; j < ncopies; ++j)
{
/* Build argument list for the vectorized call. */
diff --git a/gcc/tree.h b/gcc/tree.h
index 318019c4dc5373271551f5d9a48dadb57a29d4a7..770d0ddfcc9a7acda01ed2fafa61eab0f1ba4cfa 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -6558,4 +6558,12 @@ extern unsigned fndecl_dealloc_argno (tree);
object or pointer. Otherwise return null. */
extern tree get_attr_nonstring_decl (tree, tree * = NULL);
+/* Return the type, or for a complex or vector type the type of its
+ elements. */
+extern tree element_type (tree);
+
+/* Return the precision of the type, or for a complex or vector type the
+ precision of the type of its elements. */
+extern unsigned int element_precision (const_tree);
+
#endif /* GCC_TREE_H */
diff --git a/gcc/tree.c b/gcc/tree.c
index d98b77db50b29b22dc9af1f98cd86044f62af019..81e66dd710ce6bc237f508655cfb437b40ec0bfa 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -6646,11 +6646,11 @@ valid_constant_size_p (const_tree size, cst_size_error *perr /* = NULL */)
return true;
}
-/* Return the precision of the type, or for a complex or vector type the
- precision of the type of its elements. */
+/* Return the type, or for a complex or vector type the type of its
+ elements. */
-unsigned int
-element_precision (const_tree type)
+tree
+element_type (tree type)
{
if (!TYPE_P (type))
type = TREE_TYPE (type);
@@ -6658,7 +6658,16 @@ element_precision (const_tree type)
if (code == COMPLEX_TYPE || code == VECTOR_TYPE)
type = TREE_TYPE (type);
- return TYPE_PRECISION (type);
+ return const_cast<tree> (type);
+}
+
+/* Return the precision of the type, or for a complex or vector type the
+ precision of the type of its elements. */
+
+unsigned int
+element_precision (const_tree type)
+{
+ return TYPE_PRECISION (element_type (const_cast<tree> (type)));
}
/* Return true if CODE represents an associative tree code. Otherwise