From: Richard Sandiford <richard.sandiford@arm.com>
To: "Andre Vieira \(lists\) via Gcc-patches" <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <rguenther@suse.de>,
"Andre Vieira \(lists\)" <andre.simoesdiasvieira@arm.com>
Subject: Re: [AArch64] Enable generation of FRINTNZ instructions
Date: Fri, 17 Dec 2021 12:44:25 +0000 [thread overview]
Message-ID: <mptee6b8kpi.fsf@arm.com> (raw)
In-Reply-To: <f5312e85-b175-fbc9-8858-de611b3d99ac@arm.com> (Andre Vieira via Gcc-patches's message of "Thu, 25 Nov 2021 13:53:08 +0000")
"Andre Vieira (lists) via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> On 22/11/2021 11:41, Richard Biener wrote:
>>
>>> On 18/11/2021 11:05, Richard Biener wrote:
>>>> This is a good shout and made me think about something I hadn't before... I
>>>> thought I could handle the vector forms later, but the problem is if I add
>>>> support for the scalar, it will stop the vectorizer. It seems
>>>> vectorizable_call expects all arguments to have the same type, which doesn't
>>>> work with passing the integer type as an operand work around.
>> We already special case some IFNs there (masked load/store and gather)
>> to ignore some args, so that would just add to this set.
>>
>> Richard.
> Hi,
>
> Reworked it to add support of the new IFN to the vectorizer. Was
> initially trying to make vectorizable_call and
> vectorizable_internal_function handle IFNs with different inputs more
> generically, using the information we have in the <IFN>_direct structs
> regarding what operands to get the modes from. Unfortunately, that
> wasn't straightforward because of how vectorizable_call assumes operands
> have the same type and uses the type of the DEF_STMT_INFO of the
> non-constant operands (either output operand or non-constant inputs) to
> determine the type of constants. I assume there is some reason why we
> use the DEF_STMT_INFO and not always use get_vectype_for_scalar_type on
> the argument types. That is why I ended up with this sort of half-way
> mix of both, which still allows room to add more IFNs that don't take
> inputs of the same type, but require adding a bit of special casing
> similar to the IFN_FTRUNC_INT and masking ones.
>
> Bootstrapped on aarch64-none-linux.
Still leaving the match.pd stuff to Richard, but otherwise:
> index bdc8ba3576cf2c9b4ae96b45a382234e4e25b13f..51f00344b02d0d1d4adf97463f6a46f9fd0fb43f 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -160,7 +160,11 @@ (define_mode_iterator VHSDF_HSDF [(V4HF "TARGET_SIMD_F16INST")
> SF DF])
>
> ;; Scalar and vetor modes for SF, DF.
> -(define_mode_iterator VSFDF [V2SF V4SF V2DF DF SF])
> +(define_mode_iterator VSFDF [ (V2SF "TARGET_SIMD")
Nit: excess space between [ and (.
> + (V4SF "TARGET_SIMD")
> + (V2DF "TARGET_SIMD")
> + (DF "TARGET_FLOAT")
> + (SF "TARGET_FLOAT")])
>
> ;; Advanced SIMD single Float modes.
> (define_mode_iterator VDQSF [V2SF V4SF])
> […]
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 41f1850bf6e95005647ca97a495a97d7e184d137..d50d09b0ae60d98537b9aece4396a490f33f174c 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -6175,6 +6175,15 @@ operands; otherwise, it may not.
>
> This pattern is not allowed to @code{FAIL}.
>
> +@cindex @code{ftrunc@var{m}@var{n}2} instruction pattern
> +@item @samp{ftrunc@var{m}@var{n}2}
> +Truncate operand 1 to a @var{n} mode signed integer, towards zero, and store
> +the result in operand 0. Both operands have mode @var{m}, which is a scalar or
> +vector floating-point mode. Exception must be thrown if operand 1 does not fit
Maybe “An exception must be raised”? “thrown” makes it sound like a
signal must be raised or C++ exception thrown.
> +in a @var{n} mode signed integer as it would have if the truncation happened
> +through separate floating point to integer conversion.
> +
> +
> @cindex @code{round@var{m}2} instruction pattern
> @item @samp{round@var{m}2}
> Round operand 1 to the nearest integer, rounding away from zero in the
> […]
> @@ -3688,6 +3712,15 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
> != CODE_FOR_nothing);
> }
>
> +static bool direct_ftrunc_int_optab_supported_p (convert_optab optab,
> + tree_pair types,
> + optimization_type opt_type)
Formatting nit: should be a line break after “bool”
> +{
> + return (convert_optab_handler (optab, TYPE_MODE (types.first),
> + TYPE_MODE (element_type (types.second)),
> + opt_type) != CODE_FOR_nothing);
> +}
> +
> #define direct_unary_optab_supported_p direct_optab_supported_p
> #define direct_binary_optab_supported_p direct_optab_supported_p
> #define direct_ternary_optab_supported_p direct_optab_supported_p
> […]
> diff --git a/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..b93304eb2acb3d3d954eebee51d77ff23fee68ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
> @@ -0,0 +1,47 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.5-a" } */
> +/* { dg-require-effective-target aarch64_frintnzx_ok } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +#define TEST(name,float_type,int_type) \
> +void \
> +name (float_type * __restrict__ x, float_type * __restrict__ y, int n) \
> +{ \
> + for (int i = 0; i < n; ++i) \
> + { \
> + int_type x_i = x[i]; \
> + y[i] = (float_type) x_i; \
> + } \
> +}
> +
> +/*
> +** f1:
> +** ...
> +** frint32z v0.4s, v0.4s
I don't think we can rely on v0 being used here. v[0-9]+\.4s would
be safer.
> +** ...
> +*/
> +TEST(f1, float, int)
> +
> +/*
> +** f2:
> +** ...
> +** frint64z v0.4s, v0.4s
> +** ...
> +*/
> +TEST(f2, float, long long)
> +
> +/*
> +** f3:
> +** ...
> +** frint32z v0.2d, v0.2d
> +** ...
> +*/
> +TEST(f3, double, int)
> +
> +/*
> +** f4:
> +** ...
> +** frint64z v0.2d, v0.2d
> +** ...
> +*/
> +TEST(f4, double, long long)
> […]
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 03cc7267cf80d4ce73c0d89ab86b07e84752456a..35bb1f70f7b173ad0d1e9f70ce0ac9da891dbe62 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1625,7 +1625,8 @@ vect_finish_stmt_generation (vec_info *vinfo,
>
> static internal_fn
> vectorizable_internal_function (combined_fn cfn, tree fndecl,
> - tree vectype_out, tree vectype_in)
> + tree vectype_out, tree vectype_in,
> + tree *vectypes)
> {
> internal_fn ifn;
> if (internal_fn_p (cfn))
> @@ -1637,8 +1638,12 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
> const direct_internal_fn_info &info = direct_internal_fn (ifn);
> if (info.vectorizable)
> {
> - tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
> - tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
> + tree type0 = (info.type0 < 0 ? vectype_out : vectypes[info.type0]);
> + if (!type0)
> + type0 = vectype_in;
> + tree type1 = (info.type1 < 0 ? vectype_out : vectypes[info.type1]);
> + if (!type1)
> + type1 = vectype_in;
> if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
> OPTIMIZE_FOR_SPEED))
> return ifn;
> @@ -3252,16 +3257,31 @@ vectorizable_call (vec_info *vinfo,
> rhs_type = unsigned_type_node;
> }
>
> - int mask_opno = -1;
> + /* The argument that is not of the same type as the others. */
> + int diff_opno = -1;
> + bool masked = false;
> if (internal_fn_p (cfn))
> - mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
> + {
> + if (cfn == CFN_FTRUNC_INT)
> + /* For FTRUNC this represents the argument that carries the type of the
> + intermediate signed integer. */
> + diff_opno = 1;
> + else
> + {
> + /* For masked operations this represents the argument that carries the
> + mask. */
> + diff_opno = internal_fn_mask_index (as_internal_fn (cfn));
> + masked = diff_opno >= 0;
> + }
> + }
I think it would be cleaner not to process argument 1 at all for
CFN_FTRUNC_INT. There's no particular need to vectorise it.
> for (i = 0; i < nargs; i++)
> {
> - if ((int) i == mask_opno)
> + if ((int) i == diff_opno && masked)
> {
> - if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_opno,
> - &op, &slp_op[i], &dt[i], &vectypes[i]))
> + if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node,
> + diff_opno, &op, &slp_op[i], &dt[i],
> + &vectypes[i]))
> return false;
> continue;
> }
> […]
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 845228a055b2cfac0c9ca8c0cda1b9df4b0095c6..f1e9a1eb48769cb11aa69730e2480ed5522f78c1 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -6645,11 +6645,11 @@ valid_constant_size_p (const_tree size, cst_size_error *perr /* = NULL */)
> return true;
> }
>
> -/* Return the precision of the type, or for a complex or vector type the
> - precision of the type of its elements. */
> +/* Return the type, or for a complex or vector type the type of its
> + elements. */
>
> -unsigned int
> -element_precision (const_tree type)
> +tree
> +element_type (const_tree type)
> {
> if (!TYPE_P (type))
> type = TREE_TYPE (type);
> @@ -6657,7 +6657,16 @@ element_precision (const_tree type)
> if (code == COMPLEX_TYPE || code == VECTOR_TYPE)
> type = TREE_TYPE (type);
>
> - return TYPE_PRECISION (type);
> + return (tree) type;
I think we should stick a const_cast in element_precision and make
element_type take a plain “type”. As it stands element_type is an
implicit const_cast for many cases.
Thanks,
Richard
> +}
> +
> +/* Return the precision of the type, or for a complex or vector type the
> + precision of the type of its elements. */
> +
> +unsigned int
> +element_precision (const_tree type)
> +{
> + return TYPE_PRECISION (element_type (type));
> }
>
> /* Return true if CODE represents an associative tree code. Otherwise
next prev parent reply other threads:[~2021-12-17 12:44 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-11 17:51 Andre Vieira (lists)
2021-11-12 10:56 ` Richard Biener
2021-11-12 11:48 ` Andre Simoes Dias Vieira
2021-11-16 12:10 ` Richard Biener
2021-11-17 13:30 ` Andre Vieira (lists)
2021-11-17 15:38 ` Richard Sandiford
2021-11-18 11:05 ` Richard Biener
2021-11-22 11:38 ` Andre Vieira (lists)
2021-11-22 11:41 ` Richard Biener
2021-11-25 13:53 ` Andre Vieira (lists)
2021-12-07 11:29 ` Andre Vieira (lists)
2021-12-17 12:44 ` Richard Sandiford [this message]
2021-12-29 15:55 ` Andre Vieira (lists)
2021-12-29 16:54 ` Richard Sandiford
2022-01-03 12:18 ` Richard Biener
2022-01-10 14:09 ` Andre Vieira (lists)
2022-01-10 14:45 ` Richard Biener
2022-01-14 10:37 ` Richard Sandiford
2022-11-04 17:40 ` Andre Vieira (lists)
2022-11-07 11:05 ` Richard Biener
2022-11-07 14:19 ` Andre Vieira (lists)
2022-11-07 14:56 ` Richard Biener
2022-11-09 11:33 ` Andre Vieira (lists)
2022-11-15 18:24 ` Richard Sandiford
2022-11-16 12:25 ` Richard Biener
2021-11-29 11:17 ` Andre Vieira (lists)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mptee6b8kpi.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=andre.simoesdiasvieira@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).