Re: [AArch64] Enable generation of FRINTNZ instructions

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Sandiford <richard.sandiford@arm.com>
To: "Andre Vieira \(lists\) via Gcc-patches" <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <rguenther@suse.de>,
	"Andre Vieira \(lists\)" <andre.simoesdiasvieira@arm.com>
Subject: Re: [AArch64] Enable generation of FRINTNZ instructions
Date: Fri, 17 Dec 2021 12:44:25 +0000	[thread overview]
Message-ID: <mptee6b8kpi.fsf@arm.com> (raw)
In-Reply-To: <f5312e85-b175-fbc9-8858-de611b3d99ac@arm.com> (Andre Vieira via Gcc-patches's message of "Thu, 25 Nov 2021 13:53:08 +0000")

"Andre Vieira (lists) via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
> On 22/11/2021 11:41, Richard Biener wrote:
>>
>>> On 18/11/2021 11:05, Richard Biener wrote:
>>>> This is a good shout and made me think about something I hadn't before... I
>>>> thought I could handle the vector forms later, but the problem is if I add
>>>> support for the scalar, it will stop the vectorizer. It seems
>>>> vectorizable_call expects all arguments to have the same type, which doesn't
>>>> work with passing the integer type as an operand work around.
>> We already special case some IFNs there (masked load/store and gather)
>> to ignore some args, so that would just add to this set.
>>
>> Richard.
> Hi,
>
> Reworked it to add support of the new IFN to the vectorizer. Was 
> initially trying to make vectorizable_call and 
> vectorizable_internal_function handle IFNs with different inputs more 
> generically, using the information we have in the <IFN>_direct structs 
> regarding what operands to get the modes from. Unfortunately, that 
> wasn't straightforward because of how vectorizable_call assumes operands 
> have the same type and uses the type of the DEF_STMT_INFO of the 
> non-constant operands (either output operand or non-constant inputs) to 
> determine the type of constants. I assume there is some reason why we 
> use the DEF_STMT_INFO and not always use get_vectype_for_scalar_type on 
> the argument types. That is why I ended up with this sort of half-way 
> mix of both, which still allows room to add more IFNs that don't take 
> inputs of the same type, but require adding a bit of special casing 
> similar to the IFN_FTRUNC_INT and masking ones.
>
> Bootstrapped on aarch64-none-linux.

Still leaving the match.pd stuff to Richard, but otherwise:

> index bdc8ba3576cf2c9b4ae96b45a382234e4e25b13f..51f00344b02d0d1d4adf97463f6a46f9fd0fb43f 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -160,7 +160,11 @@ (define_mode_iterator VHSDF_HSDF [(V4HF "TARGET_SIMD_F16INST")
>  				  SF DF])
>  
>  ;; Scalar and vetor modes for SF, DF.
> -(define_mode_iterator VSFDF [V2SF V4SF V2DF DF SF])
> +(define_mode_iterator VSFDF [ (V2SF "TARGET_SIMD")

Nit: excess space between [ and (.

> +			      (V4SF "TARGET_SIMD")
> +			      (V2DF "TARGET_SIMD")
> +			      (DF "TARGET_FLOAT")
> +			      (SF "TARGET_FLOAT")])
>  
>  ;; Advanced SIMD single Float modes.
>  (define_mode_iterator VDQSF [V2SF V4SF])
> […]
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 41f1850bf6e95005647ca97a495a97d7e184d137..d50d09b0ae60d98537b9aece4396a490f33f174c 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -6175,6 +6175,15 @@ operands; otherwise, it may not.
>  
>  This pattern is not allowed to @code{FAIL}.
>  
> +@cindex @code{ftrunc@var{m}@var{n}2} instruction pattern
> +@item @samp{ftrunc@var{m}@var{n}2}
> +Truncate operand 1 to a @var{n} mode signed integer, towards zero, and store
> +the result in operand 0. Both operands have mode @var{m}, which is a scalar or
> +vector floating-point mode.  Exception must be thrown if operand 1 does not fit

Maybe “An exception must be raised”?  “thrown” makes it sound like a
signal must be raised or C++ exception thrown.

> +in a @var{n} mode signed integer as it would have if the truncation happened
> +through separate floating point to integer conversion.
> +
> +
>  @cindex @code{round@var{m}2} instruction pattern
>  @item @samp{round@var{m}2}
>  Round operand 1 to the nearest integer, rounding away from zero in the
> […]
> @@ -3688,6 +3712,15 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
>  	  != CODE_FOR_nothing);
>  }
>  
> +static bool direct_ftrunc_int_optab_supported_p (convert_optab optab,
> +						 tree_pair types,
> +						 optimization_type opt_type)

Formatting nit: should be a line break after “bool”

> +{
> +  return (convert_optab_handler (optab, TYPE_MODE (types.first),
> +				TYPE_MODE (element_type (types.second)),
> +				opt_type) != CODE_FOR_nothing);
> +}
> +
>  #define direct_unary_optab_supported_p direct_optab_supported_p
>  #define direct_binary_optab_supported_p direct_optab_supported_p
>  #define direct_ternary_optab_supported_p direct_optab_supported_p
> […]
> diff --git a/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..b93304eb2acb3d3d954eebee51d77ff23fee68ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/frintnz_vec.c
> @@ -0,0 +1,47 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.5-a" } */
> +/* { dg-require-effective-target aarch64_frintnzx_ok } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +#define TEST(name,float_type,int_type)					\
> +void									\
> +name (float_type * __restrict__ x, float_type * __restrict__ y, int n)  \
> +{									\
> +  for (int i = 0; i < n; ++i)					      \
> +    {								      \
> +      int_type x_i = x[i];					      \
> +      y[i] = (float_type) x_i;					      \
> +    }								      \
> +}
> +
> +/*
> +** f1:
> +**	...
> +**	frint32z	v0.4s, v0.4s

I don't think we can rely on v0 being used here.  v[0-9]+\.4s would
be safer.

> +**	...
> +*/
> +TEST(f1, float, int)
> +
> +/*
> +** f2:
> +**	...
> +**	frint64z	v0.4s, v0.4s
> +**	...
> +*/
> +TEST(f2, float, long long)
> +
> +/*
> +** f3:
> +**	...
> +**	frint32z	v0.2d, v0.2d
> +**	...
> +*/
> +TEST(f3, double, int)
> +
> +/*
> +** f4:
> +**	...
> +**	frint64z	v0.2d, v0.2d
> +**	...
> +*/
> +TEST(f4, double, long long)
> […]
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 03cc7267cf80d4ce73c0d89ab86b07e84752456a..35bb1f70f7b173ad0d1e9f70ce0ac9da891dbe62 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1625,7 +1625,8 @@ vect_finish_stmt_generation (vec_info *vinfo,
>  
>  static internal_fn
>  vectorizable_internal_function (combined_fn cfn, tree fndecl,
> -				tree vectype_out, tree vectype_in)
> +				tree vectype_out, tree vectype_in,
> +				tree *vectypes)
>  {
>    internal_fn ifn;
>    if (internal_fn_p (cfn))
> @@ -1637,8 +1638,12 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
>        const direct_internal_fn_info &info = direct_internal_fn (ifn);
>        if (info.vectorizable)
>  	{
> -	  tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
> -	  tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
> +	  tree type0 = (info.type0 < 0 ? vectype_out : vectypes[info.type0]);
> +	  if (!type0)
> +	    type0 = vectype_in;
> +	  tree type1 = (info.type1 < 0 ? vectype_out : vectypes[info.type1]);
> +	  if (!type1)
> +	    type1 = vectype_in;
>  	  if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
>  					      OPTIMIZE_FOR_SPEED))
>  	    return ifn;
> @@ -3252,16 +3257,31 @@ vectorizable_call (vec_info *vinfo,
>        rhs_type = unsigned_type_node;
>      }
>  
> -  int mask_opno = -1;
> +  /* The argument that is not of the same type as the others.  */
> +  int diff_opno = -1;
> +  bool masked = false;
>    if (internal_fn_p (cfn))
> -    mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
> +    {
> +      if (cfn == CFN_FTRUNC_INT)
> +	/* For FTRUNC this represents the argument that carries the type of the
> +	   intermediate signed integer.  */
> +	diff_opno = 1;
> +      else
> +	{
> +	  /* For masked operations this represents the argument that carries the
> +	     mask.  */
> +	  diff_opno = internal_fn_mask_index (as_internal_fn (cfn));
> +	  masked = diff_opno >=  0;
> +	}
> +    }

I think it would be cleaner not to process argument 1 at all for
CFN_FTRUNC_INT.  There's no particular need to vectorise it.

>    for (i = 0; i < nargs; i++)
>      {
> -      if ((int) i == mask_opno)
> +      if ((int) i == diff_opno && masked)
>  	{
> -	  if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_opno,
> -				       &op, &slp_op[i], &dt[i], &vectypes[i]))
> +	  if (!vect_check_scalar_mask (vinfo, stmt_info, slp_node,
> +				       diff_opno, &op, &slp_op[i], &dt[i],
> +				       &vectypes[i]))
>  	    return false;
>  	  continue;
>  	}
> […]
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 845228a055b2cfac0c9ca8c0cda1b9df4b0095c6..f1e9a1eb48769cb11aa69730e2480ed5522f78c1 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -6645,11 +6645,11 @@ valid_constant_size_p (const_tree size, cst_size_error *perr /* = NULL */)
>    return true;
>  }
>  
> -/* Return the precision of the type, or for a complex or vector type the
> -   precision of the type of its elements.  */
> +/* Return the type, or for a complex or vector type the type of its
> +   elements.  */
>  
> -unsigned int
> -element_precision (const_tree type)
> +tree
> +element_type (const_tree type)
>  {
>    if (!TYPE_P (type))
>      type = TREE_TYPE (type);
> @@ -6657,7 +6657,16 @@ element_precision (const_tree type)
>    if (code == COMPLEX_TYPE || code == VECTOR_TYPE)
>      type = TREE_TYPE (type);
>  
> -  return TYPE_PRECISION (type);
> +  return (tree) type;

I think we should stick a const_cast in element_precision and make
element_type take a plain “type”.  As it stands element_type is an
implicit const_cast for many cases.

Thanks,
Richard

> +}
> +
> +/* Return the precision of the type, or for a complex or vector type the
> +   precision of the type of its elements.  */
> +
> +unsigned int
> +element_precision (const_tree type)
> +{
> +  return TYPE_PRECISION (element_type (type));
>  }
>  
>  /* Return true if CODE represents an associative tree code.  Otherwise

next prev parent reply	other threads:[~2021-12-17 12:44 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-11 17:51 Andre Vieira (lists)
2021-11-12 10:56 ` Richard Biener
2021-11-12 11:48   ` Andre Simoes Dias Vieira
2021-11-16 12:10     ` Richard Biener
2021-11-17 13:30       ` Andre Vieira (lists)
2021-11-17 15:38         ` Richard Sandiford
2021-11-18 11:05         ` Richard Biener
2021-11-22 11:38           ` Andre Vieira (lists)
2021-11-22 11:41             ` Richard Biener
2021-11-25 13:53               ` Andre Vieira (lists)
2021-12-07 11:29                 ` Andre Vieira (lists)
2021-12-17 12:44                 ` Richard Sandiford [this message]
2021-12-29 15:55                   ` Andre Vieira (lists)
2021-12-29 16:54                     ` Richard Sandiford
2022-01-03 12:18                     ` Richard Biener
2022-01-10 14:09                       ` Andre Vieira (lists)
2022-01-10 14:45                         ` Richard Biener
2022-01-14 10:37                         ` Richard Sandiford
2022-11-04 17:40                           ` Andre Vieira (lists)
2022-11-07 11:05                             ` Richard Biener
2022-11-07 14:19                               ` Andre Vieira (lists)
2022-11-07 14:56                                 ` Richard Biener
2022-11-09 11:33                                   ` Andre Vieira (lists)
2022-11-15 18:24                                 ` Richard Sandiford
2022-11-16 12:25                                   ` Richard Biener
2021-11-29 11:17           ` Andre Vieira (lists)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptee6b8kpi.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=andre.simoesdiasvieira@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).