Re: [AArch64] Enable generation of FRINTNZ instructions

public inbox for gcc-patches@gcc.gnu.org

From: Richard Sandiford <richard.sandiford@arm.com>
To: "Andre Vieira \(lists\)" <andre.simoesdiasvieira@arm.com>
Cc: Richard Biener <rguenther@suse.de>,
	 "Andre Vieira \(lists\) via Gcc-patches"
	<gcc-patches@gcc.gnu.org>
Subject: Re: [AArch64] Enable generation of FRINTNZ instructions
Date: Tue, 15 Nov 2022 18:24:18 +0000
Message-ID: <mptmt8sgjot.fsf@arm.com>
In-Reply-To: <1cdffce8-e0e5-3304-52b9-3b736a4d380d@arm.com> (Andre Vieira's message of "Mon, 7 Nov 2022 14:19:09 +0000")

"Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes:
> On 07/11/2022 11:05, Richard Biener wrote:
>> On Fri, 4 Nov 2022, Andre Vieira (lists) wrote:
>>
>>> Sorry for the delay, just been reminded I still had this patch outstanding
>>> from last stage 1. Hopefully since it has been mostly reviewed it could go in
>>> for this stage 1?
>>>
>>> I addressed the comments and gave the slp-part of vectorizable_call some TLC
>>> to make it work.
>>>
>>> I also changed vect_get_slp_defs as I noticed that the call from
>>> vectorizable_call was creating an auto_vec with 'nargs' that might be less
>>> than the number of children in the slp_node
>> how so?  Please fix that in the caller.  It looks like it probably
>> should use vect_nargs instead?
> Well that was my first intuition, but when I looked at it further the 
> variant it's calling:
> void vect_get_slp_defs (vec_info *, slp_tree slp_node, vec<vec<tree> > 
> *vec_oprnds, unsigned n)
>
> Is actually creating a vector of vectors of slp defs. So for each child 
> of slp_node it calls:
> void vect_get_slp_defs (slp_tree slp_node, vec<tree> *vec_defs)
>
> Which returns a vector of vectorized defs. So vect_nargs would be the 
> right size for the inner vec<tree> of vec_defs, but the outer should 
> have the same number of elements as the original slp_node has children.
>
> However, at the call site (vectorizable_call), the operand we pass to 
> vect_get_slp_defs 'vec_defs', is initialized before the code-path is 
> specialized for slp_node. I'll go see if I can change the call site to 
> not have to do that, given the continue at the end of the if (slp_node) 
> BB I don't think it needs to use vec_defs after it, but it may require 
> some massaging to be able to define it separately for each code-path.
>
>>
>>> , so that quick_push might not be
>>> safe as is, so I added the reserve (n) to ensure it's safe to push. I didn't
>>> actually come across any failure because of it though. Happy to split this
>>> into a separate patch if needed.
>>>
>>> Bootstrapped and regression tested on aarch64-none-linux-gnu and
>>> x86_64-pc-linux-gnu.
>>>
>>> OK for trunk?
>> I'll leave final approval to Richard but
>>
>> -     This only needs 1 bit, but occupies the full 16 to ensure a nice
>> +     This only needs 1 bit, but occupies the full 15 to ensure a nice
>>        layout.  */
>>     unsigned int vectorizable : 16;
>>
>> you don't actually change the width of the bitfield.  I would find
>> it more natural to have
>>
>>    signed int type0 : 7;
>>    signed int type0_vtrans : 1;
>>    signed int type1 : 7;
>>    signed int type1_vtrans : 1;
>>
>> with typeN_vtrans specifying how the types transform when vectorized.
>> I would imagine another variant we could need is narrow/widen
>> according to either result or other argument type?  That said,
>> just your flag would then be
>>
>>    signed int type0 : 7;
>>    signed int pad   : 1;
>>    signed int type1 : 7;
>>    signed int type1_vect_as_scalar : 1;
>>
>> ?
> That's a cool idea! I'll leave it as a single bit for now like that, if 
> we want to re-use it for multiple transformations we will obviously need 
> to rename & give it more bits.

I think we should steal bits from vectorizable rather than shrink
type0 and type1 though.  Then add a 14-bit padding field to show
how many bits are left.
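
As a rough sketch of that layout (this is not the actual internal-fn.h
definition; the struct name, the name of the padding field and the
initializer values are made up for illustration), I mean something like:

```cpp
// Hypothetical stand-in for direct_internal_fn_info.
// type0 and type1 keep their full 8 bits; the new flag and the explicit
// padding are carved out of the 16 bits that vectorizable used to occupy.
struct direct_internal_fn_info_sketch
{
  signed int type0 : 8;
  signed int type1 : 8;
  unsigned int vectorizable : 1;       // only ever needed 1 bit
  unsigned int type1_is_scalar_p : 1;  // new flag from the patch
  unsigned int padding : 14;           // shows how many bits are left
};

// Illustrative entry only: the values are assumptions, not the real
// ftrunc_int_direct definition.
constexpr direct_internal_fn_info_sketch example_info = { 0, 1, 1, 1, 0 };
```

Bit-field layout is implementation-defined, but on the usual ABIs this
should still fit in a single 32-bit word, as the current struct does.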

> @@ -3340,9 +3364,20 @@ vectorizable_call (vec_info *vinfo,
>        rhs_type = unsigned_type_node;
>      }
> 
> +  /* The argument that is not of the same type as the others.  */
>    int mask_opno = -1;
> +  int scalar_opno = -1;
>    if (internal_fn_p (cfn))
> -    mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
> +    {
> +      internal_fn ifn = as_internal_fn (cfn);
> +      if (direct_internal_fn_p (ifn)
> +	  && direct_internal_fn (ifn).type1_is_scalar_p)
> +	scalar_opno = direct_internal_fn (ifn).type1;
> +      else
> +	/* For masked operations this represents the argument that carries the
> +	   mask.  */
> +	mask_opno = internal_fn_mask_index (as_internal_fn (cfn));

This doesn't look like a logical else: the two checks are independent,
so we should do both.
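
To illustrate the shape I have in mind, here is a self-contained sketch.
It assumes C++14 or later, and fn_desc, special_opnos and
find_special_opnos are hypothetical stand-ins for the real GCC
queries (internal_fn_p, direct_internal_fn_p, internal_fn_mask_index);
none of them exist in the tree:

```cpp
// Hypothetical summary of what vectorizable_call would learn about cfn.
struct fn_desc
{
  bool internal_p;         // stands in for internal_fn_p (cfn)
  bool direct_p;           // stands in for direct_internal_fn_p (ifn)
  bool type1_is_scalar_p;  // the new flag from the patch
  int type1;               // index used for the scalar operand
  int mask_index;          // stands in for internal_fn_mask_index; -1 if none
};

struct special_opnos
{
  int mask_opno;
  int scalar_opno;
};

constexpr special_opnos
find_special_opnos (fn_desc d)
{
  special_opnos res = { -1, -1 };
  if (d.internal_p)
    {
      // Scalar operand: only for direct internal functions with the flag.
      if (d.direct_p && d.type1_is_scalar_p)
	res.scalar_opno = d.type1;
      // Mask operand: looked up unconditionally, not in an else.
      res.mask_opno = d.mask_index;
    }
  return res;
}
```

With this structure a function that had both a scalar operand and a mask
would report both indices instead of silently dropping the mask.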

LGTM otherwise for the bits outside match.pd.  If Richard's happy with
the match.pd bits then I think the patch is OK with those changes and
without the vect_get_slp_defs thing (as you mentioned downthread).

Thanks,
Richard


>>
>>> gcc/ChangeLog:
>>>
>>>          * config/aarch64/aarch64.md (ftrunc<mode><frintnz_mode>2): New pattern.
>>>          * config/aarch64/iterators.md (FRINTNZ): New iterator.
>>>          (frintnz_mode): New int attribute.
>>>          (VSFDF): Make iterator conditional.
>>>          * internal-fn.def (FTRUNC_INT): New IFN.
>>>          * internal-fn.cc (ftrunc_int_direct): New define.
>>>          (expand_ftrunc_int_optab_fn): New custom expander.
>>>          (direct_ftrunc_int_optab_supported_p): New supported_p.
>>>          * internal-fn.h (direct_internal_fn_info): Add new member
>>>          type1_is_scalar_p.
>>>          * match.pd: Add to the existing TRUNC pattern match.
>>>          * optabs.def (ftrunc_int): New entry.
>>>          * stor-layout.h (element_precision): Moved from here...
>>>          * tree.h (element_precision): ... to here.
>>>          (element_type): New declaration.
>>>          * tree.cc (element_type): New function.
>>>          (element_precision): Changed to use element_type.
>>>          * tree-vect-stmts.cc (vectorizable_internal_function): Add support for
>>>          IFNs with different input types.
>>>          (vect_get_scalar_oprnds): New function.
>>>          (vectorizable_call): Teach to handle IFN_FTRUNC_INT.
>>>          * tree-vect-slp.cc (check_scalar_arg_ok): New function.
>>>          (vect_slp_analyze_node_operations): Use check_scalar_arg_ok.
>>>          (vect_get_slp_defs): Ensure vec_oprnds has enough slots to push.
>>>          * doc/md.texi: New entry for ftrunc pattern name.
>>>          * doc/sourcebuild.texi (aarch64_frintnzx_ok): New target.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>          * gcc.target/aarch64/merge_trunc1.c: Adapted to skip if frintnz
>>>          instructions available.
>>>          * lib/target-supports.exp: Added aarch64_frintnzx_ok target and
>>>          aarch64_frintz options.
>>>          * gcc.target/aarch64/frintnz.c: New test.
>>>          * gcc.target/aarch64/frintnz_vec.c: New test.
>>>          * gcc.target/aarch64/frintnz_slp.c: New test.
>>>

