From: Richard Biener <rguenther@suse.de>
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>,
"Andre Vieira (lists) via Gcc-patches" <gcc-patches@gcc.gnu.org>
Subject: Re: [AArch64] Enable generation of FRINTNZ instructions
Date: Wed, 16 Nov 2022 12:25:06 +0000 (UTC)
Message-ID: <nycvar.YFH.7.77.849.2211161224500.3995@jbgna.fhfr.qr>
In-Reply-To: <mptmt8sgjot.fsf@arm.com>
On Tue, 15 Nov 2022, Richard Sandiford wrote:
> "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes:
> > On 07/11/2022 11:05, Richard Biener wrote:
> >> On Fri, 4 Nov 2022, Andre Vieira (lists) wrote:
> >>
> >>> Sorry for the delay, just been reminded I still had this patch outstanding
> >>> from last stage 1. Hopefully since it has been mostly reviewed it could go in
> >>> for this stage 1?
> >>>
> >>> I addressed the comments and gave the slp-part of vectorizable_call some TLC
> >>> to make it work.
> >>>
> >>> I also changed vect_get_slp_defs as I noticed that the call from
> >>> vectorizable_call was creating an auto_vec with 'nargs' that might be less
> >>> than the number of children in the slp_node
> >> how so? Please fix that in the caller. It looks like it probably
> >> should use vect_nargs instead?
> > Well that was my first intuition, but when I looked at it further the
> > variant it's calling:
> > void vect_get_slp_defs (vec_info *, slp_tree slp_node, vec<vec<tree> >
> > *vec_oprnds, unsigned n)
> >
> > Is actually creating a vector of vectors of slp defs. So for each child
> > of slp_node it calls:
> > void vect_get_slp_defs (slp_tree slp_node, vec<tree> *vec_defs)
> >
> > Which returns a vector of vectorized defs. So vect_nargs would be the
> > right size for the inner vec<tree> of vec_defs, but the outer should
> > have the same number of elements as the original slp_node has children.
> >
> > However, at the call site (vectorizable_call), the operand we pass to
> > vect_get_slp_defs 'vec_defs', is initialized before the code-path is
> > specialized for slp_node. I'll go see if I can change the call site to
> > not have to do that, given the continue at the end of the if (slp_node)
> > BB I don't think it needs to use vec_defs after it, but it may require
> > some massaging to be able to define it separately for each code-path.
> >
> >>
> >>> , so that quick_push might not be
> >>> safe as is, so I added the reserve (n) to ensure it's safe to push. I didn't
> >>> actually come across any failure because of it though. Happy to split this
> >>> into a separate patch if needed.
> >>>
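The sizing concern discussed above can be illustrated with a minimal sketch. This is not GCC's code: checked_vec, collect_defs and def are hypothetical names, checked_vec is only a small model of a vector whose quick_push requires space to have been reserved first, and reserve is modelled here as "make room for n more elements". The point it shows is that the outer vector needs one slot per child of the SLP node, while each inner vector is sized independently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GCC-style vec<T>: quick_push assumes the caller
// has already reserved room, and this model asserts that assumption.
template <typename T>
struct checked_vec
{
  std::vector<T> elems;
  std::size_t capacity = 0;

  // Modelled as "make room for n more elements" (an assumption of this sketch).
  void reserve (std::size_t n)
  {
    if (elems.size () + n > capacity)
      capacity = elems.size () + n;
    elems.reserve (capacity);
  }

  // Pushing without reserved space trips the assertion instead of
  // silently writing past the end.
  void quick_push (const T &value)
  {
    assert (elems.size () < capacity && "quick_push without reserved space");
    elems.push_back (value);
  }

  std::size_t length () const { return elems.size (); }
};

using def = int; // stands in for a vectorized def (a tree in GCC)

// One inner vector per child of the node, so the outer vector is sized by
// the number of children, not by the call's argument count.
inline checked_vec<checked_vec<def>>
collect_defs (const std::vector<std::vector<def>> &children)
{
  checked_vec<checked_vec<def>> outer;
  outer.reserve (children.size ());
  for (const std::vector<def> &child : children)
    {
      checked_vec<def> inner;
      inner.reserve (child.size ());
      for (def d : child)
        inner.quick_push (d);
      outer.quick_push (inner);
    }
  return outer;
}
```

If the outer reserve were given a count smaller than the number of children, the assertion in quick_push would fire on the first child beyond that count, which is the failure mode the reserve (n) was meant to rule out.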
> >>> Bootstrapped and regression tested on aarch64-none-linux-gnu and
> >>> x86_64-pc-linux-gnu.
> >>>
> >>> OK for trunk?
> >> I'll leave final approval to Richard but
> >>
> >> - This only needs 1 bit, but occupies the full 16 to ensure a nice
> >> + This only needs 1 bit, but occupies the full 15 to ensure a nice
> >> layout. */
> >> unsigned int vectorizable : 16;
> >>
> >> you don't actually change the width of the bitfield. I would find
> >> it more natural to have
> >>
> >> signed int type0 : 7;
> >> signed int type0_vtrans : 1;
> >> signed int type1 : 7;
> >> signed int type1_vtrans : 1;
> >>
> >> with typeN_vtrans specifying how the types transform when vectorized.
> >> I would imagine another variant we could need is narrow/widen
> >> according to either result or other argument type? That said,
> >> just your flag would then be
> >>
> >> signed int type0 : 7;
> >> signed int pad : 1;
> >> signed int type1 : 7;
> >> signed int type1_vect_as_scalar : 1;
> >>
> >> ?
> > That's a cool idea! I'll leave it as a single bit for now like that, if
> > we want to re-use it for multiple transformations we will obviously need
> > to rename & give it more bits.
>
> I think we should steal bits from vectorizable rather than shrink
> type0 and type1 though. Then add a 14-bit padding field to show
> how many bits are left.
>
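A rough sketch of the layout suggested just above, under stated assumptions: fn_info_sketch and make_info are hypothetical names, type0 and type1 are assumed to keep an 8-bit width, and the flag name type1_is_scalar_p is taken from the ChangeLog. The new flag and the 14-bit padding come out of the 16 bits that vectorizable currently occupies.

```cpp
#include <cassert>

// Hypothetical stand-in for direct_internal_fn_info; not the real struct.
struct fn_info_sketch
{
  signed int type0 : 8;               // assumed unchanged width
  signed int type1 : 8;               // assumed unchanged width
  unsigned int vectorizable : 1;      // only one bit is actually needed
  unsigned int type1_is_scalar_p : 1; // new flag, taken from the spare bits
  unsigned int pad : 14;              // shows how many bits are still free
};

// Bit widths sum to 32: 8 + 8 + 1 + 1 + 14.  The resulting size is
// ABI-dependent; 4 bytes is what common ABIs with 32-bit int give.
static_assert (sizeof (fn_info_sketch) == 4, "expected a 32-bit layout");

// Illustrative helper that fills in the fields and zeroes the padding.
inline fn_info_sketch
make_info (int type0, int type1, bool vectorizable, bool type1_is_scalar)
{
  fn_info_sketch info = {};
  info.type0 = type0;
  info.type1 = type1;
  info.vectorizable = vectorizable;
  info.type1_is_scalar_p = type1_is_scalar;
  return info;
}
```

Keeping an explicit pad field means a later flag can be added by shrinking pad, without touching the widths of type0 and type1.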
> > @@ -3340,9 +3364,20 @@ vectorizable_call (vec_info *vinfo,
> > rhs_type = unsigned_type_node;
> > }
> >
> > + /* The argument that is not of the same type as the others. */
> > int mask_opno = -1;
> > + int scalar_opno = -1;
> > if (internal_fn_p (cfn))
> > - mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
> > + {
> > + internal_fn ifn = as_internal_fn (cfn);
> > + if (direct_internal_fn_p (ifn)
> > + && direct_internal_fn (ifn).type1_is_scalar_p)
> > + scalar_opno = direct_internal_fn (ifn).type1;
> > + else
> > + /* For masked operations this represents the argument that carries the
> > + mask. */
> > + mask_opno = internal_fn_mask_index (as_internal_fn (cfn));
>
> This doesn't seem logically like an else. We should do both.
>
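One way to read "we should do both" is sketched below. It is an illustration only: ifn_desc_sketch, opnos and find_special_operands are hypothetical, and the struct fields merely model what direct_internal_fn_p, direct_internal_fn (ifn) and internal_fn_mask_index would report in the quoted hunk.

```cpp
#include <cassert>

// Hypothetical description of an internal function for this sketch.
struct ifn_desc_sketch
{
  bool is_direct;         // models direct_internal_fn_p (ifn)
  bool type1_is_scalar_p; // the flag added by the patch
  int type1;              // models direct_internal_fn (ifn).type1
  int mask_index;         // models internal_fn_mask_index; -1 if no mask
};

struct opnos
{
  int mask_opno;
  int scalar_opno;
};

inline opnos
find_special_operands (const ifn_desc_sketch &desc)
{
  opnos result = { -1, -1 };

  // The argument that keeps its scalar type when vectorized, if any.
  if (desc.is_direct && desc.type1_is_scalar_p)
    result.scalar_opno = desc.type1;

  // Not an else: the mask operand is looked up independently, so a
  // function may have both a scalar-typed argument and a mask.
  result.mask_opno = desc.mask_index;

  return result;
}
```

With this shape, a function that has neither property still yields -1 for both indices, matching the initial values in the hunk.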
> LGTM otherwise for the bits outside match.pd. If Richard's happy with
> the match.pd bits then I think the patch is OK with those changes and
> without the vect_get_slp_defs thing (as you mentioned downthread).
Yes, the match.pd part looked OK.
> Thanks,
> Richard
>
>
> >>
> >>> gcc/ChangeLog:
> >>>
> >>> * config/aarch64/aarch64.md (ftrunc<mode><frintnz_mode>2): New
> >>> pattern.
> >>> * config/aarch64/iterators.md (FRINTNZ): New iterator.
> >>> (frintnz_mode): New int attribute.
> >>> (VSFDF): Make iterator conditional.
> >>> * internal-fn.def (FTRUNC_INT): New IFN.
> >>> * internal-fn.cc (ftrunc_int_direct): New define.
> >>> (expand_ftrunc_int_optab_fn): New custom expander.
> >>> (direct_ftrunc_int_optab_supported_p): New supported_p.
> >>> * internal-fn.h (direct_internal_fn_info): Add new member
> >>> type1_is_scalar_p.
> >>> * match.pd: Add to the existing TRUNC pattern match.
> >>> * optabs.def (ftrunc_int): New entry.
> >>> * stor-layout.h (element_precision): Moved from here...
> >>> * tree.h (element_precision): ... to here.
> >>> (element_type): New declaration.
> >>> * tree.cc (element_type): New function.
> >>> (element_precision): Changed to use element_type.
> >>> * tree-vect-stmts.cc (vectorizable_internal_function): Add support for
> >>> IFNs with different input types.
> >>> (vect_get_scalar_oprnds): New function.
> >>> (vectorizable_call): Teach to handle IFN_FTRUNC_INT.
> >>> * tree-vect-slp.cc (check_scalar_arg_ok): New function.
> >>> (vect_slp_analyze_node_operations): Use check_scalar_arg_ok.
> >>> (vect_get_slp_defs): Ensure vec_oprnds has enough slots to push.
> >>> * doc/md.texi: New entry for ftrunc pattern name.
> >>> * doc/sourcebuild.texi (aarch64_frintzx_ok): New target.
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>> * gcc.target/aarch64/merge_trunc1.c: Adapted to skip if frintnz
> >>> instructions available.
> >>> * lib/target-supports.exp: Added aarch64_frintnzx_ok target and
> >>> aarch64_frintz options.
> >>> * gcc.target/aarch64/frintnz.c: New test.
> >>> * gcc.target/aarch64/frintnz_vec.c: New test.
> >>> * gcc.target/aarch64/frintnz_slp.c: New test.
> >>>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)