public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: "Andre Vieira \(lists\)" <andre.simoesdiasvieira@arm.com>,
	 Richard Biener <richard.guenther@gmail.com>,
	 "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 2/3] Refactor widen_plus as internal_fn
Date: Mon, 15 May 2023 11:47:15 +0100	[thread overview]
Message-ID: <mptr0rh7txo.fsf@arm.com> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2305151018120.4723@jbgna.fhfr.qr> (Richard Biener's message of "Mon, 15 May 2023 10:20:54 +0000 (UTC)")

Richard Biener <rguenther@suse.de> writes:
> On Fri, 12 May 2023, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > On Fri, 12 May 2023, Andre Vieira (lists) wrote:
>> >
>> >> I have dealt with, I think..., most of your comments. There's quite a few
>> >> changes, I think it's all a bit simpler now. I made some other changes to the
>> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
>> >> the same behaviour as we had with the tree codes before. Also added some extra
>> >> checks to tree-cfg.cc that made sense to me.
>> >> 
>> >> I am still regression testing the gimple-range-op change, as that was a last
>> >> minute change, but the rest survived a bootstrap and regression test on
>> >> aarch64-unknown-linux-gnu.
>> >> 
>> >> cover letter:
>> >> 
>> >> This patch replaces the existing tree_code widen_plus and widen_minus
>> >> patterns with internal_fn versions.
>> >> 
>> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
>> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
>> >> except they provide convenience wrappers for defining conversions that require
>> >> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
>> >> and each of those will also require a signed and unsigned version in the case
>> >> of widening. The hi/lo pair is necessary because the widening and narrowing
>> >> operations take n narrow elements as inputs and return n/2 wide elements as
>> >> outputs. The 'lo' operation operates on the first n/2 elements of input. The
>> >> 'hi' operation operates on the second n/2 elements of input. Defining an
>> >> internal_fn along with hi/lo variations allows a single internal function to
>> >> be returned from a vect_recog function that will later be expanded to hi/lo.
>> >> 
>> >> 
>> >>  For example:
>> >>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>> >> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
>> >> (u/s)addl2
>> >>                        IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode>
>> >> -> (u/s)addl
>> >> 
>> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
>> >
>> > What I still don't understand is how we are so narrowly focused on
>> > HI/LO?  We need a combined scalar IFN for pattern selection (not
>> > sure why that's now called _HILO, I expected no suffix).  Then there's
>> > three possibilities the target can implement this:
>> >
>> >  1) with a widen_[su]add<mode> instruction - I _think_ that's what
>> >     RISCV is going to offer since it is a target where vector modes
>> >     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
>> >     RVV can do a V4HI to V4SI widening and widening add/subtract
>> >     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
>> >     done with a widening add of zero - eh).
>> >     IIRC GCN is the same here.
>> 
>> SVE currently does this too, but the addition and widening are
>> separate operations.  E.g. in principle there's no reason why
>> you can't sign-extend one operand, zero-extend the other, and
>> then add the result together.  Or you could extend them from
>> different sizes (QI and HI).  All of those are supported
>> (if the costing allows them).
>
> I see.  So why does the target the expose widen_[su]add<mode> at all?

It shouldn't (need to) do that.  I don't think we should have an optab
for the unsplit operation.

At least on SVE, we really want the extensions to be fused with loads
(where possible) rather than with arithmetic.

We can still do the widening arithmetic in one go.  It's just that
fusing with the loads works for the mixed-sign and mixed-size cases,
and can handle more than just doubling the element size.

>> If the target has operations to do combined extending and adding (or
>> whatever), then at the moment we rely on combine to generate them.
>> 
>> So I think this case is separate from Andre's work.  The addition
>> itself is just an ordinary addition, and any widening happens by
>> vectorising a CONVERT/NOP_EXPR.
>> 
>> >  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
>> >     codes currently support (exclusively)
>> >  3) similar, but widen_[su]add{_even,_odd}<mode>
>> >
>> > that said, things like decomposes_to_hilo_fn_p look to paint us into
>> > a 2) corner without good reason.
>> 
>> I suppose one question is: how much of the patch is really specific
>> to HI/LO, and how much is just grouping two halves together?
>
> Yep, that I don't know for sure.
>
>>  The nice
>> thing about the internal-fn grouping macros is that, if (3) is
>> implemented in future, the structure will strongly encourage even/odd
>> pairs to be supported for all operations that support hi/lo.  That is,
>> I would expect the grouping macros to be extended to define even/odd
>> ifns alongside hi/lo ones, rather than adding separate definitions
>> for even/odd functions.
>> 
>> If so, at least from the internal-fn.* side of things, I think the question
>> is whether it's OK to stick with hilo names for now, or whether we should
>> use more forward-looking names.
>
> I think for parts that are independent we could use a more
> forward-looking name.  Maybe _halves?

Using _halves for the ifn macros sounds good to me FWIW.

> But I'm also not sure
> how much of that is really needed (it seems to be tied around
> optimizing optabs space?)

Not sure what you mean by "this".  Optabs space shouldn't be a problem
though.  The optab encoding gives us a full int to play with, and it
could easily go up to 64 bits if necessary/convenient.

At least on the internal-fn.* side, the aim is really just to establish
a regular structure, so that we don't have arbitrary differences between
different widening operations, or too much cut-&-paste.

Thanks,
Richard

  reply	other threads:[~2023-05-15 10:47 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-25  9:11 [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Joel Hutton
2022-05-27 13:23 ` Richard Biener
2022-05-31 10:07   ` Joel Hutton
2022-05-31 16:46     ` Tamar Christina
2022-06-01 10:11     ` Richard Biener
2022-06-06 17:20       ` Joel Hutton
2022-06-07  8:18         ` Richard Sandiford
2022-06-07  9:01           ` Joel Hutton
2022-06-09 14:03             ` Joel Hutton
2022-06-13  9:02             ` Richard Biener
2022-06-30 13:20               ` Joel Hutton
2022-07-12 12:32                 ` Richard Biener
2023-03-17 10:14                   ` Andre Vieira (lists)
2023-03-17 11:52                     ` Richard Biener
2023-04-20 13:23                       ` Andre Vieira (lists)
2023-04-24 11:57                         ` Richard Biener
2023-04-24 13:01                           ` Richard Sandiford
2023-04-25 12:30                             ` Richard Biener
2023-04-28 16:06                               ` Andre Vieira (lists)
2023-04-25  9:55                           ` Andre Vieira (lists)
2023-04-28 12:36                             ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
2023-05-03 11:55                               ` Richard Biener
2023-05-04 15:20                                 ` Andre Vieira (lists)
2023-05-05  6:09                                   ` Richard Biener
2023-05-12 12:14                                     ` Andre Vieira (lists)
2023-05-12 13:18                                       ` Richard Biener
2023-04-28 12:37                             ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists)
2023-05-03 12:11                               ` Richard Biener
2023-05-03 19:07                                 ` Richard Sandiford
2023-05-12 12:16                                   ` Andre Vieira (lists)
2023-05-12 13:28                                     ` Richard Biener
2023-05-12 13:55                                       ` Andre Vieira (lists)
2023-05-12 14:01                                       ` Richard Sandiford
2023-05-15 10:20                                         ` Richard Biener
2023-05-15 10:47                                           ` Richard Sandiford [this message]
2023-05-15 11:01                                             ` Richard Biener
2023-05-15 11:10                                               ` Richard Sandiford
2023-05-15 11:53                                               ` Andre Vieira (lists)
2023-05-15 12:21                                                 ` Richard Biener
2023-05-18 17:15                                                   ` Andre Vieira (lists)
2023-05-22 13:06                                                     ` Richard Biener
2023-06-01 16:27                                                       ` Andre Vieira (lists)
2023-06-02 12:00                                                         ` Richard Sandiford
2023-06-06 19:00                                                         ` Jakub Jelinek
2023-06-06 21:28                                                           ` [PATCH] modula2: Fix bootstrap Jakub Jelinek
2023-06-06 22:18                                                             ` Gaius Mulley
2023-06-07  8:42                                                             ` Andre Vieira (lists)
2023-06-13 14:48                                                               ` Jakub Jelinek
2023-04-28 12:37                             ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists)
2023-05-03 12:29                               ` Richard Biener
2023-05-10  9:15                                 ` Andre Vieira (lists)
2023-05-12 12:18                                   ` Andre Vieira (lists)
2022-06-13  9:18           ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptr0rh7txo.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=andre.simoesdiasvieira@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).