public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
Cc: Richard Sandiford <richard.sandiford@arm.com>,
	 Richard Biener <richard.guenther@gmail.com>,
	 "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 2/3] Refactor widen_plus as internal_fn
Date: Mon, 15 May 2023 12:21:54 +0000 (UTC)	[thread overview]
Message-ID: <nycvar.YFH.7.77.849.2305151216260.4723@jbgna.fhfr.qr> (raw)
In-Reply-To: <37abc128-776c-0f04-f755-30e514909e15@arm.com>

On Mon, 15 May 2023, Andre Vieira (lists) wrote:

> 
> 
> On 15/05/2023 12:01, Richard Biener wrote:
> > On Mon, 15 May 2023, Richard Sandiford wrote:
> > 
> >> Richard Biener <rguenther@suse.de> writes:
> >>> On Fri, 12 May 2023, Richard Sandiford wrote:
> >>>
> >>>> Richard Biener <rguenther@suse.de> writes:
> >>>>> On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> >>>>>
> >>>>>> I have dealt with, I think..., most of your comments. There's quite a
> >>>>>> few
> >>>>>> changes, I think it's all a bit simpler now. I made some other changes
> >>>>>> to the
> >>>>>> costing in tree-inline.cc and gimple-range-op.cc in which I try to
> >>>>>> preserve
> >>>>>> the same behaviour as we had with the tree codes before. Also added
> >>>>>> some extra
> >>>>>> checks to tree-cfg.cc that made sense to me.
> >>>>>>
> >>>>>> I am still regression testing the gimple-range-op change, as that was a
> >>>>>> last
> >>>>>> minute change, but the rest survived a bootstrap and regression test on
> >>>>>> aarch64-unknown-linux-gnu.
> >>>>>>
> >>>>>> cover letter:
> >>>>>>
> >>>>>> This patch replaces the existing tree_code widen_plus and widen_minus
> >>>>>> patterns with internal_fn versions.
> >>>>>>
> >>>>>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and
> >>>>>> DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> >>>>>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN
> >>>>>> respectively
> >>>>>> except they provide convenience wrappers for defining conversions that
> >>>>>> require
> >>>>>> a hi/lo split.  Each definition for <NAME> will require optabs for _hi
> >>>>>> and _lo
> >>>>>> and each of those will also require a signed and unsigned version in
> >>>>>> the case
> >>>>>> of widening. The hi/lo pair is necessary because the widening and
> >>>>>> narrowing
> >>>>>> operations take n narrow elements as inputs and return n/2 wide
> >>>>>> elements as
> >>>>>> outputs. The 'lo' operation operates on the first n/2 elements of
> >>>>>> input. The
> >>>>>> 'hi' operation operates on the second n/2 elements of input. Defining
> >>>>>> an
> >>>>>> internal_fn along with hi/lo variations allows a single internal
> >>>>>> function to
> >>>>>> be returned from a vect_recog function that will later be expanded to
> >>>>>> hi/lo.
> >>>>>>
> >>>>>>
> >>>>>>   For example:
> >>>>>>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> >>>>>> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
> >>>>>> (u/s)addl2
> >>>>>>                         IFN_VEC_WIDEN_PLUS_LO  ->
> >>>>>> vec_widen_<su>add_lo_<mode>
> >>>>>> -> (u/s)addl
> >>>>>>
> >>>>>> This gives the same functionality as the previous
> >>>>>> WIDEN_PLUS/WIDEN_MINUS tree
> >>>>>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> >>>>>
> >>>>> What I still don't understand is how we are so narrowly focused on
> >>>>> HI/LO?  We need a combined scalar IFN for pattern selection (not
> >>>>> sure why that's now called _HILO, I expected no suffix).  Then there's
> >>>>> three possibilities the target can implement this:
> >>>>>
> >>>>>   1) with a widen_[su]add<mode> instruction - I _think_ that's what
> >>>>>      RISCV is going to offer since it is a target where vector modes
> >>>>>      have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
> >>>>>      RVV can do a V4HI to V4SI widening and widening add/subtract
> >>>>>      using vwadd[u] and vwsub[u] (the HI->SI widening is actually
> >>>>>      done with a widening add of zero - eh).
> >>>>>      IIRC GCN is the same here.
> >>>>
> >>>> SVE currently does this too, but the addition and widening are
> >>>> separate operations.  E.g. in principle there's no reason why
> >>>> you can't sign-extend one operand, zero-extend the other, and
> >>>> then add the result together.  Or you could extend them from
> >>>> different sizes (QI and HI).  All of those are supported
> >>>> (if the costing allows them).
> >>>
> >>> I see.  So why does the target the expose widen_[su]add<mode> at all?
> >>
> >> It shouldn't (need to) do that.  I don't think we should have an optab
> >> for the unsplit operation.
> >>
> >> At least on SVE, we really want the extensions to be fused with loads
> >> (where possible) rather than with arithmetic.
> >>
> >> We can still do the widening arithmetic in one go.  It's just that
> >> fusing with the loads works for the mixed-sign and mixed-size cases,
> >> and can handle more than just doubling the element size.
> >>
> >>>> If the target has operations to do combined extending and adding (or
> >>>> whatever), then at the moment we rely on combine to generate them.
> >>>>
> >>>> So I think this case is separate from Andre's work.  The addition
> >>>> itself is just an ordinary addition, and any widening happens by
> >>>> vectorising a CONVERT/NOP_EXPR.
> >>>>
> >>>>>   2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
> >>>>>      codes currently support (exclusively)
> >>>>>   3) similar, but widen_[su]add{_even,_odd}<mode>
> >>>>>
> >>>>> that said, things like decomposes_to_hilo_fn_p look to paint us into
> >>>>> a 2) corner without good reason.
> >>>>
> >>>> I suppose one question is: how much of the patch is really specific
> >>>> to HI/LO, and how much is just grouping two halves together?
> >>>
> >>> Yep, that I don't know for sure.
> >>>
> >>>>   The nice
> >>>> thing about the internal-fn grouping macros is that, if (3) is
> >>>> implemented in future, the structure will strongly encourage even/odd
> >>>> pairs to be supported for all operations that support hi/lo.  That is,
> >>>> I would expect the grouping macros to be extended to define even/odd
> >>>> ifns alongside hi/lo ones, rather than adding separate definitions
> >>>> for even/odd functions.
> >>>>
> >>>> If so, at least from the internal-fn.* side of things, I think the
> >>>> question
> >>>> is whether it's OK to stick with hilo names for now, or whether we should
> >>>> use more forward-looking names.
> >>>
> >>> I think for parts that are independent we could use a more
> >>> forward-looking name.  Maybe _halves?
> >>
> >> Using _halves for the ifn macros sounds good to me FWIW.
> >>
> >>> But I'm also not sure
> >>> how much of that is really needed (it seems to be tied around
> >>> optimizing optabs space?)
> >>
> >> Not sure what you mean by "this".  Optabs space shouldn't be a problem
> >> though.  The optab encoding gives us a full int to play with, and it
> >> could easily go up to 64 bits if necessary/convenient.
> >>
> >> At least on the internal-fn.* side, the aim is really just to establish
> >> a regular structure, so that we don't have arbitrary differences between
> >> different widening operations, or too much cut-&-paste.
> > 
> > Hmm, I'm looking at the need for the std::map and
> > internal_fn_hilo_keys_array and internal_fn_hilo_values_array.
> > The vectorizer pieces contain
> > 
> > +  if (code.is_fn_code ())
> > +     {
> > +      internal_fn ifn = as_internal_fn ((combined_fn) code);
> > +      gcc_assert (decomposes_to_hilo_fn_p (ifn));
> > +
> > +      internal_fn lo, hi;
> > +      lookup_hilo_internal_fn (ifn, &lo, &hi);
> > +      *code1 = as_combined_fn (lo);
> > +      *code2 = as_combined_fn (hi);
> > +      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
> > +      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
> > 
> > so that tries to automatically associate the scalar widening IFN
> > with the set(s) of IFN pairs we can split to.  But then this
> > list should be static and there's no need to create a std::map?
> > Maybe gencfn-macros.cc can be enhanced to output these static
> > cases?  Or the vectorizer could (as it did previously) simply
> > open-code the handled cases (I guess since we deal with two
> > cases only now I'd prefer that).
> > 
> > Thanks,
> > Richard.
> > 
> > 
> >> Thanks,
> >> Richard
> >>
> > 
> The patch I uploaded last no longer has std::map nor
> internal_fn_hilo_keys_array and internal_fn_hilo_values_array. (I've attached
> it again)

Whoops, too many patches ...

> I'm not sure I understand the _halves, do you mean that for the case where I
> had _hilo or _HILO before we rename that to _halves/_HALVES such that it later
> represents both _hi/_lo separation and _even/_odd?

I don't see much shared stuff, but I guess we'd see when we add a case
for EVEN/ODD.  The verifier contains

+      else if (decomposes_to_hilo_fn_p (ifn))
+       {
+         /* Non decomposed HILO stmts should not appear in IL, these are
+            merely used as an internal representation to the 
auto-vectorizer
+            pass and should have been expanded to their _LO _HI variants.  
*/
+         error ("gimple call has an non decomposed HILO IFN");
+         debug_generic_stmt (fn);
+         return true;

I think to support case 1) that's not wanted.  Instead what you could
check is that the types involved are vector types, so a subset of
what you check for IFN_VEC_WIDEN_PLUS_LO etc. (but oddly it's not
verified those are all operating on vector types only?)

+/*  Given an internal_fn IFN that is a HILO function, return its 
corresponding
+    LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn 
*hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);

that might become fragile if we add EVEN/ODD besides HI/LO unless
we merge those with a DEF_INTERNAL_OPTAB_WIDENING_HILO_EVENODD_FN
case, right?

> And am I correct to assume we are just giving up on having a INTERNAL_OPTAB_FN
> idea for 1)?

Well, I think we want all of them in the end (or at least support them
if target need arises).  full vector, hi/lo and even/odd.

Richard.

  reply	other threads:[~2023-05-15 12:21 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-25  9:11 [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Joel Hutton
2022-05-27 13:23 ` Richard Biener
2022-05-31 10:07   ` Joel Hutton
2022-05-31 16:46     ` Tamar Christina
2022-06-01 10:11     ` Richard Biener
2022-06-06 17:20       ` Joel Hutton
2022-06-07  8:18         ` Richard Sandiford
2022-06-07  9:01           ` Joel Hutton
2022-06-09 14:03             ` Joel Hutton
2022-06-13  9:02             ` Richard Biener
2022-06-30 13:20               ` Joel Hutton
2022-07-12 12:32                 ` Richard Biener
2023-03-17 10:14                   ` Andre Vieira (lists)
2023-03-17 11:52                     ` Richard Biener
2023-04-20 13:23                       ` Andre Vieira (lists)
2023-04-24 11:57                         ` Richard Biener
2023-04-24 13:01                           ` Richard Sandiford
2023-04-25 12:30                             ` Richard Biener
2023-04-28 16:06                               ` Andre Vieira (lists)
2023-04-25  9:55                           ` Andre Vieira (lists)
2023-04-28 12:36                             ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
2023-05-03 11:55                               ` Richard Biener
2023-05-04 15:20                                 ` Andre Vieira (lists)
2023-05-05  6:09                                   ` Richard Biener
2023-05-12 12:14                                     ` Andre Vieira (lists)
2023-05-12 13:18                                       ` Richard Biener
2023-04-28 12:37                             ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists)
2023-05-03 12:11                               ` Richard Biener
2023-05-03 19:07                                 ` Richard Sandiford
2023-05-12 12:16                                   ` Andre Vieira (lists)
2023-05-12 13:28                                     ` Richard Biener
2023-05-12 13:55                                       ` Andre Vieira (lists)
2023-05-12 14:01                                       ` Richard Sandiford
2023-05-15 10:20                                         ` Richard Biener
2023-05-15 10:47                                           ` Richard Sandiford
2023-05-15 11:01                                             ` Richard Biener
2023-05-15 11:10                                               ` Richard Sandiford
2023-05-15 11:53                                               ` Andre Vieira (lists)
2023-05-15 12:21                                                 ` Richard Biener [this message]
2023-05-18 17:15                                                   ` Andre Vieira (lists)
2023-05-22 13:06                                                     ` Richard Biener
2023-06-01 16:27                                                       ` Andre Vieira (lists)
2023-06-02 12:00                                                         ` Richard Sandiford
2023-06-06 19:00                                                         ` Jakub Jelinek
2023-06-06 21:28                                                           ` [PATCH] modula2: Fix bootstrap Jakub Jelinek
2023-06-06 22:18                                                             ` Gaius Mulley
2023-06-07  8:42                                                             ` Andre Vieira (lists)
2023-06-13 14:48                                                               ` Jakub Jelinek
2023-04-28 12:37                             ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists)
2023-05-03 12:29                               ` Richard Biener
2023-05-10  9:15                                 ` Andre Vieira (lists)
2023-05-12 12:18                                   ` Andre Vieira (lists)
2022-06-13  9:18           ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.77.849.2305151216260.4723@jbgna.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=andre.simoesdiasvieira@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.guenther@gmail.com \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).