Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

public inbox for gcc-patches@gcc.gnu.org

From: Richard Biener <rguenther@suse.de>
To: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
Cc: gcc-patches@gcc.gnu.org, Richard.Sandiford@arm.com
Subject: Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE
Date: Thu, 1 Feb 2024 08:19:32 +0100 (CET)
Message-ID: <n61p6q9n-pqp5-9o94-3n33-9988p61056r2@fhfr.qr>
In-Reply-To: <e0448c57-8071-49b0-a551-c3831cf68d63@arm.com>

On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:

> 
> 
> On 31/01/2024 14:35, Richard Biener wrote:
> > On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> > 
> >>
> >>
> >> On 31/01/2024 13:58, Richard Biener wrote:
> >>> On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> >>>
> >>>>
> >>>>
> >>>> On 31/01/2024 12:13, Richard Biener wrote:
> >>>>> On Wed, 31 Jan 2024, Richard Biener wrote:
> >>>>>
> >>>>>> On Tue, 30 Jan 2024, Andre Vieira wrote:
> >>>>>>
> >>>>>>>
> >>>>>>> This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure
> >>>>>>> the target can reject a simd_clone based on the vector mode it is
> >>>>>>> using.  This is needed because for VLS SVE vectorization the
> >>>>>>> vectorizer accepts Advanced SIMD simd clones when vectorizing using
> >>>>>>> SVE types because the simdlens might match.  This will cause type
> >>>>>>> errors later on.
> >>>>>>>
> >>>>>>> Other targets do not currently need to use this argument.
> >>>>>>
> >>>>>> Can you instead pass down the mode?
> >>>>>
> >>>>> Thinking about that again the cgraph_simd_clone info in the clone
> >>>>> should have sufficient information to disambiguate.  If it doesn't
> >>>>> then we should amend it.
> >>>>>
> >>>>> Richard.
> >>>>
> >>>> Hi Richard,
> >>>>
> >>>> Thanks for the review.  I don't think cgraph_simd_clone_info is the
> >>>> right place to pass down this information, since this is information
> >>>> about the caller rather than the simdclone itself.  What we are trying
> >>>> to achieve here is making the vectorizer able to accept or reject
> >>>> simdclones based on the ISA we are vectorizing for.  To distinguish
> >>>> between SVE and Advanced SIMD ISAs we use modes; I am also not sure
> >>>> that's ideal, but it is what we currently use.  So to answer your
> >>>> earlier question: yes, I can also pass down the mode if that's
> >>>> preferable.
> >>>
> >>> Note cgraph_simd_clone_info has simdlen and we seem to check elsewhere
> >>> whether that's POLY or constant.  I wonder how aarch64_sve_mode_p
> >>> comes into play here which in the end classifies VLS SVE modes as
> >>> non-SVE?
> >>>
> >>
> >> Using -msve-vector-bits=128
> >> (gdb) p TYPE_MODE (STMT_VINFO_VECTYPE (stmt_vinfo))
> >> $4 = E_VNx4SImode
> >> (gdb) p  TYPE_SIZE (STMT_VINFO_VECTYPE (stmt_vinfo))
> >> $5 = (tree) 0xfffff741c1b0
> >> (gdb) p debug (TYPE_SIZE (STMT_VINFO_VECTYPE (stmt_vinfo)))
> >> 128
> >> (gdb) p aarch64_sve_mode_p (TYPE_MODE (STMT_VINFO_VECTYPE (stmt_vinfo)))
> >> $5 = true
> >>
> >> and for reference without vls codegen:
> >> (gdb) p TYPE_MODE (STMT_VINFO_VECTYPE (stmt_vinfo))
> >> $1 = E_VNx4SImode
> >> (gdb) p  debug (TYPE_SIZE (STMT_VINFO_VECTYPE (stmt_vinfo)))
> >> POLY_INT_CST [128, 128]
> >>
> >> Having said that, I believe that the USABLE targethook implementation
> >> for aarch64 should also block other uses, like an Advanced SIMD mode
> >> being used as input for an SVE VLS simdclone.  The reason is that, for
> >> instance, 'half' registers like VNx2SI are packed differently from V2SI.
> >>
> >> We could teach the vectorizer to support these of course, but that
> >> requires more work and is not extremely useful just yet.  I'll add that
> >> extra check to the patch once we agree on how to pass down the
> >> information we need.  Happy to use either mode, or stmt_vec_info and
> >> extract the mode from it like it does now.
> > 
> > As said, please pass down 'mode'.  But I wonder how to document it,
> > which mode is that supposed to be?  Any of result or any argument
> > mode that happens to be a vector?  I think that we might be able
> > to mix Advanced SIMD modes and SVE modes with -msve-vector-bits=128
> > in the same loop?
> > 
> > Are the simd clones you don't want to use with -msve-vector-bits=128
> > having constant simdlen?  If so why do you generate them in the first
> > place?
> 
> So this is where things get a bit confusing and I will write up some text for
> these cases to put in our ABI document (currently in Beta and in need of some
> tlc).
> 
> Our intended behaviour is for a 'declare simd' without a simdlen to generate
> simdclones for:
> * Advanced SIMD 128 and 64-bit vectors, where possible (we don't allow for
> simdlen 1, Tamar fixed that in gcc recently),
> * SVE VLA vectors.
> 
> Let me illustrate this with an example:
> 
> __attribute__ ((simd (notinbranch), const)) float cosf(float);
> 
> Should tell the compiler the following simd clones are available:
> __ZGVnN4v_cosf 128-bit 4x4 float Advanced SIMD clone
> __ZGVnN2v_cosf 64-bit  4x2 float Advanced SIMD clone
> __ZGVsMxv_cosf [128, 128]-bit 4x4xN SVE SIMD clone
> 
> [To save you looking into the ABI, let me break this down: _ZGV is the
> prefix, then 'n' or 's' picks between Advanced SIMD and SVE, 'N' or 'M'
> picks between Not Masked and Masked (SVE is always masked even if we ask
> for notinbranch), then a digit or 'x' picks between Vector Length and VLA,
> and after that you get one letter per argument, where v = vector mapped.]
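
As a reading aid, the naming scheme described above can be sketched as a
small decoder.  This is only an illustration under stated assumptions:
struct vecfn_name and decode_vecfn_name are hypothetical names and not part
of GCC, only the fields mentioned in the quoted text are handled, and both
one and two leading underscores are accepted because the thread shows both
spellings.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type for illustration, not a GCC data structure.  */
struct vecfn_name
{
  char isa;          /* 'n' = Advanced SIMD, 's' = SVE.  */
  bool masked;       /* true for 'M', false for 'N'.  */
  bool vla;          /* true when the length field is 'x'.  */
  unsigned simdlen;  /* Lane count when !vla, otherwise 0.  */
  char params[16];   /* One letter per argument, e.g. "v".  */
  const char *base;  /* Points at the scalar function name.  */
};

/* Decode NAME of the form _ZGV<isa><mask><len><params>_<base>.
   Return true on success and fill in *OUT.  */
static bool
decode_vecfn_name (const char *name, struct vecfn_name *out)
{
  assert (name != NULL && out != NULL);
  memset (out, 0, sizeof *out);

  /* Skip leading underscores, then expect the "ZGV" prefix.  */
  while (*name == '_')
    name++;
  if (strncmp (name, "ZGV", 3) != 0)
    return false;
  name += 3;

  if (*name != 'n' && *name != 's')
    return false;
  out->isa = *name++;

  if (*name != 'N' && *name != 'M')
    return false;
  out->masked = (*name++ == 'M');

  if (*name == 'x')
    {
      out->vla = true;
      name++;
    }
  else if (isdigit ((unsigned char) *name))
    {
      char *end;
      out->simdlen = (unsigned) strtoul (name, &end, 10);
      name = end;
    }
  else
    return false;

  /* Copy the per-argument letters up to the separating underscore.  */
  size_t n = 0;
  while (*name != '\0' && *name != '_')
    {
      if (n + 1 >= sizeof out->params)
        return false;
      out->params[n++] = *name++;
    }
  if (*name != '_' || n == 0)
    return false;

  out->base = name + 1;
  return true;
}
```

With this sketch, "__ZGVnN4v_cosf" decodes to an unmasked Advanced SIMD
clone with simdlen 4, and "__ZGVsMxv_cosf" to a masked, variable-length SVE
clone.
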
> 
> Regardless of -msve-vector-bits, however, the vectorizer (and any other part
> of the compiler) may assume that the VL of the VLA SVE clone is that specified
> by -msve-vector-bits, which if the clone is written in a VLA way will still
> work.
> 
> If the attribute is used with a function definition rather than declaration,
> so:
> 
> __attribute__ ((simd (notinbranch), const)) float fn0(float a)
> {
>   return a + 1.0f;
> }
> 
> the compiler should again generate the three simd clones:
> __ZGVnN4v_fn0 128-bit 4x4 float Advanced SIMD clone
> __ZGVnN2v_fn0 64-bit  4x2 float Advanced SIMD clone
> __ZGVsMxv_fn0 [128, 128]-bit 4x4xN SVE SIMD clone
> 
> However, in the last one it may assume a VL for the codegen of the body, and
> it's the user's responsibility to only use it for targets with that length,
> much like any other code produced this way.
> 
> So that's what we tell the compiler is available and what the compiler
> generates depending on where we use the attribute. The question at hand here
> is, what can the vectorizer use for a specific loop. If we are using Advanced
> SIMD modes then it needs to call an Advanced SIMD clone, and if we are using
> SVE modes then it needs to call an SVE clone. At least until we support the
> ABI conversion, because like I said for an unpacked argument they behave
> differently.
> 
> PS: In the future OpenMP may add specifications that allow us to define a
> specific VLA simdlen... in other words, whether we want [128, 128] or [256,
> 256], [512, 512] ... etc, but that still needs agreement on the OpenMP Spec,
> which is why for now we piggy back on the simdlen-less definition to provide
> us a VLA SVE simdclone with [128, 128] VL.
> 
> Hopefully this makes things a bit clearer :/

So where does it go wrong?  What case does the patch fix?  For
the non-definition case the SVE clone should have a POLY_INT simdlen
and as you say it should be fine to use that even with -msve-vector-bits.
For the definition case the SVE clone might have a constant simdlen
but so does the caller (unless we allow different settings between
functions/TUs?).  The only thing the vectorizer looks at is I think

        if (!constant_multiple_p (vf * group_size, n->simdclone->simdlen,
                                  &num_calls)
            || (!n->simdclone->inbranch && (masked_call_offset > 0))
            || (nargs != simd_nargs))
          continue;

plus your 2nd patch rejecting num_calls > 1 for variable-length SVE.

The patch didn't come with a testcase so it's really hard to tell
what goes wrong now and how it is fixed ...
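
To make the arithmetic of that check concrete, here is a toy model of the
"constant multiple" test.  It is a sketch only and not GCC's poly-int
implementation: struct poly2 and poly2_constant_multiple_p are hypothetical
names, a value is modelled as c0 + c1 * x with x known only at run time, and
the divisor is assumed to have a nonzero c0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a two-coefficient poly value: c0 + c1 * x, where x is
   only known at run time.  c1 == 0 models a compile-time constant.
   Hypothetical type for illustration, not GCC's poly_uint64.  */
struct poly2
{
  unsigned long c0;
  unsigned long c1;
};

/* Return true if A == m * B for a single constant m that holds for every
   x, storing m in *MULT.  Assumes B.c0 is nonzero.  */
static bool
poly2_constant_multiple_p (struct poly2 a, struct poly2 b,
                           unsigned long *mult)
{
  assert (b.c0 != 0 && mult != NULL);

  /* The constant parts must divide exactly.  */
  if (a.c0 % b.c0 != 0)
    return false;
  unsigned long m = a.c0 / b.c0;

  /* The same factor must also relate the runtime-scaled parts.  */
  if (a.c1 != m * b.c1)
    return false;

  *mult = m;
  return true;
}
```

In this model a constant vf * group_size of 8 against a constant simdlen of
4 gives num_calls = 2, [4, 4] against [4, 4] gives num_calls = 1, and
[4, 4] against a constant 4 is rejected because no single constant factor
relates the two.
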

Richard.

> > 
> > That said, I wonder how we end up mixing things up in the first place.
> > 
> > Richard.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Thread overview: 23+ messages
2024-01-30 14:31 [PATCH 0/3] vect, aarch64: Add SVE support for simdclones Andre Vieira
2024-01-30 14:31 ` [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE Andre Vieira
2024-01-31 12:11   ` Richard Biener
2024-01-31 12:13     ` Richard Biener
2024-01-31 13:52       ` Andre Vieira (lists)
2024-01-31 13:58         ` Richard Biener
2024-01-31 14:03           ` Richard Biener
2024-01-31 16:13             ` Andre Vieira (lists)
2024-01-31 14:35           ` Andre Vieira (lists)
2024-01-31 14:35             ` Richard Biener
2024-01-31 16:36               ` Andre Vieira (lists)
2024-02-01  7:19                 ` Richard Biener [this message]
2024-02-01 17:01                   ` Andre Vieira (lists)
2024-02-05  9:56                     ` Richard Biener
2024-02-26 16:56                       ` Andre Vieira (lists)
2024-02-27  8:47                         ` Richard Biener
2024-02-28 17:25                           ` Andre Vieira (lists)
2024-02-29  7:26                             ` Richard Biener
2024-02-01  7:59                 ` Richard Sandiford
2024-01-30 14:31 ` [PATCH 2/3] vect: disable multiple calls of poly simdclones Andre Vieira
2024-01-31 12:13   ` Richard Biener
2024-01-30 14:31 ` [PATCH 3/3] aarch64: Add SVE support for simd clones [PR 96342] Andre Vieira
2024-02-01 21:59   ` Richard Sandiford
