public inbox for gcc-bugs@sourceware.org
From: "ams at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f   -O  execution test
Date: Wed, 26 Jun 2024 13:34:30 +0000
Message-ID: <bug-115640-4-abvPO6wSgq@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-115640-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640

--- Comment #14 from Andrew Stubbs <ams at gcc dot gnu.org> ---
On 26/06/2024 13:34, rguenth at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
> 
> --- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #12)
>> (In reply to Andrew Stubbs from comment #10)
>>> GFX10 has more limited permutation capabilities than GFX9 because it
>>> only has 32-lane vectors natively, even though we're using the 64-lane
>>> "compatibility" mode.
>>>
>>> However, in theory, the permutation capabilities on V32 and below should
>>> be the same, and some permutations on V64 are allowed, so I don't know
>>> why it doesn't use it. It's possible I broke the logic in
>>> gcn_vectorize_vec_perm_const:
>>>
>>>     /* RDNA devices can only do permutations within each group of 32-lanes.
>>>        Reject permutations that cross the boundary.  */
>>>     if (TARGET_RDNA2_PLUS)
>>>       for (unsigned int i = 0; i < nelt; i++)
>>>         if (i < 31 ? perm[i] > 31 : perm[i] < 32)
>>>           return false;
>>>
>>> It looks right to me though?
>>
>> nelt == 32 so I think the last element has the wrong check applied?
>>
>> It should be
>>
>>>         if (i < 32 ? perm[i] > 31 : perm[i] < 32)
>>
>> I think.  With that the vectorization happens in a similar way but the
>> failure still doesn't reproduce (without the patch, of course).

Oops, I think you're right.
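
To make the off-by-one concrete, here is a minimal stand-alone sketch of the
boundary check. It is only a model for illustration, not the code in the GCN
backend: the function name perm_stays_in_group and the "split" parameter are
hypothetical, and a plain array stands in for the real permutation vector.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the RDNA boundary check.  SPLIT is the first
   output lane that is treated as part of the high group: 31 reproduces
   the check as quoted, 32 is the suggested correction.  */
static bool
perm_stays_in_group (const unsigned char *perm, size_t nelt, unsigned split)
{
  for (size_t i = 0; i < nelt; i++)
    if (i < split ? perm[i] > 31 : perm[i] < 32)
      return false;   /* Lane i would read from the other group.  */
  return true;
}
```

With split == 31, output lane 31 is tested as if it were in the high group,
so even a 64-lane identity permutation is rejected; with split == 32 it is
accepted.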

> Btw, the above looks quite odd for nelt == 32 anyway - we are permuting
> two vectors src0 and src1 into one 32 element dst vector (it's no longer
> required that src0 and src1 line up with the dst vector size btw, they
> might have different nelt).  So the loop would reject interleaving
> the low parts of two 32 element vectors, a permute that would look like
> { 0, 32, 1, 33, 2, 34 ... } so does "within each group of 32-lanes"
> mean you can never mix the two vector inputs?  Or does GCN not have
> a two-to-one vector permute instruction?

GCN does not have a two-to-one vector permute in hardware, so we do two
permutes and a vec_merge to get the same effect.
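
As a rough scalar model of that sequence (a sketch only; the function name,
the array types, and the 64-lane upper bound are assumptions made for the
example, not the backend's actual expansion):

```c
#include <stddef.h>

/* Hypothetical scalar model: pick NELT results from the concatenation of
   SRC0 and SRC1, where each SEL[i] is in [0, 2*NELT).  Assumes
   NELT <= 64.  */
static void
two_input_permute (int *dst, const int *src0, const int *src1,
                   const unsigned *sel, size_t nelt)
{
  int tmp0[64], tmp1[64];

  /* Two single-input permutes, each using the selector modulo NELT.  */
  for (size_t i = 0; i < nelt; i++)
    {
      tmp0[i] = src0[sel[i] % nelt];
      tmp1[i] = src1[sel[i] % nelt];
    }

  /* The merge step: keep the lane from whichever input SEL[i] named.  */
  for (size_t i = 0; i < nelt; i++)
    dst[i] = sel[i] < nelt ? tmp0[i] : tmp1[i];
}
```

For example, with nelt == 4 and sel == { 0, 4, 1, 5 } the result interleaves
the first two elements of src0 and src1.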

GFX9 can permute all the elements within a 64 lane vector arbitrarily.

GFX10 and GFX11 can permute the low-32 and high-32 elements freely, but
no value may cross the boundary. AFAIK there's no way to move a value
across that boundary with any vector instruction (i.e. without writing
to memory, or extracting values element-wise).

In theory, we could implement permutes with different sized inputs and 
outputs, but right now those are rejected early. The interleave example 
wouldn't work in hardware, for GFX10, but we could have it for GFX9.

However, I think you might be right about the numbering of the "perm" 
array; we probably need to be testing "(perm[i] % nelt) > 31" if we are 
to support two-to-one permutations.
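
One possible reading of that idea, as a stand-alone sketch (the function
name rdna_perm_ok is hypothetical, and this is not a tested or committed
fix): reduce each selector modulo nelt to get the lane within whichever
input it names, then apply the same group test as before.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check for a two-input selector: PERM[i] is in
   [0, 2*NELT), and PERM[i] % NELT is the source lane within the
   selected input.  */
static bool
rdna_perm_ok (const unsigned *perm, size_t nelt)
{
  for (size_t i = 0; i < nelt; i++)
    {
      unsigned lane = perm[i] % nelt;
      /* Output lanes 0..31 may only read source lanes 0..31, and
         output lanes 32 and up may only read source lanes 32 and up.  */
      if (i < 32 ? lane > 31 : lane < 32)
        return false;
    }
  return true;
}
```

Under this model, for nelt == 64 a selector of 64 + 5 in output lane 0 is
accepted (lane 5 of the second input), while 64 + 40 in the same position is
rejected.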

Thanks for looking at this.

Andrew
