From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f   -O  execution test
Date: Wed, 26 Jun 2024 11:05:56 +0000
Message-ID: <bug-115640-4-zRcH9QwlaW@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-115640-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #7)
> I will have a look (and for run validation try to reproduce with gfx1036).

OK, so with gfx1036 we end up using 16 byte vectors and the testcase
passes.  The difference with gfx908 is

/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
note:   ==> examining statement: _14 = aa[_13];
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
note:   vect_model_load_cost: aligned.
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
note:   vect_model_load_cost: inside_cost = 2, prologue_cost = 0 .

vs.

/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
note:   ==> examining statement: _14 = aa[_13];
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
missed:   unsupported vect permute { 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 }
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
missed:   unsupported load permutation
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:19:72:
missed:   not vectorized: relevant stmt not supported: _14 = aa[_13];
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
note:   removing SLP instance operations starting from: REALPART_EXPR <(*hadcur_24(D))[_2]> = _86;
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
missed:  unsupported SLP instances
/space/rguenther/src/gcc-autopar_devel/gcc/testsuite/gfortran.dg/vect/pr115528.f:16:12:
note:  re-trying with SLP disabled

so gfx1036 cannot do such permutes but gfx908 can?
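
For readers unfamiliar with the notation, the permute in the "missed" note
above selects each of 16 source lanes twice in a row.  A minimal sketch of what
such a selector does to a vector (the helper name apply_permute and the sample
input are illustrative only, not vectorizer code; that the duplication feeds
the real and imaginary lanes of the complex computation is my reading, not
something the dump states):

```python
# Selector from the dump: { 0 0 1 1 2 2 ... 15 15 }, i.e. lane k reads source k // 2.
PERMUTE = [k // 2 for k in range(32)]

def apply_permute(source, selector):
    """Return a vector whose lane k holds source[selector[k]]."""
    return [source[s] for s in selector]

# Hypothetical input: 16 loaded elements, labelled by their position.
loaded = [f"aa{k}" for k in range(16)]
duplicated = apply_permute(loaded, PERMUTE)
print(duplicated[:6])
```

Each loaded element ends up in two adjacent lanes, so a target has to support
this interleaving duplication for the SLP load to be usable.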

On aarch64 with SVE we are using non-SLP and we're doing load-lanes in the
outer loop.  The reason also seems to be the unsupported load permutation,
but that's possibly because of VLA vectors; GCN uses fixed-size vectors with
loop masking.  So the better equivalent would have been x86-64 with loop
masking.

So looking again I think the loop mask in the inner loop is wrong.  We have

      do i = 1,4
         do j = 1,4
            HADCUR(I)=
     $         HADCUR(I)+CMPLX(COEF1)*FORM1*AA(I,J)
         end do
      end do

and the vectorizer sees

  <bb 3> [local count: 214748368]:
  # i_35 = PHI <i_27(7), 1(2)>
  # ivtmp_82 = PHI <ivtmp_81(7), 4(2)>
  _1 = (integer(kind=8)) i_35;
  _2 = _1 + -1;
  hadcur__I_RE_lsm.15_8 = REALPART_EXPR <(*hadcur_24(D))[_2]>;
  hadcur__I_IM_lsm.16_9 = IMAGPART_EXPR <(*hadcur_24(D))[_2]>;

  <bb 4> [local count: 858993456]:
  # j_36 = PHI <j_26(8), 1(3)>
...
  _10 = (integer(kind=8)) j_36;
  _11 = _10 * 4;
  _12 = _1 + _11;
  _13 = _12 + -5;
  _14 = aa[_13];
...
  j_26 = j_36 + 1;

  <bb 5> [local count: 214748368]:
  # _86 = PHI <_49(4)>
  # _85 = PHI <_50(4)>
  REALPART_EXPR <(*hadcur_24(D))[_2]> = _86;
  IMAGPART_EXPR <(*hadcur_24(D))[_2]> = _85;
  i_27 = i_35 + 1;
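
The address arithmetic in bb 4 can be replayed by hand.  A small sketch,
mirroring the SSA names from the dump (the function name aa_index is made up
for illustration):

```python
def aa_index(i, j):
    """0-based linear index the dump computes as _13 for AA(I,J)."""
    _1 = i            # _1 = (integer(kind=8)) i_35
    _11 = j * 4       # _11 = _10 * 4
    _12 = _1 + _11    # _12 = _1 + _11
    return _12 - 5    # _13 = _12 + -5

# One row per inner iteration j, one column per outer iteration i.
table = [[aa_index(i, j) for i in range(1, 5)] for j in range(1, 5)]
for row in table:
    print(row)
```

Consecutive outer iterations (i) touch adjacent elements of aa, while
consecutive inner iterations (j) are 4 elements apart, so vectorizing over i
gives a contiguous load for the inner-loop dataref.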

the loop mask { -1, -1, -1, -1, -1, -1, -1, -1, 0, .... } is OK for
the outer loop grouped load

  vect_hadcur__I_RE_lsm.20_76 = .MASK_LOAD (vectp_hadcur.18_79, 64B, loop_mask_77);

but for the inner loop we do

  vect__14.23_71 = .MASK_LOAD (vectp_aa.21_73, 64B, loop_mask_77);

with the same mask.  This fails to be pruned for the gap, which means
that my gap handling improvement relies on this case not ending up
in the masked load handling.  In fact, get_group_load_store_type doesn't
seem to be prepared for outer loop vectorization.  OTOH the inner loop
isn't "unrolled" (it has a VF of 1), so this might be a mistake in
loop mask handling and bad re-use.

As was said elsewhere, outer loop vectorization with inner loop datarefs
is compensating for a missed interchange.
