public inbox for gcc-bugs@sourceware.org
From: "vries at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug fortran/95654] nvptx offloading: FAIL: libgomp.fortran/pr66199-5.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
Date: Thu, 17 Sep 2020 15:19:40 +0000
Message-ID: <bug-95654-4-JzwNxx1bsE@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-95654-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95654

--- Comment #13 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #11)
> My guess at this point, is that duplicating the block with VOTE_ANY has the
> effect that the JIT compiler doesn't recognize control flow divergence
> before XCHG_IDX, and fails to insert the proper barrier.

Turns out, it's not that complicated.

Before ftracer we have:
...
  <bb 4> [local count: 268435456]:
  _30 = _18 + _27;
  _31 = _18 + _28;
  _46 = .GOMP_SIMT_ENTER_ALLOC (0, 1);
  _47 = .GOMP_SIMT_LANE ();
  _48 = (int) _47;
  _49 = _30 + _48;
  if (_31 > _49)
    goto <bb 8>; [87.50%]
  else
    goto <bb 5>; [12.50%]

  <bb 8> [local count: 117440512]:
  ...
  goto <bb 5>; [100.00%]

  <bb 5> [local count: 134217728]:
  # _54 = PHI <_50(D)(4), _67(8)>
  # _34 = PHI <_49(4), _71(8)>
  _55 = _34 == 63;
  _56 = (int) _55;
  _57 = .GOMP_SIMT_VOTE_ANY (_56);
  if (_57 != 0)
    goto <bb 7>; [50.00%]
  else
    goto <bb 6>; [50.00%]

  <bb 7> [local count: 67108864]:
  _58 = .GOMP_SIMT_LAST_LANE (_56);
  _60 = .GOMP_SIMT_XCHG_IDX (_54, _58);
  _61 = _60 + 1;
  goto <bb 6>; [100.00%]

  <bb 6> [local count: 268435456]:
  # d1_6 = PHI <_61(7), d1_29(D)(5)>
  *_46 ={v} {CLOBBER};
  .GOMP_SIMT_EXIT (_46);
  if (_31 == 32)
    goto <bb 11>; [34.00%]
  else
    goto <bb 9>; [66.00%]
...

At bb4 entry, we have unified control flow (that is, all threads in the warp
execute the same code in lockstep).

That's no longer the case at bb5/bb8.  In team 0, threads 0..15 execute the
loop body (bb8), and threads 16..31 don't.  In team 1, it's the opposite.

However, at bb5 the control flow from bb4 and bb8 joins, so control flow is
once again unified.

Then VOTE_ANY is executed in bb5, with team 1 subsequently going to the block
with XCHG_IDX (bb7), and team 0 skipping straight to bb6.
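
To make the pre-ftracer behaviour concrete, here is a minimal lane-level
sketch in Python. It is a simplified model, not how the nvptx backend or the
hardware implements the vote: vote_any, WARP_SIZE and the per-lane values are
hypothetical names and numbers chosen for illustration (only lane 31 is
assumed to hold 63). It shows that when all lanes rejoin in bb5 before the
vote, every lane sees the same result and the whole warp takes the same branch.

```python
WARP_SIZE = 32  # assumption: one warp of 32 lanes


def vote_any(predicates, active):
    """Model of a warp-wide 'any' vote: each active lane gets the same result."""
    result = any(predicates[lane] for lane in active)
    return {lane: result for lane in active}


# Hypothetical per-lane values of _34; only lane 31 holds 63.
values = [lane + 32 for lane in range(WARP_SIZE)]
predicates = [v == 63 for v in values]

# Before ftracer: the paths from bb4 and bb8 rejoin in bb5,
# so all 32 lanes take part in a single vote.
unified = vote_any(predicates, range(WARP_SIZE))
lanes_in_bb7 = [lane for lane, taken in unified.items() if taken]

print(len(lanes_in_bb7))  # number of lanes that reach the XCHG_IDX block
```

In this model the whole warp reaches bb7 together, which matches the
description of team 1 above.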

After ftracer, we have:
...
  <bb 5> [local count: 16777216]:
  # _54 = PHI <_50(D)(4)>
  # _34 = PHI <_49(4)>
  _55 = _34 == 63;
  _56 = (int) _55;
  _57 = .GOMP_SIMT_VOTE_ANY (_56);
  if (_57 != 0)
    goto <bb 7>; [50.00%]
  else
    goto <bb 6>; [50.00%]

  <bb 8> [local count: 117440512]:
  ...
  _80 = _71 == 63;
  _81 = (int) _80;
  _82 = .GOMP_SIMT_VOTE_ANY (_81);
  if (_82 != 0)
    goto <bb 7>; [50.00%]
  else
    goto <bb 6>; [50.00%]
...

Now control flow is no longer unified at bb5, and consequently it is not
unified in bb7 either when XCHG_IDX executes.  And that's the root cause of
the failure we're seeing.
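
The effect of the duplication can be sketched with a small lane-level model in
Python. This is a simplified illustration under stated assumptions, not the
actual hardware semantics: vote_any, the lane split and the per-lane values
are hypothetical (lanes 0..15 come straight from bb4, lanes 16..31 come from
bb8, and only lane 31 holds 63). Each divergent group now runs its own copy of
the vote, so the groups can disagree and only part of the warp enters bb7.

```python
WARP_SIZE = 32  # assumption: one warp of 32 lanes


def vote_any(predicates, active):
    """Model of an 'any' vote restricted to the lanes that execute it together."""
    result = any(predicates[lane] for lane in active)
    return {lane: result for lane in active}


# Hypothetical per-lane values; only lane 31 holds 63.
values = [lane + 32 for lane in range(WARP_SIZE)]
predicates = [v == 63 for v in values]

# After ftracer: the vote is duplicated into both predecessors,
# so each divergent group votes on its own.
group_bb5 = range(0, 16)   # lanes that skipped the loop body
group_bb8 = range(16, 32)  # lanes that ran the loop body

votes = {}
votes.update(vote_any(predicates, group_bb5))
votes.update(vote_any(predicates, group_bb8))

lanes_in_bb7 = sorted(lane for lane, taken in votes.items() if taken)
lanes_in_bb6 = sorted(lane for lane, taken in votes.items() if not taken)

print(lanes_in_bb7)  # lanes that reach the XCHG_IDX block
print(lanes_in_bb6)  # lanes that skip it
```

In this model only half of the warp reaches the block with XCHG_IDX, which is
the non-unified situation described above.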

So, one way to handle this is to consider VOTE_ANY as a "join" to the "fork" of
ENTER_ALLOC (which means: don't duplicate, unless you duplicate the pair).

But, after reading this:
...
/* Allocate per-lane storage and begin non-uniform execution region.  */

static void
expand_GOMP_SIMT_ENTER_ALLOC (internal_fn, gcall *stmt)
...
and this:
...
/* Deallocate per-lane storage and leave non-uniform execution region.  */

static void
expand_GOMP_SIMT_EXIT (internal_fn, gcall *stmt)
...
it seems that spot is already taken.

So I wonder, isn't the problem that we do the lastprivate stuff before
SIMT_EXIT? [ Of course, after fixing that we might run into SIMT_EXIT being
duplicated by ftracer. But there, at least, the description of the internal-fn
would make it clear why we don't want to duplicate it. ]
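
The "don't duplicate" idea can be sketched as a toy predicate in Python. This
is not GCC's implementation: NO_DUPLICATE_CALLS, can_duplicate_block and the
representation of a basic block as a list of call names are all hypothetical,
and which internal functions belong in the set is exactly the open question
above.

```python
# Hypothetical set of internal functions that a tracer-like pass
# should refuse to duplicate (the membership is an assumption).
NO_DUPLICATE_CALLS = {
    "GOMP_SIMT_ENTER_ALLOC",
    "GOMP_SIMT_EXIT",
    "GOMP_SIMT_VOTE_ANY",
}


def can_duplicate_block(calls):
    """Return False if the block contains a call that must stay unique."""
    return not any(call in NO_DUPLICATE_CALLS for call in calls)


# Toy stand-ins for blocks from the dump, listing only their internal calls.
bb5 = ["GOMP_SIMT_VOTE_ANY"]
bb6 = ["GOMP_SIMT_EXIT"]
plain_block = []  # hypothetical block without SIMT internal calls

print(can_duplicate_block(bb5))
print(can_duplicate_block(bb6))
print(can_duplicate_block(plain_block))
```

With such a check, a pass would leave bb5 and bb6 alone and remain free to
duplicate blocks that contain none of the listed calls.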
