public inbox for gcc-bugs@sourceware.org
From: "tschwinge at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/99555] [OpenMP/nvptx] Execution-time hang for simple nested OpenMP 'target'/'parallel'/'task' constructs
Date: Fri, 13 May 2022 13:16:26 +0000
Message-ID: <bug-99555-4-xsSmUZHxZH@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-99555-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99555

--- Comment #14 from Thomas Schwinge <tschwinge at gcc dot gnu.org> ---
Regarding my previous report that after
commit r12-7332-g5ed77fb3ed1ee0289a0ec9499ef52b99b39421f1
"[libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end"...

(In reply to Thomas Schwinge from comment #13)
> [...] on one system (only!), I'm [...] seeing regressions as follows:
> 
>     PASS: libgomp.c/../libgomp.c-c++-common/task-detach-10.c (test for excess errors)
>     {+WARNING: program timed out.+}
>     [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/task-detach-10.c execution test

..., and similar for all 'libgomp.c-c++-common/task-detach-10.c',
'libgomp.c-c++-common/task-detach-8.c', 'libgomp.fortran/task-detach-10.f90',
'libgomp.fortran/task-detach-8.f90' test cases:

> (Accumulated over a few runs; not always seeing all of those.)
> 
> That's with a Nvidia Tesla K20c GPU, Driver Version: 346.46.
> As that version is "a bit old", I shall first update this, before we spend
> any further time on analyzing this.

Cross-checking on another system with an Nvidia Tesla K20c GPU but a more
recent Driver Version, I'm not seeing this issue.

On the "old" system, I gradually upgraded the Driver Version from 346.46 to
352.99, 361.93.02, and 375.88 (always the latest (?) version of the respective
series); none of these resolved the problem.

Only 384.59 (that is, an early version of the 384.X series) resolved the
issue.  That's still using the GCC/nvptx '-mptx=3.1' multilib.

(We couldn't do so with the earlier series, but given this is 384.X, we can now
also cross-check with the default multilib, and that was fine, too.)
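
To summarize the driver versions tried above, here is a minimal Python sketch
that classifies a Driver Version string by its series number.  The helper names
and the threshold constant are hypothetical, and the cut-off at 384 only encodes
what was observed on this one system with this one GPU; it is not a documented
Nvidia boundary.

```python
# Hypothetical helper, not part of GCC or libgomp: encodes only the
# observations from this report (one system, one Tesla K20c GPU).

# Assumption: 384 is the first driver series where the timeouts went away.
FIRST_GOOD_SERIES = 384


def driver_series(version: str) -> int:
    """Return the series (major) number, e.g. '361.93.02' -> 361."""
    return int(version.split(".")[0])


def timeouts_expected(version: str) -> bool:
    """True if the version belongs to a series older than FIRST_GOOD_SERIES."""
    return driver_series(version) < FIRST_GOOD_SERIES


if __name__ == "__main__":
    # The Driver Versions tried on the "old" system, in order.
    for v in ["346.46", "352.99", "361.93.02", "375.88", "384.59"]:
        status = "timeouts seen" if timeouts_expected(v) else "no timeouts seen"
        print(f"{v}: {status}")
```

Comparing only the series number is deliberate: just one version per series was
tested, so nothing can be said about other versions within a series.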

Now, I don't know whether we'd like to spend any more effort on this issue at
all, given that it only appears with rather old pre-384.X versions -- but on
the other hand, the GCC/nvptx '-mptx=3.1' multilib is meant to keep these
supported?  (... which is why I'm running such testing; and the timeouts
certainly are annoying there.)

This might be another issue with pre-384.X versions of the Nvidia PTX JIT, or
is there a slight possibility that GCC is generating (or libgomp contains) some
"weird" code that 384.X and later versions happen to "fix up" -- probably the
former rather than the latter?  (Or, there's the chance of GPU
hardware/firmware or some other system weirdness -- unlikely, as the system
otherwise behaves totally fine?)

I don't know where to find complete Nvidia Driver/JIT release notes; the
375.X -> 384.X notes might give an idea of what got fixed, and we might then
add another 'WORKAROUND_PTXJIT_BUG' for that -- maybe simple, maybe not.

Any thoughts, Tom?
