From: Julian Brown <julian@codesourcery.com>
To: Chung-Lin Tang <cltang@codesourcery.com>
Cc: <gcc-patches@gcc.gnu.org>, Jakub Jelinek <jakub@redhat.com>,
	"Thomas Schwinge" <thomas@codesourcery.com>
Subject: Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch
Date: Wed, 28 Oct 2020 11:32:28 +0000
Message-ID: <20201028113228.1f324665@squid.athome>
In-Reply-To: <96fedf15-f2cf-6914-0d1a-5d7eb14dfb5e@codesourcery.com>

On Wed, 28 Oct 2020 15:25:56 +0800
Chung-Lin Tang <cltang@codesourcery.com> wrote:

> On 2020/10/27 9:17 PM, Julian Brown wrote:
> >> And, in which context are cuStreamAddCallback-registered callbacks
> >> run? E.g. if it is inside an asynchronous interrupt, using locking
> >> in there might not be the best thing to do.
> > The cuStreamAddCallback API is documented here:
> > 
> > https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__STREAM.html#group__CUDA__STREAM_1g613d97a277d7640f4cb1c03bd51c2483
> > 
> > We're quite limited in what we can do in the callback function since
> > "Callbacks must not make any CUDA API calls". So what*can*  a
> > callback function do? It is mentioned that the callback function's
> > execution will "pause" the stream it is logically running on. So
> > can we get deadlock, e.g. if multiple host threads are launching
> > offload kernels simultaneously? I don't think so, but I don't know
> > how to prove it!  
> 
> I think it's not deadlock that's the problem here, but that the lock
> acquisition in nvptx_stack_acquire will effectively serialize GPU
> kernel execution to just one host thread (since you're holding the
> lock until kernel completion). Also, in that case, why do you need a
> CUDA callback at all? You could just unlock directly afterwards.

IIUC, there's a single GPU queue used for synchronous launches no
matter which host thread initiates the operation, and kernel execution
is serialised anyway, so that shouldn't be a problem. The only way to
get different kernels executing simultaneously is to use different CUDA
streams -- but I think that's still TBD for OpenMP ("TODO: Implement
GOMP_OFFLOAD_async_run").

> I think a better way is to keep a list of stack blocks in ptx_dev,
> and quickly retrieve a block and release the lock in
> nvptx_stack_acquire, like we do in GOMP_OFFLOAD_alloc for general
> device memory allocation.

If it weren't for the serialisation, we could also keep a stack cache
per-host-thread in nvptx_thread. But as it is, I don't think we need the
extra complication. When we do OpenMP async support, maybe a stack
cache can be put per-stream in goacc_asyncqueue or the OpenMP
equivalent.
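
To sketch the free-list alternative Chung-Lin describes (loosely
modelled on the GOMP_OFFLOAD_alloc approach; the struct layout and
function names here are hypothetical, not the actual plugin code):

#include <stdlib.h>
#include <pthread.h>
#include <cuda.h>

struct stack_block
{
  CUdeviceptr ptr;
  size_t size;
  struct stack_block *next;
};

struct ptx_device
{
  pthread_mutex_t stack_lock;
  struct stack_block *free_stacks;
  /* ... other per-device state ... */
};

/* Reuse a cached stacks block if one is large enough, else allocate
   afresh.  The lock is held only while manipulating the list, not for
   the duration of the kernel.  */
static CUdeviceptr
stack_acquire (struct ptx_device *dev, size_t size)
{
  struct stack_block **bp, *b;
  CUdeviceptr ptr;

  pthread_mutex_lock (&dev->stack_lock);
  for (bp = &dev->free_stacks; (b = *bp) != NULL; bp = &b->next)
    if (b->size >= size)
      {
        *bp = b->next;
        pthread_mutex_unlock (&dev->stack_lock);
        ptr = b->ptr;
        free (b);
        return ptr;
      }
  pthread_mutex_unlock (&dev->stack_lock);

  if (cuMemAlloc (&ptr, size) != CUDA_SUCCESS)
    return 0;
  return ptr;
}

/* Return a stacks block to the free list, again holding the lock only
   briefly.  */
static void
stack_release (struct ptx_device *dev, CUdeviceptr ptr, size_t size)
{
  struct stack_block *b = malloc (sizeof *b);

  b->ptr = ptr;
  b->size = size;
  pthread_mutex_lock (&dev->stack_lock);
  b->next = dev->free_stacks;
  dev->free_stacks = b;
  pthread_mutex_unlock (&dev->stack_lock);
}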

Thanks,

Julian
