From: Alexander Monakov <amonakov@ispras.ru>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Julian Brown <julian@codesourcery.com>,
gcc-patches@gcc.gnu.org,
Thomas Schwinge <thomas@codesourcery.com>,
Tom de Vries <tdevries@suse.de>
Subject: Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch
Date: Tue, 10 Nov 2020 00:32:36 +0300 (MSK) [thread overview]
Message-ID: <alpine.LNX.2.20.13.2011100000050.9902@monopod.intra.ispras.ru> (raw)
In-Reply-To: <20201026142634.GI7080@tucnak>
On Mon, 26 Oct 2020, Jakub Jelinek wrote:
> On Mon, Oct 26, 2020 at 07:14:48AM -0700, Julian Brown wrote:
> > This patch adds caching for the stack block allocated for offloaded
> > OpenMP kernel launches on NVPTX. This is a performance optimisation --
> > we observed an average 11% or so performance improvement with this patch
> > across a set of accelerated GPU benchmarks on one machine (results vary
> > according to individual benchmark and with hardware used).
In this patch you're folding two changes together: reuse of allocated stacks
and removing one host-device synchronization. Why is that? Can you report
performance change separately for each change (and split out the patches)?
> > A given kernel launch will reuse the stack block from the previous launch
> > if it is large enough, else it is freed and reallocated. A slight caveat
> > is that memory will not be freed until the device is closed, so e.g. if
> > code is using highly variable launch geometries and large amounts of
> > GPU RAM, you might run out of resources slightly quicker with this patch.
> >
> > Another way this patch gains performance is by omitting the
> > synchronisation at the end of an OpenMP offload kernel launch -- it's
> > safe for the GPU and CPU to continue executing in parallel at that point,
> > because e.g. copies-back from the device will be synchronised properly
> > with kernel completion anyway.
I don't think this explanation is sufficient. My understanding is that OpenMP
forbids the host to proceed asynchronously after the target construct unless
it is a 'target nowait' construct. This may be observable if there's a printf
in the target region for example (or if it accesses memory via host pointers).
So this really needs to be a separate patch with more explanation why this is
okay (if it is okay).
> > In turn, the last part necessitates a change to the way "(perhaps abort
> > was called)" errors are detected and reported.
As already mentioned using callbacks is problematic. Plus, I'm sure the way
you lock out other threads is a performance loss when multiple threads have
target regions: even though they will not run concurrently on the GPU, you
still want to allow host threads to submit GPU jobs while the GPU is occupied.
I would suggest to have a small pool (up to 3 entries perhaps) of stacks. Then
you can arrange reuse without totally serializing host threads on target
regions.
Alexander
next prev parent reply other threads:[~2020-11-09 21:32 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-26 14:14 Julian Brown
2020-10-26 14:26 ` Jakub Jelinek
2020-11-09 21:32 ` Alexander Monakov [this message]
2020-11-13 20:54 ` Julian Brown
2020-12-08 1:13 ` Julian Brown
2020-12-08 17:11 ` Alexander Monakov
2020-12-15 13:39 ` Julian Brown
2020-12-15 13:49 ` Jakub Jelinek
2020-12-15 16:49 ` Julian Brown
2020-12-15 17:00 ` Jakub Jelinek
2020-12-15 23:16 ` Julian Brown
2021-01-05 12:13 ` Julian Brown
2021-01-05 15:32 ` Jakub Jelinek
2020-10-27 13:17 ` Julian Brown
2020-10-28 7:25 ` Chung-Lin Tang
2020-10-28 11:32 ` Julian Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LNX.2.20.13.2011100000050.9902@monopod.intra.ispras.ru \
--to=amonakov@ispras.ru \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=julian@codesourcery.com \
--cc=tdevries@suse.de \
--cc=thomas@codesourcery.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).