From: Alexander Monakov <amonakov@ispras.ru>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Tom de Vries <tdevries@suse.de>,
	Tobias Burnus <tobias@codesourcery.com>,
	 gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libgomp, nvptx, v3: Honor OpenMP 5.1 num_teams lower bound
Date: Fri, 12 Nov 2021 22:16:11 +0300 (MSK)
Message-ID: <cde518e-b369-c597-5ddd-1086bd7129c5@ispras.ru>
In-Reply-To: <20211112175804.GJ2710@tucnak>

Hello Jakub,

On Fri, 12 Nov 2021, Jakub Jelinek via Gcc-patches wrote:

> On Fri, Nov 12, 2021 at 02:27:16PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > On Fri, Nov 12, 2021 at 02:20:23PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > > This patch assumes that .shared variables are initialized to 0,
> > > https://docs.nvidia.com/cuda/parallel-thread-execution/index.html lists
> > > in Table 7. .shared as non-initializable.  If that isn't the case,
> > > we need to initialize it somewhere for the case of #pragma omp target
> > > without #pragma omp teams in it, maybe in libgcc/config/nvptx/crt0.c ?
> > 
> > A quick look at libgcc/config/nvptx/crt0.c shows the target supports
> > __attribute__((shared)), so perhaps either following instead, or, if
> > .shared isn't preinitialized to zero, defining the variable in
> > libgcc/config/nvptx/crt0.c , adding there __gomp_team_num = 0;
> > and adding extern keyword before int __gomp_team_num __attribute__((shared));
> > in libgomp/config/nvptx/target.c.
> 
> And finally here is a third version, which fixes a typo in the previous
> patch (in instead of int) and actually initializes the shared var because
> PTX documentation doesn't say anything about how the shared vars are
> initialized.
> 
> Tested on x86_64-linux with nvptx-none offloading, ok for trunk?
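
(For illustration, a minimal sketch of the arrangement described in the
quoted messages could look like the following; the variable name follows
the quote, but the startup helper and its placement are assumptions, not
the committed code:

  /* libgcc/config/nvptx/crt0.c: put the variable in .shared memory and
     zero it during startup, since the PTX documentation does not say
     that .shared variables are zero-initialized.  */
  int __gomp_team_num __attribute__((shared));

  void
  gomp_team_num_init (void)  /* hypothetical helper called from startup */
  {
    __gomp_team_num = 0;
  }

  /* libgomp/config/nvptx/target.c would then only declare it:  */
  extern int __gomp_team_num __attribute__((shared));
)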

I suspect there may be a misunderstanding here, or maybe your explanation is
incomplete. I don't think the intention of the standard was to force such
complexity. You can launch as many blocks on the GPU as you like, limited only
by the bitwidth of the indexing register used in hardware; NVIDIA guarantees
at least INT_MAX blocks (in fact, almost 1<<63 blocks if you launch a
three-dimensional grid of INT_MAX x 65535 x 65535 blocks).
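
(If you want to double-check those limits on a given device, a small
standalone program against the CUDA driver API will report them; this is
just a demo, unrelated to the libgomp plugin code:

  /* Query the per-dimension grid-size limits; build with -lcuda.
     Typical output: max grid: 2147483647 x 65535 x 65535.  */
  #include <cuda.h>
  #include <stdio.h>

  int
  main (void)
  {
    CUdevice dev;
    int x, y, z;
    cuInit (0);
    cuDeviceGet (&dev, 0);
    cuDeviceGetAttribute (&x, CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X, dev);
    cuDeviceGetAttribute (&y, CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_Y, dev);
    cuDeviceGetAttribute (&z, CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_Z, dev);
    printf ("max grid: %d x %d x %d\n", x, y, z);
    return 0;
  }
)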

The hardware schedules blocks automatically (so, for example, if the hardware
can run 40 blocks simultaneously and you launch 100, it may start blocks 0 to
39, and then, as each of those finishes, launch the 40th block, and so on).

So isn't the solution simply to adjust the logic around
nvptx_adjust_launch_bounds in GOMP_OFFLOAD_run, that is, if a lower bound is
specified, use it instead of the max_blocks value that
nvptx_adjust_launch_bounds computes?
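
Something along these lines (just a sketch of the idea, not the actual
plugin code; the helper and its parameter names are made up):

  /* If the target region carries a num_teams lower bound, honor it by
     launching that many blocks; otherwise keep the occupancy-based value
     that nvptx_adjust_launch_bounds computes as max_blocks.  */
  static int
  choose_num_blocks (int num_teams_lower, int max_blocks)
  {
    if (num_teams_lower > 0)
      return num_teams_lower;
    return max_blocks;
  }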

Yours,
Alexander
