From: Tobias Burnus <tobias@codesourcery.com>
To: Alexander Monakov <amonakov@ispras.ru>
Cc: Jakub Jelinek <jakub@redhat.com>, Tom de Vries <tdevries@suse.de>,
gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling
Date: Sun, 2 Oct 2022 20:13:56 +0200
Message-ID: <3ebce406-46e4-8f98-8c53-83b61423644e@codesourcery.com>
In-Reply-To: <30e3ed49-0d14-8015-57ef-3d70b1dea69a@codesourcery.com>
On 27.09.22 11:23, Tobias Burnus wrote:
We do support
#if __PTX_SM__ >= 600 (CUDA >= 8.0, ptx isa >= 5.0)
and we also can configure GCC with
--with-arch=sm_70 (or sm_80 or ...)
Thus, adding atomics with .sys scope is possible.
See attached patch. This seems to work fine and I hope I got the
assembly right in terms of atomic use. (And I do believe that the
.release/.acquire do not need an additional __sync_synchronize()/"membar.sys".)
Regarding this:
While 'atom.op' (op = and/or/xor/cas/exch/add/inc/dec/min/max)
with scope is a sm_60 feature, the used 'st/ld' with scope qualifier
and .relaxed, .release / .relaxed, .acquire require sm_70.
(This does not really matter, as only ..., sm_53 and sm_70, ... are
currently supported but not sm_60; still, the #if obviously should be fixed.)
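To make the corrected guard concrete, here is a minimal sketch of how the
two code paths could be selected on __PTX_SM__ >= 700. The helper names
rev_store_release and rev_load_acquire are hypothetical and not taken from
the patch; the asm strings are spelled as quoted in this message, and the
use of the __atomic builtins for the fallback is an assumption. On a host
compiler __PTX_SM__ is undefined, so only the fallback branch is built.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not the names used in the patch.  */

static inline void
rev_store_release (uint64_t *addr, uint64_t val)
{
#if defined(__PTX_SM__) && __PTX_SM__ >= 700
  /* sm_70 and newer: scoped release store, spelled as in this message.  */
  asm volatile ("st.global.release.sys.u64 [%0], %1;"
		: : "r" (addr), "r" (val) : "memory");
#else
  /* Older targets and host builds: let the compiler choose the sequence
     (assumption: this is what yields membar.sys + atom.exch on nvptx).  */
  __atomic_store_n (addr, val, __ATOMIC_RELEASE);
#endif
}

static inline uint64_t
rev_load_acquire (uint64_t *addr)
{
#if defined(__PTX_SM__) && __PTX_SM__ >= 700
  uint64_t val;
  /* sm_70 and newer: scoped acquire load, spelled as in this message.  */
  asm volatile ("ld.acquire.sys.global.u64 %0, [%1];"
		: "=r" (val) : "r" (addr) : "memory");
  return val;
#else
  /* Older targets and host builds (assumption: ld + membar.sys on nvptx).  */
  return __atomic_load_n (addr, __ATOMIC_ACQUIRE);
#endif
}
```

Keeping the guard in one place per operation means a later change of the
minimum architecture only touches these two helpers.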
* * *
Looking at the generated code without inline assembler, we have, instead of
st.global.release.sys.u64 [%r27],%r39;
and
ld.acquire.sys.global.u64 %r62,[%r27];
for the older systems (__PTX_SM__ < 700), the code:
@ %r69 membar.sys;
@ %r69 atom.exch.b64 _,[%r27],%r41;
and
ld.global.u64 %r64,[__gomp_rev_offload_var];
ld.u64 %r36,[%r64];
membar.sys;
In my understanding, the membar.sys ensures, similar to
st.release / ld.acquire,
that the memory operations are done in the correct order in scope .sys.
The 'fn' variable is initially 0 and is then only set by the device,
i.e. there is eventually a DMA write device->host, which is atomic
as the whole int64_t is written at once (and not, e.g., first the lower
and then the upper half). Hence, the 'st'/'atom.exch' should work fine,
despite having no .sys scope.
Likewise, the membar.sys also applies in the other direction. Or did I
miss something? If so, would an explicit __sync_synchronize() (= membar.sys)
between the 'st' and the 'ld' help?
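To show where such an explicit barrier would sit, here is a minimal sketch
of the store/load pair on a single 'fn' slot. Everything named here
(struct rev_slot, slot_publish, slot_take, slot_is_clear) is hypothetical.
The message does not say what the 'ld' waits for; the sketch assumes that
the other side clears the slot back to 0, which is an assumption and not
something stated above. Run single-threaded on a host it only demonstrates
the sequence of operations, not the cross-device ordering in question.

```c
#include <stdint.h>

/* Hypothetical stand-in for the shared 'fn' variable.  */
struct rev_slot
{
  uint64_t fn;
};

/* Publishing side: release store, then the explicit full barrier
   asked about (per the message, membar.sys on nvptx).  */
static inline void
slot_publish (struct rev_slot *s, uint64_t fn)
{
  __atomic_store_n (&s->fn, fn, __ATOMIC_RELEASE);
  __sync_synchronize ();
}

/* Consuming side (assumed behaviour): fetch the request and clear it.  */
static inline uint64_t
slot_take (struct rev_slot *s)
{
  uint64_t fn = __atomic_load_n (&s->fn, __ATOMIC_ACQUIRE);
  if (fn != 0)
    __atomic_store_n (&s->fn, 0, __ATOMIC_RELEASE);
  return fn;
}

/* Publishing side: acquire load, nonzero once the slot reads 0 again.  */
static inline int
slot_is_clear (struct rev_slot *s)
{
  return __atomic_load_n (&s->fn, __ATOMIC_ACQUIRE) == 0;
}
```

A caller would invoke slot_publish once and then poll slot_is_clear in a
loop; whether the extra barrier is actually required is exactly the open
question of this message.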
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955