From: Jakub Jelinek <jakub@redhat.com>
To: Tobias Burnus <tobias@codesourcery.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
Alexander Monakov <amonakov@ispras.ru>,
Tom de Vries <tdevries@suse.de>
Subject: Re: [Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling
Date: Tue, 11 Oct 2022 12:49:00 +0200
Message-ID: <Y0VKHHCGSI7ajDIG@tucnak>
In-Reply-To: <798d7ee1-2ffa-a591-38cb-a9ad421265d0@codesourcery.com>
On Fri, Oct 07, 2022 at 04:26:58PM +0200, Tobias Burnus wrote:
> libgomp/nvptx: Prepare for reverse-offload callback handling
>
> This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
> later handle the reverse offload.
> For nvptx, it adds support for forwarding the offload gomp_target_ext call
> to the host by setting values in a struct on the device and querying it on
> the host - invoking gomp_target_rev on the result.
>
> For host-device consistency guarantee reasons, reverse offload is currently
> limited -march=sm_70 (for libgomp).
>
> gcc/ChangeLog:
>
> * config/nvptx/mkoffload.cc (process): Warn if the linked-in libgomp.a
> has not been compiled with sm_70 or higher and disable code gen then.
>
> include/ChangeLog:
>
> * cuda/cuda.h (enum CUdevice_attribute): Add
> CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
> (CU_MEMHOSTALLOC_DEVICEMAP): Define.
> (cuMemHostAlloc): Add prototype.
>
> libgomp/ChangeLog:
>
> * config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
> 'static' for this variable.
> * config/nvptx/libgomp-nvptx.h: New file.
> * config/nvptx/target.c: Include it.
> (GOMP_ADDITIONAL_ICVS): Declare extern var.
> (GOMP_REV_OFFLOAD_VAR): Declare var.
> (GOMP_target_ext): Handle reverse offload.
> * libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
> * libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
> * target.c (gomp_target_rev): ... this new stub function.
> * libgomp.h (gomp_target_rev): Declare.
> * libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
> * plugin/cuda-lib.def (cuMemHostAlloc): Add.
> * plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
> (struct ptx_device): Add rev_data member.
> (nvptx_open_device): #if 0 unused check; add
> unified address assert check.
> (GOMP_OFFLOAD_get_num_devices): Claim unified address
> support.
> (GOMP_OFFLOAD_load_image): Free rev_fn_table if no
> offload functions exist. Make offload var available
> on host and device.
> (rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
> (GOMP_OFFLOAD_run): Handle reverse offload.
So, does this mean one has to have gcc configured --with-arch=sm_70
or later to make reverse offloading work (and then, on the other
hand, no support for older PTX arches at all)?
If yes, I was kind of hoping we could arrange for it to be more
user-friendly: build libgomp.a normally (sm_35 or whatever the default is),
build the single TU in libgomp that needs the sm_70 stuff with -march=sm_70,
and arrange for mkoffload to link in the sm_70 stuff only if the user
wants reverse offload (or has requires reverse_offload?).  In that case
ignore sm_60 and older devices; if reverse offload isn't wanted, don't link
in the part that needs sm_70 and keep things working on sm_35 and later.
Or perhaps have 2 versions of target.o, one sm_35 and one sm_70, and let
mkoffload choose between them.
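For illustration only, a minimal sketch of the selection suggested above.
The names select_target_object and device_usable and the two object file
names are hypothetical; none of them exist in the patch or in mkoffload,
and the sm_35 baseline is just the default mentioned above.

```c
#include <stdbool.h>

/* Hypothetical object names; the patch does not build two variants.  */
#define TARGET_OBJ_DEFAULT "target-sm_35.o"
#define TARGET_OBJ_REV_OFFLOAD "target-sm_70.o"

/* Pick which variant of target.o to link: the sm_70 one only when the
   program has 'omp requires reverse_offload'.  */
static const char *
select_target_object (bool requires_reverse_offload)
{
  return requires_reverse_offload ? TARGET_OBJ_REV_OFFLOAD
				  : TARGET_OBJ_DEFAULT;
}

/* Return true if a device with compute capability SM (e.g. 35, 60, 70)
   can be used: sm_70 or later with reverse offload, sm_35 or later
   otherwise.  */
static bool
device_usable (int sm, bool requires_reverse_offload)
{
  return sm >= (requires_reverse_offload ? 70 : 35);
}
```

With this shape, a program without the requires clause keeps running on
sm_35 devices, and only programs asking for reverse offload lose the
older ones.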
> + /* The code for nvptx for GOMP_target_ext in libgomp/config/nvptx/target.c
> + for < sm_70 exists but is disabled here as it is unclear whether there
> + is the required consistency between host and device.
> + See https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602715.html
> + for details. */
> + warning_at (input_location, 0,
> + "Disabling offload-code generation for this device type: "
> + "%<omp requires reverse_offload%> can only be fulfilled "
> + "for %<sm_70%> or higher");
> + inform (UNKNOWN_LOCATION,
> + "Reverse offload requires that GCC is configured with "
> + "%<--with-arch=sm_70%> or higher and not overridden by a lower "
> + "value for %<-foffload-options=nvptx-none=-march=%>");
Diagnostics shouldn't start with a capital letter (sure, the Fortran FE
is an exception).
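As a sketch of what that means for the two messages in the hunk above,
here are the same texts with only the first letter changed.  The variable
names and the diag_starts_lowercase helper are hypothetical and only serve
to make the snippet self-contained; they are not GCC functions.

```c
#include <ctype.h>
#include <stdbool.h>

/* Message texts from the hunk above, now starting in lower case.  */
static const char *const rev_offload_warning =
  "disabling offload-code generation for this device type: "
  "%<omp requires reverse_offload%> can only be fulfilled "
  "for %<sm_70%> or higher";
static const char *const rev_offload_note =
  "reverse offload requires that GCC is configured with "
  "%<--with-arch=sm_70%> or higher and not overridden by a lower "
  "value for %<-foffload-options=nvptx-none=-march=%>";

/* Hypothetical helper: true if MSG does not begin with an upper-case
   letter.  */
static bool
diag_starts_lowercase (const char *msg)
{
  return !isupper ((unsigned char) msg[0]);
}
```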
> @@ -519,10 +523,20 @@ nvptx_open_device (int n)
> CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, dev);
> ptx_dev->max_threads_per_multiprocessor = pi;
>
> +#if 0
> + int async_engines;
> r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &async_engines,
> CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT, dev);
> if (r != CUDA_SUCCESS)
> async_engines = 1;
> +#endif
Please avoid #if 0 code.
> +
> + /* Required below for reverse offload as implemented, but with compute
> + capability >= 2.0 and 64bit device processes, this should be universally be
> + the case; hence, an assert. */
> + r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi,
> + CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING, dev);
> + assert (r == CUDA_SUCCESS && pi);
>
> for (int i = 0; i != GOMP_DIM_MAX; i++)
> ptx_dev->default_dims[i] = 0;
> @@ -1179,8 +1193,10 @@ GOMP_OFFLOAD_get_num_devices (unsigned int omp_requires_mask)
> {
> int num_devices = nvptx_get_num_devices ();
> /* Return -1 if no omp_requires_mask cannot be fulfilled but
> - devices were present. */
> - if (num_devices > 0 && omp_requires_mask != 0)
> + devices were present. Unified-shared address: see comment in
Two spaces after the period rather than one.
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -2925,6 +2925,25 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
> htab_free (refcount_set);
> }
>
> +/* Handle reverse offload. This is called by the device plugins for a
> + reverse offload; it is not called if the outer target runs on the host. */
Likewise.
Jakub