Re: [Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling

From: Jakub Jelinek <jakub@redhat.com>
To: Tobias Burnus <tobias@codesourcery.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Alexander Monakov <amonakov@ispras.ru>,
	Tom de Vries <tdevries@suse.de>
Subject: Re: [Patch][v5] libgomp/nvptx: Prepare for reverse-offload callback handling
Date: Tue, 11 Oct 2022 12:49:00 +0200
Message-ID: <Y0VKHHCGSI7ajDIG@tucnak>
In-Reply-To: <798d7ee1-2ffa-a591-38cb-a9ad421265d0@codesourcery.com>

On Fri, Oct 07, 2022 at 04:26:58PM +0200, Tobias Burnus wrote:
> libgomp/nvptx: Prepare for reverse-offload callback handling
> 
> This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
> later handle the reverse offload.
> For nvptx, it adds support for forwarding the offload gomp_target_ext call
> to the host by setting values in a struct on the device and querying it on
> the host - invoking gomp_target_rev on the result.
> 
> For host-device consistency guarantee reasons, reverse offload is currently
> limited -march=sm_70 (for libgomp).
> 
> gcc/ChangeLog:
> 
> 	* config/nvptx/mkoffload.cc (process): Warn if the linked-in libgomp.a
> 	has not been compiled with sm_70 or higher and disable code gen then.
> 
> include/ChangeLog:
> 
> 	* cuda/cuda.h (enum CUdevice_attribute): Add
> 	CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
> 	(CU_MEMHOSTALLOC_DEVICEMAP): Define.
> 	(cuMemHostAlloc): Add prototype.
> 
> libgomp/ChangeLog:
> 
> 	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
> 	'static' for this variable.
> 	* config/nvptx/libgomp-nvptx.h: New file.
> 	* config/nvptx/target.c: Include it.
> 	(GOMP_ADDITIONAL_ICVS): Declare extern var.
> 	(GOMP_REV_OFFLOAD_VAR): Declare var.
> 	(GOMP_target_ext): Handle reverse offload.
> 	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
> 	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
> 	* target.c (gomp_target_rev): ... this new stub function.
> 	* libgomp.h (gomp_target_rev): Declare.
> 	* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
> 	* plugin/cuda-lib.def (cuMemHostAlloc): Add.
> 	* plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
> 	(struct ptx_device): Add rev_data member. 
> 	(nvptx_open_device): #if 0 unused check; add
> 	unified address assert check.
> 	(GOMP_OFFLOAD_get_num_devices): Claim unified address
> 	support.
> 	(GOMP_OFFLOAD_load_image): Free rev_fn_table if no
> 	offload functions exist. Make offload var available
> 	on host and device.
> 	(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
> 	(GOMP_OFFLOAD_run): Handle reverse offload.

So, does this mean one has to have GCC configured with --with-arch=sm_70
or later to make reverse offloading work (and then, on the other
side, there is no support for older PTX arches at all)?
If yes, I was kind of hoping we could arrange for it to be more
user-friendly: build libgomp.a normally (sm_35 or whatever the default is),
build the single TU in libgomp that needs the sm_70 stuff with -march=sm_70,
and arrange for mkoffload to link in the sm_70 stuff only if the user
wants reverse offload (or has requires reverse_offload?).  In that case,
ignore sm_60 and older devices; if reverse offload isn't wanted, don't link
in the part that needs sm_70 and keep things working on sm_35 and later.
Or perhaps have 2 versions of target.o, one for sm_35 and one for sm_70,
and let mkoffload choose between them.
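
A minimal sketch of the selection logic suggested above.  All names here
(target_variant, pick_target_variant, device_usable_p) are hypothetical and
only illustrate the idea; this is not actual mkoffload or libgomp code, and
how the two objects would be built and located is left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the two builds of libgomp's nvptx target.o.  */
enum target_variant { TARGET_SM35, TARGET_SM70 };

/* Pick the sm_70 build only when the program has
   'omp requires reverse_offload'; otherwise keep the default build.  */
enum target_variant
pick_target_variant (bool requires_reverse_offload)
{
  return requires_reverse_offload ? TARGET_SM70 : TARGET_SM35;
}

/* Return true if a device with compute capability SM (e.g. 35, 60, 70)
   can be used with VARIANT.  With the sm_70 build, older devices are
   simply ignored instead of disabling offload code generation.  */
bool
device_usable_p (enum target_variant variant, int sm)
{
  assert (sm > 0);
  return sm >= (variant == TARGET_SM70 ? 70 : 35);
}
```

Under this assumption, a program without the requires clause still runs on
sm_35 devices, while one with it only sees sm_70 or newer devices.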

> +      /* The code for nvptx for GOMP_target_ext in libgomp/config/nvptx/target.c
> +	 for < sm_70 exists but is disabled here as it is unclear whether there
> +	 is the required consistency between host and device.
> +	 See https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602715.html
> +	 for details.  */
> +      warning_at (input_location, 0,
> +		  "Disabling offload-code generation for this device type: "
> +		  "%<omp requires reverse_offload%> can only be fulfilled "
> +		  "for %<sm_70%> or higher");
> +      inform (UNKNOWN_LOCATION,
> +	      "Reverse offload requires that GCC is configured with "
> +	      "%<--with-arch=sm_70%> or higher and not overridden by a lower "
> +	      "value for %<-foffload-options=nvptx-none=-march=%>");

Diagnostics shouldn't start with capital letters (sure, the Fortran FE is
an exception).

> @@ -519,10 +523,20 @@ nvptx_open_device (int n)
>  		  CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, dev);
>    ptx_dev->max_threads_per_multiprocessor = pi;
>  
> +#if 0
> +  int async_engines;
>    r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &async_engines,
>  			 CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT, dev);
>    if (r != CUDA_SUCCESS)
>      async_engines = 1;
> +#endif

Please avoid #if 0 code.

> +
> +  /* Required below for reverse offload as implemented, but with compute
> +     capability >= 2.0 and 64bit device processes, this should be universally be
> +     the case; hence, an assert.  */
> +  r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi,
> +			 CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING, dev);
> +  assert (r == CUDA_SUCCESS && pi);
>  
>    for (int i = 0; i != GOMP_DIM_MAX; i++)
>      ptx_dev->default_dims[i] = 0;
> @@ -1179,8 +1193,10 @@ GOMP_OFFLOAD_get_num_devices (unsigned int omp_requires_mask)
>  {
>    int num_devices = nvptx_get_num_devices ();
>    /* Return -1 if no omp_requires_mask cannot be fulfilled but
> -     devices were present.  */
> -  if (num_devices > 0 && omp_requires_mask != 0)
> +     devices were present. Unified-shared address: see comment in

2 spaces after . rather than 1.

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -2925,6 +2925,25 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
>      htab_free (refcount_set);
>  }
>  
> +/* Handle reverse offload. This is called by the device plugins for a
> +   reverse offload; it is not called if the outer target runs on the host.  */

Likewise.

	Jakub
