public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Thomas Schwinge <tschwinge@baylibre.com>
To: Tobias Burnus <tburnus@baylibre.com>,
	Kwok Cheung Yeung <kcyeung@baylibre.com>
Cc: gcc-patches@gcc.gnu.org, Jakub Jelinek <jakub@redhat.com>
Subject: Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++
Date: Thu, 11 Apr 2024 12:10:09 +0200	[thread overview]
Message-ID: <87v84odxby.fsf@euler.schwinge.ddns.net> (raw)
In-Reply-To: <7ed57b08-3895-49b8-8b10-642609f0a8d2@codesourcery.com>

Hi!

I've filed <https://gcc.gnu.org/PR114690>
"OpenMP 'indirect' clause: dynamic image loading/unloading" for the
following issue:

On 2023-11-13T12:47:04+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 13.11.23 11:59, Thomas Schwinge wrote:
>>>> Also, for my understanding: why is 'build_indirect_map' done at kernel
>>>> invocation time (here) instead of at image load time?
>>> The splay_tree is generated on the device itself - and we currently do
>>> not start a kernel during GOMP_OFFLOAD_load_image. We could, the
>>> question is whether it makes sense. (Generating the splay_tree on the
>>> host for the device is a hassle and error prone as it needs to use
>>> device pointers at the end.)
>> Hmm.  It seems conceptually cleaner to me to set this up upfront, and
>> avoids potentially slowing down every device kernel invocation (at least
>> another function call, and 'gomp_mutex_lock' check).  Though, I agree
>> this may be "in the noise" with regards to all the other stuff going on
>> in 'gomp_gcn_enter_kernel' and elsewhere...
>
> I think the most common case is GOMP_INDIRECT_ADDR_MAP == NULL.
>
> The question is whether the lock should/could be moved inside  if (!indirect_array)
> or not. Probably yes:
> * doing an atomic load for the outer '!indirect array', work on a local array for
> the build up and only assign it at the end - and just after the lock check again
> whether '!indirect array'.
>
> That way, it is lock free once build but when build there is no race.
>
>> What I just realize, what's also unclear to me is how the current
>> implementation works with regards to several images getting loaded --
>> don't we then overwrite 'GOMP_INDIRECT_ADDR_MAP' instead of
>> (conceptually) appending to it?
>
> Yes, I think that will happen - but it looks as if the same issue exists
> also the other code? I think that's not the first variable that has that
> issue?
>
> I think we should try to cleanup that handling, also to support calling
> a device function in a shared library from a target region in the main
> program, which currently also fails.
>
> All device routines that are in normal static libraries and in the
> object files of the main program should simply work thanks to offload
> LTO such that there is only a single GOMP_offload_register_ver call (per
> device type) and GOMP_OFFLOAD_load_image call (per device).
>
> Likewise if the offloading is only done via a single shared library. —
> Any mixing will currently fail, unfortunately. This patch just adds
> another item which does not handle it properly.
>
> (Not good but IMHO also not a showstopper for this patch.)
>
>> In the general case, additional images may also get loaded during
>> execution.  We thus need proper locking of the shared data structure, uh?
>> Or, can we have separate on-device data structures per image?  (I've not
>> yet thought about that in detail.)
>
> I think we could - but in the main-program 'omp target' case that calls
> a shared-library 'declare target' function means that we need to handle
> multiple GOMP_offload_register_ver / load_image calls such that they can
> work together.
>
> Obviously, it gets harder if the user keeps doing dlopen() / dlclose()
> of libraries containing offload code where a target/compute region is
> run before, between, and after those calls (but hopefully not running
> when calling dlopen/dlclose).
>
>> Relatedly then, when images are unloaded, we also need to remove stale
>> items from the table, and release resources (for example, the
>> 'GOMP_OFFLOAD_alloc' for 'map_target_addr').
>
> True. I think the general assumption is that images only get unloaded at
> the very end, which matches most but not all code. Yet another work item.
>
> I think we should open a new PR about this topic and collect work items
> there.


Grüße
 Thomas

  reply	other threads:[~2024-04-11 10:10 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-08 13:13 Kwok Cheung Yeung
2023-10-17 13:12 ` Tobias Burnus
2023-10-17 13:34   ` Jakub Jelinek
2023-10-17 14:41     ` Tobias Burnus
2023-11-03 19:53   ` Kwok Cheung Yeung
2023-11-06  8:48     ` Tobias Burnus
2023-11-07 21:37       ` Joseph Myers
2023-11-07 21:51         ` Jakub Jelinek
2023-11-07 21:59           ` Kwok Cheung Yeung
2023-11-09 12:24     ` Thomas Schwinge
2023-11-09 16:00       ` Tobias Burnus
2023-11-13 10:59         ` Thomas Schwinge
2023-11-13 11:47           ` Tobias Burnus
2024-04-11 10:10             ` Thomas Schwinge [this message]
2024-01-03 14:47       ` [committed] " Kwok Cheung Yeung
2024-01-03 15:54       ` Kwok Cheung Yeung
2024-01-22 20:33     ` [PATCH] openmp: Change to using a hashtab to lookup offload target addresses for indirect function calls Kwok Cheung Yeung
2024-01-24  7:06       ` rep.dot.nop
2024-01-29 17:48         ` [PATCH v2] " Kwok Cheung Yeung
2024-03-08 13:40           ` Thomas Schwinge
2024-03-14 11:38           ` Tobias Burnus
2024-01-22 20:41     ` [PATCH] openmp, fortran: Add Fortran support for indirect clause on the declare target directive Kwok Cheung Yeung
2024-01-23 19:14       ` Tobias Burnus
2024-02-05 21:37         ` [PATCH v2] " Kwok Cheung Yeung
2024-02-06  9:03           ` Tobias Burnus
2024-02-06  9:50             ` Kwok Cheung Yeung
2024-02-12  8:51               ` Tobias Burnus
2024-02-15 21:37                 ` [COMMITTED] libgomp: Update documentation for indirect calls in target regions Kwok Cheung Yeung

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v84odxby.fsf@euler.schwinge.ddns.net \
    --to=tschwinge@baylibre.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=kcyeung@baylibre.com \
    --cc=tburnus@baylibre.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).