public inbox for gcc-patches@gcc.gnu.org
From: Thomas Schwinge <tschwinge@baylibre.com>
To: Andrew Stubbs <ams@baylibre.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
	Tobias Burnus <tburnus@baylibre.com>,
	gcc-patches@gcc.gnu.org
Subject: Re: GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to 'init_hsa_runtime_functions' is not fatal
Date: Thu, 07 Mar 2024 14:37:02 +0100
Message-ID: <87edcm6u4h.fsf@euler.schwinge.ddns.net>
In-Reply-To: <b9e8f674-a7d2-4140-ad6e-c3c89c3d0e2e@baylibre.com>

Hi Andrew!

On 2024-03-07T11:38:27+0000, Andrew Stubbs <ams@baylibre.com> wrote:
> On 07/03/2024 11:29, Thomas Schwinge wrote:
>> On 2019-11-12T13:29:16+0000, Andrew Stubbs <ams@codesourcery.com> wrote:
>>> This patch contributes the GCN libgomp plugin, with the various
>>> configure and make bits to go with it.
>> 
>> An issue with libgomp GCN plugin 'GCN_SUPPRESS_HOST_FALLBACK' (which is
>> different from the libgomp-level host-fallback execution):
>> 
>>> --- /dev/null
>>> +++ b/libgomp/plugin/plugin-gcn.c
>> 
>>> +/* Flag to decide if the runtime should suppress a possible fallback to host
>>> +   execution.  */
>>> +
>>> +static bool suppress_host_fallback;
>> 
>>> +static void
>>> +init_environment_variables (void)
>>> +{
>>> +  [...]
>>> +  if (secure_getenv ("GCN_SUPPRESS_HOST_FALLBACK"))
>>> +    suppress_host_fallback = true;
>>> +  else
>>> +    suppress_host_fallback = false;
>> 
>>> +/* Return true if the HSA runtime can run function FN_PTR.  */
>>> +
>>> +bool
>>> +GOMP_OFFLOAD_can_run (void *fn_ptr)
>>> +{
>>> +  struct kernel_info *kernel = (struct kernel_info *) fn_ptr;
>>> +
>>> +  init_kernel (kernel);
>>> +  if (kernel->initialization_failed)
>>> +    goto failure;
>>> +
>>> +  return true;
>>> +
>>> +failure:
>>> +  if (suppress_host_fallback)
>>> +    GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
>>> +  GCN_WARNING ("GCN target cannot be launched, doing a host fallback\n");
>>> +  return false;
>>> +}
>> 
>> This originates in the libgomp HSA plugin, where the idea was -- in my
>> understanding -- that you wouldn't have device code available for all
>> 'fn_ptr's, and in that case transparently (shared-memory system!) do
>> host-fallback execution.  Or, with 'GCN_SUPPRESS_HOST_FALLBACK' set,
>> you'd get those diagnosed.
>> 
>> This has then been copied into the libgomp GCN plugin (see above).
>> However, is it really still applicable there; don't we assume that we're
>> generating device code for all relevant functions?  (I suppose everyone
>> really is testing with 'GCN_SUPPRESS_HOST_FALLBACK' set?)  Should we thus
>> actually remove 'suppress_host_fallback' (that is, make it
>> always-'true'), including removal of the 'can_run' hook?  (I suppose that
>> even in a future shared-memory "GCN" configuration, we're not expecting
>> to use this again; expecting always-'true' for 'can_run'?)
>> 
>> 
>> Now my actual issue: the libgomp GCN plugin then invented an additional
>> use of 'GCN_SUPPRESS_HOST_FALLBACK':
>> 
>>> +/* Initialize hsa_context if it has not already been done.
>>> +   Return TRUE on success.  */
>>> +
>>> +static bool
>>> +init_hsa_context (void)
>>> +{
>>> +  hsa_status_t status;
>>> +  int agent_index = 0;
>>> +
>>> +  if (hsa_context.initialized)
>>> +    return true;
>>> +  init_environment_variables ();
>>> +  if (!init_hsa_runtime_functions ())
>>> +    {
>>> +      GCN_WARNING ("Run-time could not be dynamically opened\n");
>>> +      if (suppress_host_fallback)
>>> +	GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
>>> +      return false;
>>> +    }
>> 
>> That is, if 'GCN_SUPPRESS_HOST_FALLBACK' is (routinely) set (for its
>> original purpose), and you have the libgomp GCN plugin configured, but
>> don't have 'libhsa-runtime64.so.1' available, you run into a fatal error.
>> 
>> The libgomp nvptx plugin in such cases silently disables the
>> plugin/device (and thus lets libgomp proper do its thing), and I propose
>> we do the same here.  OK to push the attached
>> "GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to 'init_hsa_runtime_functions' is not fatal"?
>
> If you try to run the offload testsuite on a device that is not properly 
> configured then we want FAIL

Exactly, and that's what I'm working towards.  (Currently we're not
implementing that properly.)

But why is 'GCN_SUPPRESS_HOST_FALLBACK' controlling
'init_hsa_runtime_functions' relevant for that?  As you know, that
function only deals with dynamically loading 'libhsa-runtime64.so.1', and
failure to load that one (because it doesn't exist) should have the
agreed-upon behavior of *not* raising an error.  (Any other, later errors
should be fatal, I certainly agree.)
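
To make the distinction above concrete, here is a minimal sketch of the
proposed control flow.  It is not the actual 'plugin-gcn.c' code: the
enum, the function name 'sketch_init_hsa_context', and its boolean
parameters are hypothetical stand-ins for the real dynamic-loading and
later initialization steps.  It only assumes what is stated above: a
missing 'libhsa-runtime64.so.1' disables the plugin without an error,
whatever 'GCN_SUPPRESS_HOST_FALLBACK' says, while later failures remain
fatal.

```c
#include <stdbool.h>

/* Hypothetical outcome codes for this sketch only.  */
enum sketch_init_result
{
  SKETCH_INIT_OK,       /* Runtime loaded and initialized.  */
  SKETCH_INIT_DISABLED, /* Plugin quietly disabled; libgomp proper decides.  */
  SKETCH_INIT_FATAL     /* Where the real plugin would raise a fatal error.  */
};

/* RUNTIME_LOADED models whether dynamically loading the HSA runtime
   succeeded; LATER_STEPS_OK models every subsequent initialization step.
   SUPPRESS_HOST_FALLBACK models the environment variable setting.  */
static enum sketch_init_result
sketch_init_hsa_context (bool runtime_loaded, bool later_steps_ok,
			 bool suppress_host_fallback)
{
  /* Deliberately unused: under the proposal, the flag no longer
     influences the outcome of a load failure.  */
  (void) suppress_host_fallback;

  if (!runtime_loaded)
    /* Library not present: not an error, just no GCN device.  */
    return SKETCH_INIT_DISABLED;

  if (!later_steps_ok)
    /* The runtime exists but is unusable: this stays fatal.  */
    return SKETCH_INIT_FATAL;

  return SKETCH_INIT_OK;
}
```

With this shape, a caller that gets 'SKETCH_INIT_DISABLED' would report
zero devices, which matches what is described for the nvptx plugin above.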

> not pass-via-fallback. You're breaking that.

Sorry, I don't follow, please explain?


Grüße
 Thomas
