From: Thomas Schwinge <tschwinge@baylibre.com>
To: Andrew Stubbs <ams@baylibre.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
Tobias Burnus <tburnus@baylibre.com>,
gcc-patches@gcc.gnu.org
Subject: Re: GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to 'init_hsa_runtime_functions' is not fatal
Date: Thu, 07 Mar 2024 14:37:02 +0100
Message-ID: <87edcm6u4h.fsf@euler.schwinge.ddns.net>
In-Reply-To: <b9e8f674-a7d2-4140-ad6e-c3c89c3d0e2e@baylibre.com>
Hi Andrew!
On 2024-03-07T11:38:27+0000, Andrew Stubbs <ams@baylibre.com> wrote:
> On 07/03/2024 11:29, Thomas Schwinge wrote:
>> On 2019-11-12T13:29:16+0000, Andrew Stubbs <ams@codesourcery.com> wrote:
>>> This patch contributes the GCN libgomp plugin, with the various
>>> configure and make bits to go with it.
>>
>> An issue with libgomp GCN plugin 'GCN_SUPPRESS_HOST_FALLBACK' (which is
>> different from the libgomp-level host-fallback execution):
>>
>>> --- /dev/null
>>> +++ b/libgomp/plugin/plugin-gcn.c
>>
>>> +/* Flag to decide if the runtime should suppress a possible fallback to host
>>> + execution. */
>>> +
>>> +static bool suppress_host_fallback;
>>
>>> +static void
>>> +init_environment_variables (void)
>>> +{
>>> + [...]
>>> + if (secure_getenv ("GCN_SUPPRESS_HOST_FALLBACK"))
>>> + suppress_host_fallback = true;
>>> + else
>>> + suppress_host_fallback = false;
>>
>>> +/* Return true if the HSA runtime can run function FN_PTR. */
>>> +
>>> +bool
>>> +GOMP_OFFLOAD_can_run (void *fn_ptr)
>>> +{
>>> + struct kernel_info *kernel = (struct kernel_info *) fn_ptr;
>>> +
>>> + init_kernel (kernel);
>>> + if (kernel->initialization_failed)
>>> + goto failure;
>>> +
>>> + return true;
>>> +
>>> +failure:
>>> + if (suppress_host_fallback)
>>> + GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
>>> + GCN_WARNING ("GCN target cannot be launched, doing a host fallback\n");
>>> + return false;
>>> +}
>>
>> This originates in the libgomp HSA plugin, where the idea was -- in my
>> understanding -- that you wouldn't have device code available for all
>> 'fn_ptr's, and in that case transparently (shared-memory system!) do
>> host-fallback execution. Or, with 'GCN_SUPPRESS_HOST_FALLBACK' set,
>> you'd get those diagnosed.
>>
>> This has then been copied into the libgomp GCN plugin (see above).
>> However, is it really still applicable there; don't we assume that we're
>> generating device code for all relevant functions? (I suppose everyone
>> really is testing with 'GCN_SUPPRESS_HOST_FALLBACK' set?) Should we thus
>> actually remove 'suppress_host_fallback' (that is, make it
>> always-'true'), including removal of the 'can_run' hook? (I suppose that
>> even in a future shared-memory "GCN" configuration, we're not expecting
>> to use this again; expecting always-'true' for 'can_run'?)
>>
>>
>> Now my actual issue: the libgomp GCN plugin then invented an additional
>> use of 'GCN_SUPPRESS_HOST_FALLBACK':
>>
>>> +/* Initialize hsa_context if it has not already been done.
>>> + Return TRUE on success. */
>>> +
>>> +static bool
>>> +init_hsa_context (void)
>>> +{
>>> + hsa_status_t status;
>>> + int agent_index = 0;
>>> +
>>> + if (hsa_context.initialized)
>>> + return true;
>>> + init_environment_variables ();
>>> + if (!init_hsa_runtime_functions ())
>>> + {
>>> + GCN_WARNING ("Run-time could not be dynamically opened\n");
>>> + if (suppress_host_fallback)
>>> + GOMP_PLUGIN_fatal ("GCN host fallback has been suppressed");
>>> + return false;
>>> + }
>>
>> That is, if 'GCN_SUPPRESS_HOST_FALLBACK' is (routinely) set (for its
>> original purpose), and you have the libgomp GCN plugin configured, but
>> don't have 'libhsa-runtime64.so.1' available, you run into a fatal error.
>>
>> The libgomp nvptx plugin in such cases silently disables the
>> plugin/device (and thus lets libgomp proper do its thing), and I propose
>> we do the same here. OK to push the attached
>> "GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to 'init_hsa_runtime_functions' is not fatal"?
>
> If you try to run the offload testsuite on a device that is not properly
> configured then we want FAIL
Exactly, and that's what I'm working towards. (Currently we're not
implementing that properly.)
But why is 'GCN_SUPPRESS_HOST_FALLBACK' controlling
'init_hsa_runtime_functions' relevant for that? As you know, that
function only deals with dynamically loading 'libhsa-runtime64.so.1', and
failure to load that one (because it doesn't exist) should have the
agreed-upon behavior of *not* raising an error. (Any other, later errors
should be fatal, I certainly agree.)
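A minimal sketch of that behavior, for illustration only. This is not the
attached patch and not the real 'plugin-gcn.c'. The names
'init_context_sketch', 'runtime_loader_fn', 'loader_missing', and
'loader_present' are hypothetical; the loader callback stands in for
'init_hsa_runtime_functions', and plain 'getenv' replaces 'secure_getenv'
to keep the example portable:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical type: a loader that returns true if the runtime library
   could be opened.  In the real plugin this role is played by
   init_hsa_runtime_functions.  */
typedef bool (*runtime_loader_fn) (void);

/* Stand-in for the plugin's flag of the same name.  */
static bool suppress_host_fallback;

/* Return true if the runtime was loaded.  A loader failure returns false
   regardless of suppress_host_fallback, so that the caller can disable
   the plugin/device instead of aborting.  */
static bool
init_context_sketch (runtime_loader_fn load_runtime)
{
  /* Assumption: the flag is only recorded here; later failures (not
     modeled in this sketch) would be the ones to consult it.  */
  suppress_host_fallback = getenv ("GCN_SUPPRESS_HOST_FALLBACK") != NULL;

  if (!load_runtime ())
    {
      /* Not fatal, even if suppress_host_fallback is set.  */
      fprintf (stderr, "Run-time could not be dynamically opened\n");
      return false;
    }

  return true;
}

/* Hypothetical loaders, used only to exercise the sketch.  */
static bool loader_missing (void) { return false; }
static bool loader_present (void) { return true; }
```

With 'GCN_SUPPRESS_HOST_FALLBACK' set, 'init_context_sketch (loader_missing)'
returns false rather than terminating the process, which is the point under
discussion.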
> not pass-via-fallback. You're breaking that.
Sorry, I don't follow, please explain?
Grüße
Thomas