From: Tom de Vries <tdevries@suse.de>
To: Thomas Schwinge <thomas@codesourcery.com>,
Tobias Burnus <tobias@codesourcery.com>,
gcc-patches@gcc.gnu.org, Jakub Jelinek <jakub@redhat.com>
Subject: Re: [Patch] nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]
Date: Tue, 4 Apr 2023 11:08:15 +0200
Message-ID: <7d39bff2-71c2-176e-dd22-ec933d766db5@suse.de>
In-Reply-To: <871ql0xboa.fsf@euler.schwinge.homeip.net>
On 4/4/23 11:02, Thomas Schwinge wrote:
> Hi!
>
> Are we going to install such a work-around?
>
Hi,
LGTM.
Thanks,
- Tom
>
> Regards
> Thomas
>
>
> On 2022-12-19T13:04:43+0100, I wrote:
>> Hi!
>>
>> On 2022-12-16T17:19:00+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
>>> Seems to be a CUDA JIT issue
>>
>> An Nvidia driver JIT issue, more precisely. ;-)
>>
>>> which is fixed by adding a dummy procedure.
>>
>> Gah... :-|
>>
>>> Lightly tested with 4 systems at hand, where 2 failed before.
>>
>> I'm happy to confirm that indeed this does resolve the issue for all
>> configurations that I reported in <https://gcc.gnu.org/PR108098>
>> "OpenMP/nvptx reverse offload execution test FAILs".
>>
>>
>> As I said on IRC, #gcc, 2022-12-16:
>>
>>> [...] we're unlikely to reverse-engineer the exact version/conditions
>>> where this got fixed, so don't have a useful means for versioning the
>>> workaround. Fortunately, it doesn't "cost" anything really. (In
>>> contrast to some other GCC/nvptx back end workarounds, as I
>>> understand.)
>>
>>
>> Regards
>> Thomas
>>
>>
>>> One had 10.2 and
>>> the other had some ancient CUDA where 'nvidia-smi' did not print a CUDA version
>>> and which requires -mptx=3.1.
>>> (I did check that offloading indeed happened and no host fallback was done.)
>>>
>>> OK for mainline?
>>>
>>> Tobias
>>
>>
>>> nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]
>>>
>>> Seemingly, the ptx JIT of CUDA <= 10.2 replaces function pointers in global
>>> variables by NULL if a translation unit does not contain any executable code. It
>>> works with CUDA 11.1. The code of this commit is about reverse offload;
>>> having NULL values disables the side of reverse offload during image load.
>>>
>>> Solution is the same as found by Thomas for a related issue: Adding a dummy
>>> procedure. Cf. the PR of this issue and Thomas' patch
>>> "nvptx: Support global constructors/destructors via 'collect2'"
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607749.html
>>>
>>> As that approach also works here:
>>>
>>> Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
>>>
>>> gcc/
>>> PR libgomp/108098
>>>
>>> * config/nvptx/mkoffload.cc (process): Emit dummy procedure
>>> alongside reverse-offload function table to prevent NULL values
>>> of the function addresses.
>>>
>>> ---
>>> gcc/config/nvptx/mkoffload.cc | 14 ++++++++++++++
>>> 1 file changed, 14 insertions(+)
>>>
>>> diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
>>> index 5d89ba8..8306aa0 100644
>>> --- a/gcc/config/nvptx/mkoffload.cc
>>> +++ b/gcc/config/nvptx/mkoffload.cc
>>> @@ -357,6 +357,20 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
>>> fputc (sm_ver2[i], out);
>>> fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");
>>>
>>> + /* WORKAROUND - see PR 108098
>>> + It seems that older CUDA JIT compilers optimize the function pointers
>>> + in offload_func_table to NULL, which can be prevented by adding a
>>> + dummy procedure. With CUDA 11.1, it seems to work fine without the
>>> + workaround, while CUDA 10.2 and some ancient version need the
>>> + workaround. Assuming CUDA 11.0 fixes it, emitting it could be
>>> + restricted to 'if (sm_ver2[0] < 8 && version2[0] < 7)' as sm_80 and
>>> + PTX ISA 7.0 are new in CUDA 11.0; for 11.1 it would be sm_86 and
>>> + PTX ISA 7.1. */
>>> + fprintf (out, "\n\t\".func __dummy$func ( );\"\n");
>>> + fprintf (out, "\t\".func __dummy$func ( )\"\n");
>>> + fprintf (out, "\t\"{\"\n");
>>> + fprintf (out, "\t\"}\"\n");
>>> +
>>> size_t fidx = 0;
>>> for (id = func_ids; id; id = id->next)
>>> {
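
For readers who want to try the idea outside of mkoffload.cc, the
following is a minimal standalone sketch, not the committed code. It
writes the same four string-literal lines as the fprintf calls in the
patch, so that an empty PTX function ends up in the generated source.
The helper names emit_dummy_func and needs_dummy_func are hypothetical.
The gate mirrors the restriction that the patch comment only floats;
the patch itself emits the procedure unconditionally, and the sketch
assumes the caller has already parsed the sm and PTX ISA major versions
into integers.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of mkoffload.cc: write the C string
   literals that embed an empty PTX function (a declaration followed by
   an empty definition), matching the output of the patch's fprintf
   calls.  Returns 0 on success, -1 on a write error.  */
static int
emit_dummy_func (FILE *out)
{
  static const char *const lines[] = {
    "\n\t\".func __dummy$func ( );\"\n",
    "\t\".func __dummy$func ( )\"\n",
    "\t\"{\"\n",
    "\t\"}\"\n",
  };

  for (size_t i = 0; i < sizeof lines / sizeof lines[0]; i++)
    if (fputs (lines[i], out) == EOF)
      return -1;
  return 0;
}

/* Hypothetical gate, based only on the assumption stated in the patch
   comment that CUDA 11.0 (sm_80, PTX ISA 7.0) fixes the JIT problem.
   Takes numeric major versions; returns nonzero if the dummy function
   would still be emitted under that assumption.  */
static int
needs_dummy_func (int sm_major, int ptx_major)
{
  return sm_major < 8 && ptx_major < 7;
}
```

A caller would invoke emit_dummy_func (out) right after the ".file"
line is written. Since the thread notes that the exact fixed version is
unlikely to be pinned down, leaving the gate out and emitting the
function unconditionally, as the patch does, is the conservative choice.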