public inbox for gcc-patches@gcc.gnu.org
From: Thomas Schwinge <thomas@codesourcery.com>
To: Tobias Burnus <tobias@codesourcery.com>
Cc: <gcc-patches@gcc.gnu.org>, Tom de Vries <tdevries@suse.de>,
	Jakub Jelinek <jakub@redhat.com>
Subject: Re: [Patch] nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]
Date: Mon, 19 Dec 2022 13:04:43 +0100
Message-ID: <875ye78uqs.fsf@euler.schwinge.homeip.net>
In-Reply-To: <fd877978-48c4-4a9b-66f9-a105d9901ec1@codesourcery.com>

Hi!

On 2022-12-16T17:19:00+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> Seems to be a CUDA JIT issue

An Nvidia Driver JIT issue, more precisely.  ;-)

> which is fixed by adding a dummy procedure.

Gah...  :-|

> Lightly tested with 4 systems at hand, where 2 failed before.

I'm happy to confirm that indeed this does resolve the issue for all
configurations that I reported in <https://gcc.gnu.org/PR108098>
"OpenMP/nvptx reverse offload execution test FAILs".


As I said on IRC, #gcc, 2022-12-16:

> [...] we're unlikely to reverse-engineer the exact version/conditions
> where this got fixed, so don't have a useful means for versioning the
> workaround.  Fortunately, it doesn't "cost" anything really.  (In
> contrast to some other GCC/nvptx back end workarounds, as I
> understand.)
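
For illustration only, the kind of guard hinted at in the patch's comment
could be sketched roughly as below.  This is not part of the patch:
'needs_dummy_func' and its string arguments are made-up names, the input
format (the text following '.target sm_' and '.version') is an assumption,
and the CUDA 11.0 cut-off is the comment's own unverified assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of mkoffload.cc.
// SM is assumed to be the text after ".target sm_" (e.g. "35", "80");
// PTX_VERSION is assumed to be the text after ".version" (e.g. "3.1", "7.0").
bool
needs_dummy_func (const std::string &sm, const std::string &ptx_version)
{
  assert (!sm.empty () && !ptx_version.empty ());

  // strtol stops at the first non-digit, so "7.0" yields the major version 7.
  long sm_num = std::strtol (sm.c_str (), nullptr, 10);
  long ptx_major = std::strtol (ptx_version.c_str (), nullptr, 10);

  // Per the patch comment, sm_80 and PTX ISA 7.0 are new in CUDA 11.0.
  // ASSUMPTION (unverified): CUDA 11.0 and later do not need the workaround.
  return sm_num < 80 && ptx_major < 7;
}
```

Comparing parsed numbers rather than a single leading character avoids
misclassifying multi-digit versions.  Even so, such a guard only looks at
what the PTX requests, not at which driver ends up JIT-compiling it, which
is one more reason to emit the dummy procedure unconditionally.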


Regards
 Thomas


> One had 10.2 and
> the other had some ancient CUDA where 'nvidia-smi' did not print a CUDA
> version and which requires -mptx=3.1.
> (I did check that offloading indeed happened and no host fallback was done.)
>
> OK for mainline?
>
> Tobias


> nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]
>
> Seemingly, the PTX JIT of CUDA <= 10.2 replaces function pointers in global
> variables by NULL if a translation unit does not contain any executable code.
> It works with CUDA 11.1.  The code of this commit is about reverse offload;
> having NULL values disables the side of reverse offload during image load.
>
> The solution is the same as the one Thomas found for a related issue: adding
> a dummy procedure. Cf. the PR of this issue and Thomas' patch
> "nvptx: Support global constructors/destructors via 'collect2'"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607749.html
>
> As that approach also works here:
>
> Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
>
> gcc/
>       PR libgomp/108098
>
>       * config/nvptx/mkoffload.cc (process): Emit dummy procedure
>       alongside reverse-offload function table to prevent NULL values
>       of the function addresses.
>
> ---
>  gcc/config/nvptx/mkoffload.cc | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
> index 5d89ba8..8306aa0 100644
> --- a/gcc/config/nvptx/mkoffload.cc
> +++ b/gcc/config/nvptx/mkoffload.cc
> @@ -357,6 +357,20 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
>       fputc (sm_ver2[i], out);
>        fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");
>
> +      /* WORKAROUND - see PR 108098
> +      It seems as if older CUDA JIT compilers optimize the function pointers
> +      in offload_func_table to NULL, which can be prevented by adding a
> +      dummy procedure. With CUDA 11.1, it seems to work fine without the
> +      workaround, while CUDA 10.2 and some ancient version need the
> +      workaround. Assuming CUDA 11.0 fixes it, emitting it could be
> +      restricted to 'if (sm_ver2[0] < 8 && version2[0] < 7)' as sm_80 and
> +      PTX ISA 7.0 are new in CUDA 11.0; for 11.1 it would be sm_86 and
> +      PTX ISA 7.1.  */
> +      fprintf (out, "\n\t\".func __dummy$func ( );\"\n");
> +      fprintf (out, "\t\".func __dummy$func ( )\"\n");
> +      fprintf (out, "\t\"{\"\n");
> +      fprintf (out, "\t\"}\"\n");
> +
>        size_t fidx = 0;
>        for (id = func_ids; id; id = id->next)
>       {
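
To make the effect of the four added 'fprintf' calls concrete, here is a
minimal standalone sketch.  The helper names 'dummy_func_host_text' and
'concat_literals' are hypothetical and do not exist in mkoffload.cc; the
string contents are copied from the hunk above.  The first helper returns
the host-side source text that gets written, the second mimics how adjacent
string literals are concatenated (it ignores escape sequences, which do not
occur in this text).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the text the four added fprintf calls write into the
// generated host source, i.e. four adjacent C string literals.
std::string
dummy_func_host_text ()
{
  std::string s;
  s += "\n\t\".func __dummy$func ( );\"\n";  // forward declaration
  s += "\t\".func __dummy$func ( )\"\n";     // start of the definition
  s += "\t\"{\"\n";                           // empty body: open
  s += "\t\"}\"\n";                           // empty body: close
  return s;
}

// Hypothetical helper: keep only what is inside double quotes, roughly what
// string-literal concatenation does.  Escape sequences are NOT handled.
std::string
concat_literals (const std::string &host_text)
{
  std::string out;
  bool in_quote = false;
  for (char c : host_text)
    {
      if (c == '"')
        {
          in_quote = !in_quote;
          continue;
        }
      if (in_quote)
        out += c;
    }
  assert (!in_quote);  // quotes must be balanced
  return out;
}
```

Calling 'concat_literals (dummy_func_host_text ())' shows the PTX fragment
that ends up in the embedded string: a declaration followed by an empty
definition of '__dummy$func', so that the PTX contains at least one function
body alongside the function table.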
