public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
@ 2021-05-12 19:54 burnus at gcc dot gnu.org
  2021-05-12 22:08 ` [Bug libgomp/100573] " burnus at gcc dot gnu.org
                   ` (20 more replies)
  0 siblings, 21 replies; 22+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-05-12 19:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

            Bug ID: 100573
           Summary: [OpenMP] 'omp target teams' fails with nvptx and GCN
                    offloading: FAIL libgomp.c-c++-common/for-3.c +
                    for-9.c
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: openmp, wrong-code
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

The testcase
  libgomp.c-c++-common/for-9.c
is a thin wrapper around libgomp.c-c++-common/for-3.c with slightly different
settings. Hence, it is affected as well.

libgomp.c-c++-common/for-3.c has:

  int err = 0;
  #pragma omp target teams reduction(|:err)
    {
       err |= test_d_normal ();
...
    }
  if (err)
    abort ();

This will currently never fail because 'err' is mapped as firstprivate to the
target – at least until PR99928 is fixed (→ patch there).

Alternatively, manually adding a 'map(tofrom: err)' also works.

However, with that 'map added or the PR99928 patch applied:
* it still PASSES on x86-64-gnu-linux (-m64, -m32) with HOST FALLBACK
* BUT it FAILS both with AMD GCN and with NVPTX offloading.

But even without reduction and by checking each of the twenty-odd testcase
separately, it FAILS. Namely, I used the following:

  int err = 0, i = 1;

  #pragma omp target teams map(tofrom: err)
    { 
      err = test_d_normal ();
    }
  __builtin_printf("%d -> %d\n", i++, err);
  err = 0;

And the result is '-> 1' for all of them.

However, when I use 'omp target' instead of 'target teams', it passes.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
@ 2021-05-12 22:08 ` burnus at gcc dot gnu.org
  2021-05-18  8:00 ` vries at gcc dot gnu.org
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-05-12 22:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #1 from Tobias Burnus <burnus at gcc dot gnu.org> ---
Created attachment 50803
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50803&action=edit
Reduced testcase - works with hostfall back but fails with GCN and nvptx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
  2021-05-12 22:08 ` [Bug libgomp/100573] " burnus at gcc dot gnu.org
@ 2021-05-18  8:00 ` vries at gcc dot gnu.org
  2021-05-18  8:48 ` vries at gcc dot gnu.org
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: vries at gcc dot gnu.org @ 2021-05-18  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tobias Burnus from comment #1)
> Created attachment 50803 [details]
> Reduced testcase - works with hostfall back but fails with GCN and nvptx

Is this not an invalid test-case?

The semantics of omp teams is:
...
The teams construct creates a league of thread teams and the master thread of
each team executes the region.
...

So, given that there is no distribution of the work over all the teams, the
master thread of each team updates the entire array.

That is, is this not just a question of missing distribution keyword?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
  2021-05-12 22:08 ` [Bug libgomp/100573] " burnus at gcc dot gnu.org
  2021-05-18  8:00 ` vries at gcc dot gnu.org
@ 2021-05-18  8:48 ` vries at gcc dot gnu.org
  2021-05-18  9:09 ` jakub at gcc dot gnu.org
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: vries at gcc dot gnu.org @ 2021-05-18  8:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
Hmm, I reproduced the problem on the original test-case:
libgomp.c-c++-common/for-3.c, and minimized from there:
...
$ cat libgomp/testsuite/libgomp.c-c++-common/for-3.c
/* { dg-additional-options "-std=gnu99" { target c } } */

#include <stdio.h>
extern void abort ();

#define N 1500

int a[N];

int
main (void)
{
  int err = 0;

#pragma omp target map(tofrom: err) map (tofrom: a)
#pragma omp teams num_teams (15) reduction(|:err)
  {
    int i;

    for (i = 0; i < N; i++)
      a[i] = 0;

    do {} while (0);

#pragma omp distribute
    for (i = 0; i < N; i++)
      a[i] += 2;

    do {} while (0);

    for (i = 0; i < N; i++)
      if (a[i] != 2)
        err |= 1;
  }

  int i;
  for (i = 0; i < N; i++)
    if (a[i] != 2)
      printf ("a[%d]: %d\n", i, a[i]);

  if (err)
    abort ();

  return 0;
}
...

Again, I think there's a problem with the test-case.

There's no implicit barrier at the end of a distribute construct, so we can
have:
- team 0 updates part of array
- team 0 checks entire array, finds that only it's part is done, sets error
  flag
- team 1 updates part of array ...

I wonder whether there's an implicit barrier at the start of the distribute
construct.  If not, then the same problem exists for the initialization.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-05-18  8:48 ` vries at gcc dot gnu.org
@ 2021-05-18  9:09 ` jakub at gcc dot gnu.org
  2021-05-18  9:18 ` jakub at gcc dot gnu.org
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-18  9:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #3)
> Hmm, I reproduced the problem on the original test-case:
> libgomp.c-c++-common/for-3.c, and minimized from there:
> ...
> $ cat libgomp/testsuite/libgomp.c-c++-common/for-3.c
> /* { dg-additional-options "-std=gnu99" { target c } } */
> 
> #include <stdio.h>
> extern void abort ();
> 
> #define N 1500
> 
> int a[N];
> 
> int
> main (void)
> {
>   int err = 0;
> 
> #pragma omp target map(tofrom: err) map (tofrom: a)
> #pragma omp teams num_teams (15) reduction(|:err)
>   {
>     int i;
> 
>     for (i = 0; i < N; i++)
>       a[i] = 0;

Even this loop is a problem because every initial thread in its team
clears the whole array.
> 
>     do {} while (0);
> 
> #pragma omp distribute
>     for (i = 0; i < N; i++)
>       a[i] += 2;
> 
>     do {} while (0);

And this of course too.
> 
>     for (i = 0; i < N; i++)
>       if (a[i] != 2)
>         err |= 1;
>   }

I think we want to fix both for-3.c and for-9.c similarly to
r11-2571-g916c7a201a9a1dc94f2c056a773826a26d1daca9
i.e.
#define DO_PRAGMA(x) _Pragma (#x)
#define OMPTEAMS DO_PRAGMA (omp target teams)
#define OMPFROM(v) DO_PRAGMA (omp target update from(v))
#define OMPTO(v) DO_PRAGMA (omp target update to(v))
and changing main not to do #pragma omp target teams reduction(|:err)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-05-18  9:09 ` jakub at gcc dot gnu.org
@ 2021-05-18  9:18 ` jakub at gcc dot gnu.org
  2021-05-24 18:22 ` burnus at gcc dot gnu.org
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-18  9:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The intent of this particular test is to test how orphaned distribute works and
was done before the host teams construct support has been added, which means we
do not want the target teams visible in the individual functions.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-05-18  9:18 ` jakub at gcc dot gnu.org
@ 2021-05-24 18:22 ` burnus at gcc dot gnu.org
  2021-05-24 18:39 ` jakub at gcc dot gnu.org
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-05-24 18:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #6 from Tobias Burnus <burnus at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #5)
> I think we want to fix both for-3.c and for-9.c similarly to
> r11-2571-g916c7a201a9a1dc94f2c056a773826a26d1daca9 i.e.
> #define DO_PRAGMA(x) _Pragma (#x)
> #define OMPTEAMS DO_PRAGMA (omp target teams)
> #define OMPFROM(v) DO_PRAGMA (omp target update from(v))
> #define OMPTO(v) DO_PRAGMA (omp target update to(v))
> and changing main not to do #pragma omp target teams reduction(|:err)

When trying this, it (still) works as host fallback but offloading gives:

  219 |   OMPTO (a);
for-2.h:219:1: error: variable '.omp_data_kinds.116' has been referenced in
offloaded code but hasn't been marked to be included in the offloaded code

  222 |   OMPFROM (a);
for-2.h:222:1: error: variable '.omp_data_sizes.118' has been referenced in
offloaded code but hasn't been marked to be included in the offloaded code

(And likewise for additional OMPTO/OMPFROM.)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-05-24 18:22 ` burnus at gcc dot gnu.org
@ 2021-05-24 18:39 ` jakub at gcc dot gnu.org
  2021-05-25 14:39 ` jakub at gcc dot gnu.org
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-24 18:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Guess that is because the functions that have #pragma omp target teams
directive in it are marked declare target to.
So, either we'd need to play with macros etc. to make sure that those functions
aren't declare target to but the ones with #pragma omp distribute are, or
better we should fix this.  Will have a look tomorrow.

Basically this is about
#pragma omp declare target
void
foo (void)
{
  int a = 0;
  #pragma omp target map(tofrom:+a)
  a++;
}
#pragma omp end declare target

int
main ()
{
  foo ();
}

Encountering #pragma omp target (except for reverse-offload) in a target region
is undefined, but if it is never called from there and only from host routines,
we should compile/link it (even when we can just abort or whatever else if
called in the offloaded code).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-05-24 18:39 ` jakub at gcc dot gnu.org
@ 2021-05-25 14:39 ` jakub at gcc dot gnu.org
  2021-05-25 15:55 ` burnus at gcc dot gnu.org
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-25 14:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50867
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50867&action=edit
gcc12-pr100573.patch

Lightly tested patch.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-05-25 14:39 ` jakub at gcc dot gnu.org
@ 2021-05-25 15:55 ` burnus at gcc dot gnu.org
  2021-05-25 16:02 ` jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-05-25 15:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #9 from Tobias Burnus <burnus at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #8)
> Lightly tested patch.

Just quick manually testing "for-3.c" (I tried -O0 and -O3):

* With nvptx offloading, it compiles + links – but at run time, I get on two
systems:

  libgomp: cuLaunchKernel error: too many resources requested for launch

and, on the third system, a SEGFAULT – which sounds as if it could be the same
issue:

#0  memcpy () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:145
#1  0x00007ffff63b2552 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
when executing  libgomp/plugin/plugin-nvptx.c:2004
2004      r = CUDA_CALL_NOCHECK (cuLaunchKernel, function, teams, 1, 1,


* For amdgcn, I get at startup:
...
GCN debug: Released kernel dispatch: 0x7eb350
GCN debug: Copying 6000 bytes from host (0x7730c0) to device 0 (0x7ffeed8194d0)
GCN warning: Could not find symbol for kernel in the code object
Runtime message: HSA_STATUS_ERROR_INVALID_SYMBOL_NAME: There is no symbol with
the given name.
not found name: 'test_d_normal._omp_fn.0.kd'
...
not found name: 'test_d_ds128_normal._omp_fn.0.kd'
not found name: 'test_ds_normal._omp_fn.0.kd'
...

[The .kd" comes from plugin/plugin-gcn.c's:  sprintf (buf, "%s.kd",
kernel->name); ]

(I am now doing a full bootstrap now to ensure that that wasn't due to the
incremental build.)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2021-05-25 15:55 ` burnus at gcc dot gnu.org
@ 2021-05-25 16:02 ` jakub at gcc dot gnu.org
  2021-05-25 18:03 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-25 16:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I didn't have the nvidia binary module loaded and cuda installed when doing the
light testing, now I've installed that and see
FAIL: libgomp.c/../libgomp.c-c++-common/for-3.c execution test
FAIL: libgomp.c/../libgomp.c-c++-common/for-9.c execution test
XPASS: libgomp.c/../libgomp.c-c++-common/pr96390.c (test for excess errors)
FAIL: libgomp.c/../libgomp.c-c++-common/pr96390.c execution test
FAIL: libgomp.c/../libgomp.c-c++-common/target-41.c execution test
FAIL: libgomp.c/../libgomp.c-c++-common/target-42.c execution test
fail.
target-41.c and -42.c FAIL with the same error as for-3.c,
libgomp: cuLaunchKernel error: too many resources requested for launch
I'm puzzled about that message though, it really shouldn't request too many
resources, it should spawn a single thread doing a very simple kernel.
Maybe the __builtin_unreachable (); calls are the culprit?
I didn't know if I should use __builtin_trap (), __builtin_abort () and
__builtin_unreachable () is what has been used in task.c.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2021-05-25 16:02 ` jakub at gcc dot gnu.org
@ 2021-05-25 18:03 ` jakub at gcc dot gnu.org
  2021-05-25 18:28 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-25 18:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Using asm ("exit;"); instead of __builtin_unreachable (); doesn't help.
My current suspicion is that it is about taking address of the function that is
marked as PTX kernel entry point (functions with "omp target entrypoint"
attribute on the compiler side).
So maybe we need on the compiler side fold __builtin_GOMP_target_ext to
__builtin_unreachable or something similar or at least nullify the kernel
pointer argument in there.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2021-05-25 18:03 ` jakub at gcc dot gnu.org
@ 2021-05-25 18:28 ` jakub at gcc dot gnu.org
  2021-05-25 19:34 ` burnus at gcc dot gnu.org
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-25 18:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
With incremental
--- gcc/omp-offload.c.jj        2021-05-25 13:43:01.341137265 +0200
+++ gcc/omp-offload.c   2021-05-25 20:07:01.934506823 +0200
@@ -2696,8 +2696,16 @@ pass_omp_target_link::execute (function
     {
       gimple_stmt_iterator gsi;
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
-       if (walk_gimple_stmt (&gsi, NULL, find_link_var_op, NULL))
-         gimple_regimplify_operands (gsi_stmt (gsi), &gsi);
+       {
+         if (gimple_call_builtin_p (gsi_stmt (gsi), BUILT_IN_GOMP_TARGET))
+           {
+             /* Nullify the second argument of __builtin_GOMP_target_ext.  */
+             gimple_call_set_arg (gsi_stmt (gsi), 1, null_pointer_node);
+             update_stmt (gsi_stmt (gsi));
+           }
+         if (walk_gimple_stmt (&gsi, NULL, find_link_var_op, NULL))
+           gimple_regimplify_operands (gsi_stmt (gsi), &gsi);
+       }
     }

   return 0;
I see it fail with
Linking
Link complete: 0.000000ms
Link log info    : 240 bytes gmem, 1414 bytes cmem[3]

libgomp: cuModuleGetFunction error: named symbol not found

libgomp: Cannot map target functions or variables (expected 9, have 4294967295)
(target-41.c with GOMP_DEBUG=1), but it is unclear from that which named symbol
wasn't found.

Any idea how to troubleshoot what is missing?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2021-05-25 18:28 ` jakub at gcc dot gnu.org
@ 2021-05-25 19:34 ` burnus at gcc dot gnu.org
  2021-05-25 19:38 ` amonakov at gcc dot gnu.org
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-05-25 19:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #13 from Tobias Burnus <burnus at gcc dot gnu.org> ---
(In reply to Tobias Burnus from comment #9)
> not found name: 'test_d_normal._omp_fn.0.kd'

I think the problem is the following:

(a) working:
foo()
  #pragma target
    bar()

Here, 'foo._omp_fn.0' as as fndecl attribute: 'omp target entrypoint'

(b) failing:
foo()
  #pragma target
    foo()
while here 'foo._omp_fn.0' has 'omp declare target' which does not make sense.

I think we need in omp_discover_declare_target_tgt_fn_r a similar handling for
'omp declare target entrypoint' as we do for 'omp declare target host'.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2021-05-25 19:34 ` burnus at gcc dot gnu.org
@ 2021-05-25 19:38 ` amonakov at gcc dot gnu.org
  2021-05-25 19:39 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: amonakov at gcc dot gnu.org @ 2021-05-25 19:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #14 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
I would break in gdb on cuModuleGetFunction and

  x/s $rdx

to print the failing symbol (it's the third argument to the function).

It seems the "inner" entrypoint (which your patch attempted to nullify) is
still registered in offload tables, so the plugin takes its name from the
offload table and attempts to look it up in the offloaded code?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2021-05-25 19:38 ` amonakov at gcc dot gnu.org
@ 2021-05-25 19:39 ` jakub at gcc dot gnu.org
  2021-05-25 19:44 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-25 19:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Tobias Burnus from comment #13)
> (In reply to Tobias Burnus from comment #9)
> > not found name: 'test_d_normal._omp_fn.0.kd'
> 
> I think the problem is the following:
> 
> (a) working:
> foo()
>   #pragma target
>     bar()
> 
> Here, 'foo._omp_fn.0' as as fndecl attribute: 'omp target entrypoint'
> 
> (b) failing:
> foo()
>   #pragma target
>     foo()
> while here 'foo._omp_fn.0' has 'omp declare target' which does not make
> sense.
> 
> I think we need in omp_discover_declare_target_tgt_fn_r a similar handling
> for
> 'omp declare target entrypoint' as we do for 'omp declare target host'.

Sure, but I thought that would be fixed with the #c12 patch.  The only place
where those "omp target entrypoint" functions should be referenced are the
arguments of the GOMP_target_ext function (the second one), and I've swapped
that for the NULL pointer.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2021-05-25 19:39 ` jakub at gcc dot gnu.org
@ 2021-05-25 19:44 ` jakub at gcc dot gnu.org
  2021-05-25 20:06 ` amonakov at gcc dot gnu.org
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-25 19:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #14)
> I would break in gdb on cuModuleGetFunction and
> 
>   x/s $rdx
> 
> to print the failing symbol (it's the third argument to the function).
> 
> It seems the "inner" entrypoint (which your patch attempted to nullify) is
> still registered in offload tables, so the plugin takes its name from the
> offload table and attempts to look it up in the offloaded code?

Thread 1 "target-41.exe" hit Breakpoint 1, 0x00007ffff66de530 in
cuModuleGetFunction () from /lib64/libcuda.so.1
(gdb) x/1s $rdx
0x477780:       "foo$_omp_fn$0"

Isn't that symbol in the offload tables normally though or do we treat it there
differently?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2021-05-25 19:44 ` jakub at gcc dot gnu.org
@ 2021-05-25 20:06 ` amonakov at gcc dot gnu.org
  2021-05-25 20:12 ` burnus at gcc dot gnu.org
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: amonakov at gcc dot gnu.org @ 2021-05-25 20:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #17 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Yes, I'd agree normally it's present in the offload table, but ideally if
you're trying to stub out the call, it should not be present in the offload
table.

I think Tobias is saying that on GIMPLE this function does not have 'omp target
entrypoint' attribute attached to it? If so, that's causing a problem, because
the backend will not synthesize the corresponding PTX .global function.

Each function named in the offload table should be 'omp target entrypoint'.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2021-05-25 20:06 ` amonakov at gcc dot gnu.org
@ 2021-05-25 20:12 ` burnus at gcc dot gnu.org
  2021-05-25 20:20 ` amonakov at gcc dot gnu.org
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-05-25 20:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #18 from Tobias Burnus <burnus at gcc dot gnu.org> ---
I think the problem is:

create_omp_child_function(omp_context*, bool)
...
1916      DECL_ATTRIBUTES (decl) = DECL_ATTRIBUTES (current_function_decl);

The code removes then 'omp declare simd' but not 'omp declare target' - hence,
the value is kept.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2021-05-25 20:12 ` burnus at gcc dot gnu.org
@ 2021-05-25 20:20 ` amonakov at gcc dot gnu.org
  2021-05-25 21:59 ` jakub at gcc dot gnu.org
  2021-05-26  9:28 ` cvs-commit at gcc dot gnu.org
  20 siblings, 0 replies; 22+ messages in thread
From: amonakov at gcc dot gnu.org @ 2021-05-25 20:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #19 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Ah, does the issue arise because foo._omp_fn.0 is (before the patch) callable
in two contexts, in one it's called from host and should be 'omp target
entrypoint', and in the other it's called from offloaded code and bears 'omp
declare target'?

If so, I think omp-expand code should make 'omp target entrypoint' prevail over
'omp declare target'?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2021-05-25 20:20 ` amonakov at gcc dot gnu.org
@ 2021-05-25 21:59 ` jakub at gcc dot gnu.org
  2021-05-26  9:28 ` cvs-commit at gcc dot gnu.org
  20 siblings, 0 replies; 22+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-25 21:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Yeah, that is likely what happens, I'll debug tomorrow.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c
  2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
                   ` (19 preceding siblings ...)
  2021-05-25 21:59 ` jakub at gcc dot gnu.org
@ 2021-05-26  9:28 ` cvs-commit at gcc dot gnu.org
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-05-26  9:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573

--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:95d67762171f83277a5700b270c0d1e2756f83f4

commit r12-1066-g95d67762171f83277a5700b270c0d1e2756f83f4
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed May 26 11:18:07 2021 +0200

    openmp: Fix up handling of target constructs in offloaded routines
[PR100573]

    OpenMP Nesting of Regions restrictions say:
    - If a target update, target data, target enter data, or target exit data
    construct is encountered during execution of a target region, the behavior
is unspecified.
    - If a target construct is encountered during execution of a target region
and a device
    clause in which the ancestor device-modifier appears is not present on the
construct, the
    behavior is unspecified.
    That wording is about the dynamic (runtime) behavior, not about lexical
nesting,
    so while it is UB if omp target * is encountered in the target region, we
need to make
    it compile and link (for lexical nesting of target * inside of target we
actually
    emit a warning).

    To make this work, I had to do multiple changes.
    One was to mark .omp_data_{sizes,kinds}.* variables when static as "omp
declare target".
    Another one was to add stub GOMP_target* entrypoints to nvptx and gcn
libgomp.a.
    The entrypoint functions shouldn't be called or passed in the offload
regions,
    otherwise
    libgomp: cuLaunchKernel error: too many resources requested for launch
    was reported; fixed by changing those arguments of calls to GOMP_target_ext
    to NULL.
    And we didn't mark the entrypoints "omp target entrypoint" when the caller
    has been "omp declare target".

    2021-05-26  Jakub Jelinek  <jakub@redhat.com>

            PR libgomp/100573
    gcc/
            * omp-low.c: Include omp-offload.h.
            (create_omp_child_function): If current_function_decl has
            "omp declare target" attribute and is_gimple_omp_offloaded,
            remove that attribute from the copy of attribute list and
            add "omp target entrypoint" attribute instead.
            (lower_omp_target): Mark .omp_data_sizes.* and .omp_data_kinds.*
            variables for offloading if in omp_maybe_offloaded_ctx.
            * omp-offload.c (pass_omp_target_link::execute): Nullify second
            argument to GOMP_target_data_ext in offloaded code.
    libgomp/
            * config/nvptx/target.c (GOMP_target_ext, GOMP_target_data_ext,
            GOMP_target_end_data, GOMP_target_update_ext,
            GOMP_target_enter_exit_data): New dummy entrypoints.
            * config/gcn/target.c (GOMP_target_ext, GOMP_target_data_ext,
            GOMP_target_end_data, GOMP_target_update_ext,
            GOMP_target_enter_exit_data): Likewise.
            * testsuite/libgomp.c-c++-common/for-3.c (DO_PRAGMA, OMPTEAMS,
            OMPFROM, OMPTO): Define.
            (main): Remove #pragma omp target teams around all the tests.
            * testsuite/libgomp.c-c++-common/target-41.c: New test.
            * testsuite/libgomp.c-c++-common/target-42.c: New test.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-05-26  9:28 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-12 19:54 [Bug libgomp/100573] New: [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c burnus at gcc dot gnu.org
2021-05-12 22:08 ` [Bug libgomp/100573] " burnus at gcc dot gnu.org
2021-05-18  8:00 ` vries at gcc dot gnu.org
2021-05-18  8:48 ` vries at gcc dot gnu.org
2021-05-18  9:09 ` jakub at gcc dot gnu.org
2021-05-18  9:18 ` jakub at gcc dot gnu.org
2021-05-24 18:22 ` burnus at gcc dot gnu.org
2021-05-24 18:39 ` jakub at gcc dot gnu.org
2021-05-25 14:39 ` jakub at gcc dot gnu.org
2021-05-25 15:55 ` burnus at gcc dot gnu.org
2021-05-25 16:02 ` jakub at gcc dot gnu.org
2021-05-25 18:03 ` jakub at gcc dot gnu.org
2021-05-25 18:28 ` jakub at gcc dot gnu.org
2021-05-25 19:34 ` burnus at gcc dot gnu.org
2021-05-25 19:38 ` amonakov at gcc dot gnu.org
2021-05-25 19:39 ` jakub at gcc dot gnu.org
2021-05-25 19:44 ` jakub at gcc dot gnu.org
2021-05-25 20:06 ` amonakov at gcc dot gnu.org
2021-05-25 20:12 ` burnus at gcc dot gnu.org
2021-05-25 20:20 ` amonakov at gcc dot gnu.org
2021-05-25 21:59 ` jakub at gcc dot gnu.org
2021-05-26  9:28 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).