public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/105001] New: If executing with non-nvptx offloading,  but nvptx offloading compilation is enabled: FAIL: libgomp.c/pr104783.c execution test
@ 2022-03-21 14:59 tschwinge at gcc dot gnu.org
  2022-03-21 16:26 ` [Bug middle-end/105001] " vries at gcc dot gnu.org
  2022-05-03  9:49 ` tschwinge at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: tschwinge at gcc dot gnu.org @ 2022-03-21 14:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105001

            Bug ID: 105001
           Summary: If executing with non-nvptx offloading, but nvptx
                    offloading compilation is enabled: FAIL:
                    libgomp.c/pr104783.c execution test
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: openmp
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: ams at gcc dot gnu.org, jakub at gcc dot gnu.org, jules at gcc dot gnu.org,
                    vries at gcc dot gnu.org
  Target Milestone: ---

If executing with non-nvptx offloading (tested: GCN, Intel MIC (emulated)), but
nvptx offloading compilation is enabled, the PR104783 test case
'libgomp.c/pr104783.c' FAILs its execution test, SIGSEGV-like.

GCN:

    Memory access fault by GPU node-1 (Agent handle: 0x1a93b40) on address
(nil). Reason: Page not present or supervisor privilege.

Intel MIC (emulated):

    offload error: process on the device 0 unexpectedly exited with code 0

The problem goes away if compiling without nvptx offloading:
'-foffload=amdgcn-amdhsa', for example.

So, I suppose a problem with the middle end SIMT transformations?


Compiling manually, also I see:

    source-gcc/libgomp/testsuite/libgomp.c/pr104783.c: In function
‘main._omp_fn.0’:
    source-gcc/libgomp/testsuite/libgomp.c/pr104783.c:10:9: warning:
‘<anonymous>’ is used uninitialized [-Wuninitialized]
       10 | #pragma omp atomic update
          |         ^~~
    source-gcc/libgomp/testsuite/libgomp.c/pr104783.c:10:9: note: ‘<anonymous>’
was declared here
       10 | #pragma omp atomic update
          |         ^~~

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug middle-end/105001] If executing with non-nvptx offloading, but nvptx offloading compilation is enabled: FAIL: libgomp.c/pr104783.c execution test
  2022-03-21 14:59 [Bug middle-end/105001] New: If executing with non-nvptx offloading, but nvptx offloading compilation is enabled: FAIL: libgomp.c/pr104783.c execution test tschwinge at gcc dot gnu.org
@ 2022-03-21 16:26 ` vries at gcc dot gnu.org
  2022-05-03  9:49 ` tschwinge at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: vries at gcc dot gnu.org @ 2022-03-21 16:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105001

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Interesting.

Can you compare dump files to see where the difference comes from?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug middle-end/105001] If executing with non-nvptx offloading, but nvptx offloading compilation is enabled: FAIL: libgomp.c/pr104783.c execution test
  2022-03-21 14:59 [Bug middle-end/105001] New: If executing with non-nvptx offloading, but nvptx offloading compilation is enabled: FAIL: libgomp.c/pr104783.c execution test tschwinge at gcc dot gnu.org
  2022-03-21 16:26 ` [Bug middle-end/105001] " vries at gcc dot gnu.org
@ 2022-05-03  9:49 ` tschwinge at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: tschwinge at gcc dot gnu.org @ 2022-05-03  9:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105001

Thomas Schwinge <tschwinge at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #2 from Thomas Schwinge <tschwinge at gcc dot gnu.org> ---
Likewise:

    +PASS: libgomp.c/pr104783-2.c (test for excess errors)
    +FAIL: libgomp.c/pr104783-2.c execution test

... that got added in commit r12-7765-ga624388b9546b066250be8baa118b7d50c403c25
"[nvptx] Add warp sync at simt exit".

---

Looking at 'libgomp.c/pr104783.c' (added in commit
r12-7586-gf07178ca3c1e5dff799fb5016bb3767571db3165 "[nvptx] Disable warp sync
in simt region").

The problem disappears with '-O0'.  This does coincide with the non-appearance
('-O0') vs. appearance ('-O1' and higher) of the '-Wuninitialized' diagnostic:

    source-gcc/libgomp/testsuite/libgomp.c/pr104783.c: In function
‘main._omp_fn.0’:
    source-gcc/libgomp/testsuite/libgomp.c/pr104783.c:10:9: warning:
‘<anonymous>’ is used uninitialized [-Wuninitialized]
       10 | #pragma omp atomic update
          |         ^~~
    source-gcc/libgomp/testsuite/libgomp.c/pr104783.c:10:9: note: ‘<anonymous>’
was declared here
       10 | #pragma omp atomic update
          |         ^~~

This one is emitted by the host compiler (!) -- and again disappears if
omitting nvptx offloading compilation, via explicit '-foffload=amdgcn-amdhsa',
for example.

Therefore, 'diff' of host-side '-O1 -fdump-tree-all' without vs. with nvptx
offloading, already in 'a-pr104783.c.009t.omplower':

    @@ -13,30 +13,57 @@
           {
             val = 0;
             {
    -          .omp_data_arr.2.val = &val;
    +          .omp_data_arr.4.val = &val;
    -          #pragma omp target num_teams(1) thread_limit(0) map(tofrom:val
[len: 4]) [child fn: main._omp_fn.0 (.omp_data_arr.2, .omp_data_sizes.3,
.omp_data_kinds.4)]
    +          #pragma omp target num_teams(1) thread_limit(0) map(tofrom:val
[len: 4]) [child fn: main._omp_fn.0 (.omp_data_arr.4, .omp_data_sizes.5,
.omp_data_kinds.6)]
                 {
    -              .omp_data_i = (const struct .omp_data_t.1 & restrict)
&.omp_data_arr.2;
    +              .omp_data_i = (const struct .omp_data_t.1 & restrict)
&.omp_data_arr.4;
                   {
                     int i;

                     {
    +                  int D.2127;
    +
    +                  D.2127 = .GOMP_USE_SIMT ();
    +                  if (D.2127 != 0) goto <D.2128>; else goto <D.2129>;
    +                  <D.2128>:
    +                  {
    +                    void * simduid.2;
    +                    void * .omp_simt.3;
    +                    int i;
    +
    +                    simduid.2 = .GOMP_SIMT_ENTER (simduid.2);
    +                    .omp_simt.3 = .GOMP_SIMT_ENTER_ALLOC (simduid.2);
    +                    #pragma omp simd _simduid_(simduid.2) _simt_
linear(i:1)
    +                    for (i = 0; i < 1; i = i + 1)
    +                    D.2135 = .omp_data_i->val;
    +                    #pragma omp atomic_load relaxed
    +                      D.2116 = *D.2135
    +                    D.2117 = D.2116 + 1;
    +                    #pragma omp atomic_store relaxed (D.2117)
    +                    #pragma omp continue (i, i)
    +                    .GOMP_SIMT_EXIT (.omp_simt.3);
    +                    #pragma omp return(nowait)
    +                  }
    +                  goto <D.2130>;
    +                  <D.2129>:
    +                  {
                       int i;

                       #pragma omp simd linear(i:1)
                       for (i = 0; i < 1; i = i + 1)
    -                  D.2128 = .omp_data_i->val;
                       #pragma omp atomic_load relaxed
    -                    D.2116 = *D.2128
    +                      D.2116 = *&*D.2135
                       D.2117 = D.2116 + 1;
                       #pragma omp atomic_store relaxed (D.2117)
                       #pragma omp continue (i, i)
                       #pragma omp return(nowait)
                     }
    +                  <D.2130>:
    +                }
                   }
                   #pragma omp return
                 }
    -          .omp_data_arr.2 = {CLOBBER};
    +          .omp_data_arr.4 = {CLOBBER};
             }
             val.0_1 = val;
             if (val.0_1 != 1) goto <D.2118>; else goto <D.2119>;

Notice code changes outside of the 'if ([.GOMP_USE_SIMT ()])':

    -                  D.2128 = .omp_data_i->val;
                       #pragma omp atomic_load relaxed
    -                    D.2116 = *D.2128
    +                      D.2116 = *&*D.2135

..., and notice that 'D.2135' however is only set in the 'if' branch, but now
also used in the 'else' branch!  That may well be the origin of the
'-Wuninitialized' and SIGSEGV observed for non-SIMT?

(I didn't think about it very much, but can't we have a GIMPLE-level
consistency check for such a thing?)

---

Then, I don't know very much about the host-side implementation of OpenMP
lowering for nvptx SIMT (which is what we're seeing here), but it seems
"non-optimal" to me that the general host-side IR changes (even if wrapped in
'if ([.GOMP_USE_SIMT ()])'), depending on which offloading targets happen to be
enabled?  Wouldn't it be better to clone the affected functions for nvptx SIMT
compilation, or -- even better? -- generally introduce appropriate
abstractions, that the host/offload targets then later lower appropriately
(whether SIMT or non-SIMT; for example, in 'pass_omp_device_lower')?  (That'd
be conceptually similar to what we're doing for OpenACC.)

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-05-03  9:49 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-21 14:59 [Bug middle-end/105001] New: If executing with non-nvptx offloading, but nvptx offloading compilation is enabled: FAIL: libgomp.c/pr104783.c execution test tschwinge at gcc dot gnu.org
2022-03-21 16:26 ` [Bug middle-end/105001] " vries at gcc dot gnu.org
2022-05-03  9:49 ` tschwinge at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).