public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/96932] New: [nvptx] atomic_exchange missing barrier
@ 2020-09-04  8:29 vries at gcc dot gnu.org
  2020-09-08  8:09 ` [Bug target/96932] " vries at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2020-09-04  8:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

            Bug ID: 96932
           Summary: [nvptx] atomic_exchange missing barrier
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

After digging into GOMP_atomic_start/end I realized these also imply barrier
semantics.

And looking at the source code used for nvptx in libgomp/config/accel/mutex.h,
that should be fine:
...
static inline void
gomp_mutex_lock (gomp_mutex_t *mutex)
{
  while (__sync_lock_test_and_set (mutex, 1))
    /* spin */ ;
}

static inline void
gomp_mutex_unlock (gomp_mutex_t *mutex)
{
  __sync_lock_release (mutex);
}
...

However, when looking at the resulting code in libgomp.a we see there's no
barrier for GOMP_atomic_start:
...
.visible .func GOMP_atomic_start
{
.reg .u32 %r22;
.reg .pred %r23;
$L2:
.loc 1 51 10
atom.global.exch.b32 %r22,[atomic_lock],1;
.loc 1 51 9
setp.ne.u32 %r23,%r22,0;
@ %r23 bra $L2;
.loc 2 43 1
ret;
}
...

While there is for GOMP_atomic_end:
...
.visible .func GOMP_atomic_end
{
.reg .u32 %r22;
.loc 1 58 3
membar.sys;
mov.u32 %r22,0;
st.global.u32 [atomic_lock],%r22;
.loc 2 49 1
ret;
}
...

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/96932] [nvptx] atomic_exchange missing barrier
  2020-09-04  8:29 [Bug target/96932] New: [nvptx] atomic_exchange missing barrier vries at gcc dot gnu.org
@ 2020-09-08  8:09 ` vries at gcc dot gnu.org
  2020-09-12  6:13 ` vries at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2020-09-08  8:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
FWIW, I've tried this test-case to trigger the problem, but it runs fine:
...
$ cat libgomp/testsuite/libgomp.oacc-c-c++-common/test.c  
/* { dg-do run } */

#include <stdlib.h>
#include <openacc.h>

#define assert(COND) \
  do { \
    if (!(COND)) \
      abort ();  \
  } while (0) 

int c;
#pragma acc declare copyin (c)

int
main (void)
{
  int *p = acc_malloc (4);

#pragma acc parallel num_gangs (1) deviceptr (p)
  {
    *p = 0;
  }

#pragma acc parallel num_gangs (100) deviceptr (p)
  {
    int i;
    for (i = 0; i < 10000; ++i)
      {
        int res;
        while (1)
          {
            res = __sync_lock_test_and_set (p, 1);
            if (res == 1)
              continue;
            break;
          }
        assert (res == 0);
        assert (*p == 1);

        assert (c == 0);
        c += 1;
        assert (c == 1);
        c -= 1;
        assert (c == 0);

        __sync_lock_release (p);
      }
  }
  return 0;
}
...

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/96932] [nvptx] atomic_exchange missing barrier
  2020-09-04  8:29 [Bug target/96932] New: [nvptx] atomic_exchange missing barrier vries at gcc dot gnu.org
  2020-09-08  8:09 ` [Bug target/96932] " vries at gcc dot gnu.org
@ 2020-09-12  6:13 ` vries at gcc dot gnu.org
  2021-05-12 11:30 ` burnus at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2020-09-12  6:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
It could be interesting to use a ptx simulator, perhaps that one could trigger
this.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/96932] [nvptx] atomic_exchange missing barrier
  2020-09-04  8:29 [Bug target/96932] New: [nvptx] atomic_exchange missing barrier vries at gcc dot gnu.org
  2020-09-08  8:09 ` [Bug target/96932] " vries at gcc dot gnu.org
  2020-09-12  6:13 ` vries at gcc dot gnu.org
@ 2021-05-12 11:30 ` burnus at gcc dot gnu.org
  2021-05-12 12:07 ` vries at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-05-12 11:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #3 from Tobias Burnus <burnus at gcc dot gnu.org> ---
Crossref: PR100497 - fails on Volta without
  membar.sys;
before
  atom.global.exch.b32

Unfortunately, compared to pre-Volta, it is very slow - membar.gl is still slow
but a bit less.  Using (→ sm_70) fence.sys / fence.gnu instead of
fence.sc.{sys,gnu} (= membar.{sys,gl} on >= sm_70) does not seem to make a
performance difference for PR100497.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/96932] [nvptx] atomic_exchange missing barrier
  2020-09-04  8:29 [Bug target/96932] New: [nvptx] atomic_exchange missing barrier vries at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-05-12 11:30 ` burnus at gcc dot gnu.org
@ 2021-05-12 12:07 ` vries at gcc dot gnu.org
  2021-05-14  7:12 ` vries at gcc dot gnu.org
  2021-05-17 18:24 ` vries at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2021-05-12 12:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tobias Burnus from comment #3)
> Crossref: PR100497 - fails on Volta without
>   membar.sys;
> before
>   atom.global.exch.b32
> 
> Unfortunately, compared to pre-Volta, it is very slow - membar.gl is still
> slow but a bit less.  Using (→ sm_70) fence.sys / fence.gnu instead of
> fence.sc.{sys,gnu} (= membar.{sys,gl} on >= sm_70) does not seem to make a

fence.sc.gpu, funny typo :)

> performance difference for PR100497.

The GOMP_atomic_start/GOMP_atomic_end are fallbacks, and unfortunately cannot
be expected to be too optimal.

Following the introduction of -mptx=6.3 we can add support for atom.cas.b16
(well, once we also introduce misa=sm_70), and that should be the optimal
solution.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/96932] [nvptx] atomic_exchange missing barrier
  2020-09-04  8:29 [Bug target/96932] New: [nvptx] atomic_exchange missing barrier vries at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-05-12 12:07 ` vries at gcc dot gnu.org
@ 2021-05-14  7:12 ` vries at gcc dot gnu.org
  2021-05-17 18:24 ` vries at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2021-05-14  7:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

--- Comment #5 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 50811
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50811&action=edit
Tentative patch

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/96932] [nvptx] atomic_exchange missing barrier
  2020-09-04  8:29 [Bug target/96932] New: [nvptx] atomic_exchange missing barrier vries at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-05-14  7:12 ` vries at gcc dot gnu.org
@ 2021-05-17 18:24 ` vries at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2021-05-17 18:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96932

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #6 from Tom de Vries <vries at gcc dot gnu.org> ---
Duplicate.

*** This bug has been marked as a duplicate of bug 100497 ***

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-05-17 18:24 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-04  8:29 [Bug target/96932] New: [nvptx] atomic_exchange missing barrier vries at gcc dot gnu.org
2020-09-08  8:09 ` [Bug target/96932] " vries at gcc dot gnu.org
2020-09-12  6:13 ` vries at gcc dot gnu.org
2021-05-12 11:30 ` burnus at gcc dot gnu.org
2021-05-12 12:07 ` vries at gcc dot gnu.org
2021-05-14  7:12 ` vries at gcc dot gnu.org
2021-05-17 18:24 ` vries at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).