[Bug fortran/102698] New: omp atomic capture Abnormal results after running multiple times

* [Bug fortran/102698] New: omp atomic capture Abnormal results after running multiple times
@ 2021-10-12  3:35 han.wu@compiler-dev.com
  2021-10-12  6:31 ` [Bug fortran/102698] " rguenth at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: han.wu@compiler-dev.com @ 2021-10-12  3:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102698

            Bug ID: 102698
           Summary: omp atomic capture Abnormal results after running
                    multiple times
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: han.wu@compiler-dev.com
  Target Milestone: ---

Test case:
program main
  interface
    subroutine aaaa_ieor(x, o, idx, idy, n)
      integer(2) :: x(8, 2), o(8, 2)
      integer :: n, idx(*), idy(*)
    end subroutine
  end interface
  integer, parameter :: n = 64, n1 = 8, n2 = 2
  integer(2) :: x(8, 2), o(8, 2), expect1(8, 2), a(64), b(64), c(64)
  integer :: idx(n), idy(n)
  logical(1) :: rst(32), res(32)
  integer :: i, j

  do i = 1, n
    idx(i) = mod(i, n1) + 1
    if (i > 32) then
      idy(i) = mod(i, n2) + 1
    else
      idy(i) = mod(i + 1, n2) + 1
    end if
  end do

  expect1 = reshape([120 ,25, 58, 27, 60, 29, 62, 31, &
    56, 57, 26, 59, 28, 61, 30, 63], (/8, 2/))
  x = 0
  call aaaa_ieor(x, o, idx, idy, n)

  res(1:16) = reshape((x .eq. expect1), (/16/))
  res(17:32) = reshape((o .eq. expect1), (/16/))
  !print *, x
  do i = 1, 32
    if (res(i) .neqv. .true.) print *,  i 
  end do

  if (any(res .neqv. .true.)) stop 1
    !print *, "PASS"

end program

function fun(i) result(c)
  integer :: i
  integer(2) :: c
  c = i
end function

subroutine aaaa_ieor(x, o, idx, idy, n)
  integer(2) :: x(8, 2), o(8, 2)
  integer :: n, idx(*), idy(*)

  interface
    function fun(i) result(c)
      integer :: i
      integer(2) :: c
    end function
  end interface
  !num_threads(4)
  !$omp parallel do shared(x, o)
  do i = 1, 64
    !$omp atomic capture
    x(idx(i), idy(i)) = ior(x(idx(i), idy(i)), fun(i))
    o(idx(i), idy(i)) = x(idx(i), idy(i))
    !$omp end atomic
  end do
end subroutine


The shell script used to build and run it:
#!/bin/bash
gfortran -fopenmp ieor1.f90
for ((i=1; i<=1000; i++))
do
  ./a.out
done

When a.out is executed 1,000 times in one shell, the result may be wrong about
once. With luck, 6,000 runs of a.out may produce only one wrong result.

How does the error happen? Looking forward to your help.

* [Bug fortran/102698] omp atomic capture Abnormal results after running multiple times
  2021-10-12  3:35 [Bug fortran/102698] New: omp atomic capture Abnormal results after running multiple times han.wu@compiler-dev.com
@ 2021-10-12  6:31 ` rguenth at gcc dot gnu.org
  2021-10-12  8:14 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-12  6:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102698

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |jakub at gcc dot gnu.org
   Last reconfirmed|                            |2021-10-12
           Keywords|                            |openmp

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  helgrind says

==1313== ----------------------------------------------------------------
==1313== 
==1313== Possible data race during write of size 4 at 0x64DEB90 by thread #1
==1313== Locks held: none
==1313==    at 0x566182B: gomp_barrier_wait_end (bar.c:40)
==1313==    by 0x566182B: gomp_barrier_wait_end (bar.c:35)
==1313==    by 0x565FA2D: gomp_simple_barrier_wait (simple-bar.h:60)
==1313==    by 0x565FA2D: gomp_team_start (team.c:853)
==1313==    by 0x56576BC: GOMP_parallel (parallel.c:169)
==1313==    by 0x400938: aaaa_ieor_ (t.f90:57)
==1313==    by 0x400A80: MAIN__ (t.f90:26)
==1313==    by 0x40102C: main (t.f90:38)
==1313== 
==1313== This conflicts with a previous read of size 4 by thread #24
==1313== Locks held: none
==1313==    at 0x566187B: gomp_barrier_wait_start (bar.h:98)
==1313==    by 0x566187B: gomp_barrier_wait (bar.c:56)
==1313==    by 0x565EFD2: gomp_simple_barrier_wait (simple-bar.h:60)
==1313==    by 0x565EFD2: gomp_thread_start (team.c:117)
==1313==    by 0x4C36016: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==1313==    by 0x5CEAA19: start_thread (in /lib64/libpthread-2.31.so)
==1313==    by 0x6002D0E: clone (in /lib64/libc-2.31.so)
==1313==  Address 0x64deb90 is 128 bytes inside a block of size 192 alloc'd
==1313==    at 0x4C328FF: malloc (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==1313==    by 0x5651308: gomp_malloc (alloc.c:38)
==1313==    by 0x565F218: gomp_get_thread_pool (pool.h:42)
==1313==    by 0x565F218: get_last_team (team.c:150)
==1313==    by 0x565F218: gomp_new_team (team.c:169)
==1313==    by 0x56576A5: GOMP_parallel (parallel.c:169)
==1313==    by 0x400938: aaaa_ieor_ (t.f90:57)
==1313==    by 0x400A80: MAIN__ (t.f90:26)
==1313==    by 0x40102C: main (t.f90:38)
==1313==  Block was alloc'd by thread #1

(and more of those)

and then

==1313== ----------------------------------------------------------------
==1313== 
==1313== Possible data race during read of size 4 at 0x64DEC94 by thread #1
==1313== Locks held: none
==1313==    at 0x5661963: do_spin (wait.h:57)
==1313==    by 0x5661963: do_wait (wait.h:66)
==1313==    by 0x5661963: gomp_team_barrier_wait_end (bar.c:112)
==1313==    by 0x56604A8: gomp_team_end (team.c:937)
==1313==    by 0x400938: aaaa_ieor_ (t.f90:57)
==1313==    by 0x400A80: MAIN__ (t.f90:26)
==1313==    by 0x40102C: main (t.f90:38)
==1313== 
==1313== This conflicts with a previous write of size 4 by thread #3
==1313== Locks held: none
==1313==    at 0x5661A25: gomp_team_barrier_wait_end (bar.c:102)
==1313==    by 0x565F011: gomp_thread_start (team.c:124)
==1313==    by 0x4C36016: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==1313==    by 0x5CEAA19: start_thread (in /lib64/libpthread-2.31.so)
==1313==    by 0x6002D0E: clone (in /lib64/libc-2.31.so)
==1313==  Address 0x64dec94 is 132 bytes inside a block of size 6,528 alloc'd
==1313==    at 0x4C328FF: malloc (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==1313==    by 0x5651308: gomp_malloc (alloc.c:38)
==1313==    by 0x565F077: gomp_new_team (team.c:174)
==1313==    by 0x56576A5: GOMP_parallel (parallel.c:169)
==1313==    by 0x400938: aaaa_ieor_ (t.f90:57)
==1313==    by 0x400A80: MAIN__ (t.f90:26)
==1313==    by 0x40102C: main (t.f90:38)
==1313==  Block was alloc'd by thread #1

and

==1313== ----------------------------------------------------------------
==1313== 
==1313== Possible data race during write of size 8 at 0x1FFEFFF778 by thread #1
==1313== Locks held: none
==1313==    at 0x4FDD826: _gfortran_reshape_4 (reshape_i4.c:47)
==1313== 
==1313== This conflicts with a previous read of size 8 by thread #3
==1313== Locks held: none
==1313==    at 0x4010D7: aaaa_ieor_._omp_fn.0 (t.f90:59)
==1313==    by 0x565F005: gomp_thread_start (team.c:123)
==1313==    by 0x4C36016: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==1313==    by 0x5CEAA19: start_thread (in /lib64/libpthread-2.31.so)
==1313==    by 0x6002D0E: clone (in /lib64/libc-2.31.so)
==1313==  Address 0x1ffefff778 is on thread #1's stack

So the program has data races, either because the GCC OpenMP runtime is
faulty or because your expectations are wrong.

* [Bug fortran/102698] omp atomic capture Abnormal results after running multiple times
  2021-10-12  3:35 [Bug fortran/102698] New: omp atomic capture Abnormal results after running multiple times han.wu@compiler-dev.com
  2021-10-12  6:31 ` [Bug fortran/102698] " rguenth at gcc dot gnu.org
@ 2021-10-12  8:14 ` jakub at gcc dot gnu.org
  2021-10-12  9:09 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-10-12  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102698

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Your testcase is racy and therefore anything can happen.
Please see
https://www.openmp.org/spec-html/5.1/openmpsu105.html#x138-1480002.19.7
"Only the read and write of the location designated by x are performed mutually
atomically. Neither the evaluation of expr or expr-list, nor the write to the
location designated by v, need be atomic with respect to the read or write of
the location designated by x."
But the testcase expects the store to o(idx(i), idy(i)) to be atomic and to
happen at the moment the atomic or succeeds.  That is not the case: the store
is done after the atomic or has succeeded, but with a normal store instruction,
and other threads can store to the same location in between (e.g. even though
they performed the atomic instruction later, they can store their result
earlier).
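
The interleaving described above can be replayed serially. The following is a
minimal sketch, not the code GCC generates: the two "threads" are just
consecutive statements, and the names temp_a and temp_b are hypothetical
stand-ins for each thread's captured value. The values 8 and 16 correspond to
i = 8 and i = 16 in the testcase, which both map to element (1, 2).

```fortran
program stale_capture_order
  implicit none
  integer(2) :: x, o, temp_a, temp_b

  x = 0
  o = 0

  ! Thread A performs its atomic "or" first and captures the new value.
  x = ior(x, 8_2)
  temp_a = x

  ! Thread B performs its atomic "or" on the same location afterwards.
  x = ior(x, 16_2)
  temp_b = x

  ! The plain (non-atomic) stores to o land in the opposite order:
  ! B stores first, then A overwrites it with its older captured value.
  o = temp_b
  o = temp_a

  print *, "x =", x, " o =", o

  ! x holds both bits (24) while o holds only A's stale capture (8).
  if (x /= 24_2 .or. o /= 8_2) stop 1
end program stale_capture_order
```

With this ordering x and o end up different, which is the mismatch the
testcase occasionally reports.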

* [Bug fortran/102698] omp atomic capture Abnormal results after running multiple times
  2021-10-12  3:35 [Bug fortran/102698] New: omp atomic capture Abnormal results after running multiple times han.wu@compiler-dev.com
  2021-10-12  6:31 ` [Bug fortran/102698] " rguenth at gcc dot gnu.org
  2021-10-12  8:14 ` jakub at gcc dot gnu.org
@ 2021-10-12  9:09 ` jakub at gcc dot gnu.org
  2021-10-12 12:03 ` han.wu@compiler-dev.com
  2021-10-12 12:20 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-10-12  9:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102698

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Just one extra note:
    !$omp atomic capture
    x(idx(i), idy(i)) = ior(x(idx(i), idy(i)), fun(i))
    o(idx(i), idy(i)) = x(idx(i), idy(i))
    !$omp end atomic
is really (under the hood):
    temp_expr = fun(i)
    !$omp atomic capture
    x(idx(i), idy(i)) = ior(x(idx(i), idy(i)), temp_expr)
    temp_v = x(idx(i), idy(i))
    !$omp end atomic
    o(idx(i), idy(i)) = temp_v
and if you write it that way, it is more obvious your program is racy.
Even the x(idx(i), idy(i)) expression gets its address evaluated once before
the construct (if it has some side-effects) and then it is just
__atomic_or_fetch on that address.

* [Bug fortran/102698] omp atomic capture Abnormal results after running multiple times
  2021-10-12  3:35 [Bug fortran/102698] New: omp atomic capture Abnormal results after running multiple times han.wu@compiler-dev.com
                   ` (2 preceding siblings ...)
  2021-10-12  9:09 ` jakub at gcc dot gnu.org
@ 2021-10-12 12:03 ` han.wu@compiler-dev.com
  2021-10-12 12:20 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: han.wu@compiler-dev.com @ 2021-10-12 12:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102698

--- Comment #4 from han.wu <han.wu@compiler-dev.com> ---
Thank you very much for your prompt reply!
@Jakub Jelinek
One thing I don't quite understand: if we modify the test case as suggested in
the extra note, the value of temp_v may vary from one execution to the next.
When, then, would one use an atomic capture construct with 2 statements?
Could you please give me an example of a practical usage of it?

* [Bug fortran/102698] omp atomic capture Abnormal results after running multiple times
  2021-10-12  3:35 [Bug fortran/102698] New: omp atomic capture Abnormal results after running multiple times han.wu@compiler-dev.com
                   ` (3 preceding siblings ...)
  2021-10-12 12:03 ` han.wu@compiler-dev.com
@ 2021-10-12 12:20 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-10-12 12:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102698

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
!$omp atomic capture
v = x
x = ior(x, expr)
!$omp end atomic
etc. is conceptually similar to
!$omp atomic load
v = x
!$omp end atomic
!$omp atomic update
x = ior(x, expr)
!$omp end atomic
except that the capture form is atomic as a whole: v is really the value of x
right before the successful atomic ior, while in the latter case some other
thread could modify x in between those 2 atomic constructs.  Similarly, the
forms that have v = x at the end are similar to two atomics in the other
order.
There are many cases where it is useful, e.g. with the fetch before the
operation you atomically or in some flag and then perform some code in the
current thread only if the flag wasn't set before.  Similarly for the fetch
after the operation.  It really depends on what the code wants to achieve.
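
A minimal sketch of that flag use case, assuming a single shared word of
flags: every iteration atomically captures the old flags and ors in a bit, and
only the iteration that saw the bit clear counts itself as the one allowed to
run the one-time code. The names flags, init_bit, old and winners are
hypothetical. Note that old is private, so the write to v cannot race.

```fortran
program claim_flag_once
  implicit none
  integer, parameter :: init_bit = 1
  integer :: flags, old, winners, i

  flags = 0
  winners = 0

  !$omp parallel do private(old) shared(flags, winners)
  do i = 1, 1000
    ! Fetch-before form: old is the value of flags right before the or.
    !$omp atomic capture
    old = flags
    flags = ior(flags, init_bit)
    !$omp end atomic

    ! Only the iteration that found the bit clear does the one-time work.
    if (iand(old, init_bit) == 0) then
      !$omp atomic update
      winners = winners + 1
      !$omp end atomic
    end if
  end do
  !$omp end parallel do

  ! Exactly one iteration can have observed the bit as unset.
  if (winners /= 1) stop 1
  print *, "winners =", winners
end program claim_flag_once
```

Splitting the capture into a separate atomic load and atomic update would let
several threads read the flag as clear before any of them sets it.
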
What your testcase does is not useful though: you basically want 2 arrays with
the same content.  You can achieve that by copying the atomically changed
array to the other one at the end of the parallel region, or by doing the
atomics on both arrays separately.
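
A minimal sketch of the first suggestion, assuming the same index pattern and
expected values as the testcase: x is updated with a plain atomic update and o
is copied from x after the parallel region. Here int(i, 2) replaces the
testcase's fun(i), and the program name is hypothetical.

```fortran
program copy_after_parallel
  implicit none
  integer, parameter :: n = 64, n1 = 8, n2 = 2
  integer(2) :: x(n1, n2), o(n1, n2), expect1(n1, n2)
  integer :: idx(n), idy(n), i

  ! Same index pattern as the testcase.
  do i = 1, n
    idx(i) = mod(i, n1) + 1
    if (i > 32) then
      idy(i) = mod(i, n2) + 1
    else
      idy(i) = mod(i + 1, n2) + 1
    end if
  end do

  expect1 = reshape([integer(2) :: 120, 25, 58, 27, 60, 29, 62, 31, &
                     56, 57, 26, 59, 28, 61, 30, 63], [n1, n2])
  x = 0

  ! Only x is touched inside the parallel loop, and only atomically.
  !$omp parallel do shared(x, idx, idy)
  do i = 1, n
    !$omp atomic update
    x(idx(i), idy(i)) = ior(x(idx(i), idy(i)), int(i, 2))
    !$omp end atomic
  end do
  !$omp end parallel do

  ! All threads have finished here, so a plain array copy is race-free.
  o = x

  if (any(x /= expect1) .or. any(o /= expect1)) stop 1
  print *, "PASS"
end program copy_after_parallel
```

The final value of each element of x is the or of a fixed set of values, so
it does not depend on thread ordering, and o matches it by construction.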
