public inbox for gcc-bugs@sourceware.org
* [Bug fortran/109701] New: I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
@ 2023-05-02 17:07 thomas.meltzer1 at gmail dot com
  2023-05-02 17:10 ` [Bug fortran/109701] " thomas.meltzer1 at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: thomas.meltzer1 at gmail dot com @ 2023-05-02 17:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

            Bug ID: 109701
           Summary: I have a MWE where an omp reduction breaks if I add
                    the option for GPU offloading (even if it isn't used).
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thomas.meltzer1 at gmail dot com
  Target Milestone: ---

Created attachment 54972
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54972&action=edit
source code to reproduce bug

I think I have identified a bug when using gfortran with OpenMP. I have tested
the following gfortran versions:

* gfortran 10.3.0
* gfortran 11.3.0
* gfortran 12.2.0

I have posted a question on Stack Overflow:
https://stackoverflow.com/questions/76119137/potential-gfortran-or-openmp-bug-when-using-omp-if-and-reduction

Here is the MWE:
-----------------------------------
program test

  use omp_lib

  implicit none

  integer, parameter :: N=3 
  integer            :: i, j
  real               :: a(N,N), b(N,N), max_diff
  logical            :: is_GPU
  is_GPU = .false.
#ifdef USEGPU
  is_GPU = .true.
#endif

  !$omp target data if(is_GPU) map(to:a, b)
  !$omp target teams if(is_GPU)
  !$omp distribute parallel do simd collapse(2)
  do j = 1, N
    do i = 1, N
      a(i, j) = i*j 
      b(i, j) = i*j*0.9
    end do
  end do
  !$omp end target teams

  max_diff = 0.0 
  !$omp target teams if(is_GPU)     !<---- comment this
  !$omp distribute parallel do simd reduction(max:max_diff) collapse(2)
  do j = 1, N
    do i = 1, N
      max_diff = max(max_diff, abs(b(i, j) - a(i, j)))
    end do
  end do
  !$omp end target teams     !<---- comment this

  write (*,'("max_diff = ", F6.3)') max_diff
  !$omp end target data

end program
-----------------------------------

Here is the command to compile and run:
gfortran -cpp -fopenmp mwe.f90 && OMP_NUM_THREADS=2 ./a.out

I have also tried with extra flags (-Wall -Wextra) and there are no reported
warnings.

Expected output is:
max_diff =  0.900

but with gfortran I get:
max_diff =  0.000

It works with nvfortran 22.5-0 (from the nvhpc toolkit) but not with gfortran.

Command for nvfortran is:
nvfortran -cpp -mp=multicore mwe.f90 && OMP_NUM_THREADS=2 ./a.out

I want to keep portability so that OpenMP handles whether I build with GPU
support or not. I am aware that I can work around this by using pre-processor
directives instead of OpenMP "if" clauses.

If I comment out the lines marked with (!<---- comment this) and remove
"distribute" from the line "!$omp distribute parallel do simd
reduction(max:max_diff) collapse(2)" then the code runs as expected.

Am I misusing the OpenMP "if" clauses, or doing something else that is not
portable, or is this a bug?

Please let me know if you need any further information.

gfortran -v 11.3.0 output:
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
11.3.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04) 


gfortran -v 12.2.0 output:
Reading specs from
/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx/lib/gcc/x86_64-pc-linux-gnu/12.2.0/specs
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx/libexec/gcc/x86_64-pc-linux-gnu/12.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/tmp/melt/spack-stage/spack-stage-gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx/spack-src/configure
--prefix=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx
--with-pkgversion='Spack GCC'
--with-bugurl=https://github.com/spack/spack/issues --disable-multilib
--enable-languages=c,c++,fortran --disable-nls
--disable-canonical-system-headers --with-system-zlib
--with-zstd-include=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/zstd-1.5.2-4lqnadoditk6uhithspv7gaaleqkkzxs/include
--with-zstd-lib=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/zstd-1.5.2-4lqnadoditk6uhithspv7gaaleqkkzxs/lib
--enable-bootstrap
--with-mpfr-include=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpfr-4.1.0-3htwy6gdcb5iwcr6jpbev5yiltdjejfy/include
--with-mpfr-lib=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpfr-4.1.0-3htwy6gdcb5iwcr6jpbev5yiltdjejfy/lib
--with-gmp-include=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gmp-6.2.1-oc47phqrmnbll7y5xd5mgcffuy4uwewd/include
--with-gmp-lib=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gmp-6.2.1-oc47phqrmnbll7y5xd5mgcffuy4uwewd/lib
--with-mpc-include=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpc-1.2.1-7bswqqwsnkfwa6ojrkdhxveumijpchhz/include
--with-mpc-lib=/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpc-1.2.1-7bswqqwsnkfwa6ojrkdhxveumijpchhz/lib
--without-isl
--with-stage1-ldflags='-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx/lib64
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gmp-6.2.1-oc47phqrmnbll7y5xd5mgcffuy4uwewd/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpc-1.2.1-7bswqqwsnkfwa6ojrkdhxveumijpchhz/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpfr-4.1.0-3htwy6gdcb5iwcr6jpbev5yiltdjejfy/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/zlib-1.2.13-kxewaohczdviv3z3yz2a45g3kwpd45yh/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/zstd-1.5.2-4lqnadoditk6uhithspv7gaaleqkkzxs/lib'
--with-boot-ldflags='-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gcc-12.2.0-7szeaw2tk7ndv3brjeitsqmi3r6cz2sx/lib64
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/gmp-6.2.1-oc47phqrmnbll7y5xd5mgcffuy4uwewd/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpc-1.2.1-7bswqqwsnkfwa6ojrkdhxveumijpchhz/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/mpfr-4.1.0-3htwy6gdcb5iwcr6jpbev5yiltdjejfy/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/zlib-1.2.13-kxewaohczdviv3z3yz2a45g3kwpd45yh/lib
-Wl,-rpath,/software/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/zstd-1.5.2-4lqnadoditk6uhithspv7gaaleqkkzxs/lib
-static-libstdc++ -static-libgcc' --with-build-config=spack
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (Spack GCC)


* [Bug fortran/109701] I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
From: thomas.meltzer1 at gmail dot com @ 2023-05-02 17:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

--- Comment #1 from Thomas Meltzer <thomas.meltzer1 at gmail dot com> ---
Could be related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99928 but I am
not sure. In my case the GPU offloading should be ignored.


* [Bug fortran/109701] I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
From: anlauf at gcc dot gnu.org @ 2023-05-02 19:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |openmp

--- Comment #2 from anlauf at gcc dot gnu.org ---
If I compile the code with -g -fsanitize=thread, I see a data race in the
second loop nest pointing to a possible issue with the reduction.


* [Bug fortran/109701] I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
From: anlauf at gcc dot gnu.org @ 2023-05-02 19:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

--- Comment #3 from anlauf at gcc dot gnu.org ---
(In reply to Thomas Meltzer from comment #1)
> Could be related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99928 but I
> am not sure. In my case the GPU offloading should be ignored.

Replacing the line

  !$omp target teams if(is_GPU)     !<---- comment this

by

  !$omp target teams if(is_GPU) map(max_diff)

seems to make a difference for me.  So it might be related to pr99928.


* [Bug fortran/109701] I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
From: thomas.meltzer1 at gmail dot com @ 2023-05-03  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

--- Comment #4 from Thomas Meltzer <thomas.meltzer1 at gmail dot com> ---
Thanks, adding map(max_diff) works for me, but my guess is that this should not
be required and that there is a potential bug.


* [Bug fortran/109701] I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
From: jakub at gcc dot gnu.org @ 2023-05-03 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
This is a user error.
The OpenMP standard maps reduction variables on target only when the reduction
clause is on a construct combined with the target construct, which is not the
case here.
If you used
!$omp target teams distribute parallel do simd if(target:is_GPU)
reduction(max:max_diff) collapse(2)
then max_diff would be mapped.  But as the reduction is on a different
construct, the standard rules apply there and max_diff is firstprivatized
instead (as it is a scalar, no defaultmap clause is used, etc.).  So, if that
isn't what you want, you need to map it explicitly, e.g. with
map(tofrom:max_diff) on the target.
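
Written out in free-form source, the combined directive suggested above needs a
continuation marker, because the mail wraps it over two lines. The following is
a minimal sketch, assuming the arrays from the report and a compiler that
applies the mapping rule for reductions on combined target constructs. The
program name, the map(to:) clause and the final check are additions for
illustration and are not part of the original test case.

```fortran
program combined_reduction
  implicit none

  integer, parameter :: n = 3
  integer            :: i, j
  real               :: a(n, n), b(n, n), max_diff
  logical            :: is_gpu

  ! Host fallback, as in the report when USEGPU is not defined.
  is_gpu = .false.

  ! Fill the arrays serially; only the reduction loop is of interest here.
  do j = 1, n
    do i = 1, n
      a(i, j) = i*j
      b(i, j) = i*j*0.9
    end do
  end do

  max_diff = 0.0
  ! The reduction clause sits on a construct combined with target, so
  ! max_diff is expected to be mapped rather than firstprivatized.
  !$omp target teams distribute parallel do simd if(target: is_gpu) &
  !$omp& reduction(max: max_diff) collapse(2) map(to: a, b)
  do j = 1, n
    do i = 1, n
      max_diff = max(max_diff, abs(b(i, j) - a(i, j)))
    end do
  end do
  !$omp end target teams distribute parallel do simd

  write (*, '("max_diff = ", F6.3)') max_diff

  ! Illustrative check (an assumption, not in the report): the largest
  ! difference is at i = j = 3, where 9 - 8.1 is roughly 0.9.
  if (abs(max_diff - 0.9) > 1.0e-3) error stop "max_diff was not copied back"
end program combined_reduction
```

If the reduction result is lost, the program stops with an error instead of
silently printing 0.000.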


* [Bug fortran/109701] I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
From: thomas.meltzer1 at gmail dot com @ 2023-05-04 16:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

--- Comment #6 from tommelt <thomas.meltzer1 at gmail dot com> ---
Thank you.

Interestingly, I tried your suggestion:
"!$omp target teams distribute parallel do simd if(target:is_GPU)
reduction(max:max_diff) collapse(2)"

It works with gfortran 12.2.0, but it does not work with:
* gfortran 11.3.0
* gfortran 10.3.0


* [Bug fortran/109701] I have a MWE where an omp reduction breaks if I add the option for GPU offloading (even if it isn't used).
From: jakub at gcc dot gnu.org @ 2023-05-04 16:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109701

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That is not that surprising.  As mentioned in the linked PR99928, that
behavior is there only in OpenMP 5.0; it wasn't like that in OpenMP 4.5 and
earlier.  In OpenMP 4.5 you really need a separate target (or target teams)
with an explicit map clause and then distribute ... with reduction.  This part
of OpenMP 5.0 was only implemented in GCC starting with GCC 12; older versions
support the 4.5 behavior.
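
The OpenMP 4.5 style described above can be sketched as follows: target teams
stays a separate construct, the reduction variable is mapped explicitly, and
the reduction clause goes on the inner distribute ... loop. This is a minimal
sketch under the same assumptions as the report (3x3 arrays, host fallback via
the if clause); the program name and the final check are illustrative
additions. It only mirrors the structure discussed in this thread and does not
address how partial results from several teams are combined.

```fortran
program explicit_map_reduction
  implicit none

  integer, parameter :: n = 3
  integer            :: i, j
  real               :: a(n, n), b(n, n), max_diff
  logical            :: is_gpu

  ! Host fallback, as in the report when USEGPU is not defined.
  is_gpu = .false.

  ! Fill the arrays serially; only the reduction loop is of interest here.
  do j = 1, n
    do i = 1, n
      a(i, j) = i*j
      b(i, j) = i*j*0.9
    end do
  end do

  max_diff = 0.0
  ! Without map(tofrom: max_diff) the scalar would be firstprivate on the
  ! target region and the result would not be copied back.
  !$omp target teams if(is_gpu) map(to: a, b) map(tofrom: max_diff)
  !$omp distribute parallel do simd reduction(max: max_diff) collapse(2)
  do j = 1, n
    do i = 1, n
      max_diff = max(max_diff, abs(b(i, j) - a(i, j)))
    end do
  end do
  !$omp end distribute parallel do simd
  !$omp end target teams

  write (*, '("max_diff = ", F6.3)') max_diff

  ! Illustrative check (an assumption, not in the report): the largest
  ! difference is at i = j = 3, where 9 - 8.1 is roughly 0.9.
  if (abs(max_diff - 0.9) > 1.0e-3) error stop "max_diff was not copied back"
end program explicit_map_reduction
```

This is the same change that comment #3 arrived at with map(max_diff), with
the map type spelled out.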
