public inbox for gcc-bugs@sourceware.org
* [Bug libgomp/95150] New: Some offloaded programs crash with openmp
@ 2020-05-15 11:36 chinoune.mehdi at hotmail dot com
  2020-05-15 14:10 ` [Bug libgomp/95150] " burnus at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: chinoune.mehdi at hotmail dot com @ 2020-05-15 11:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

            Bug ID: 95150
           Summary: Some offloaded programs crash with openmp
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: chinoune.mehdi at hotmail dot com
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

This is the reduced program:

$ cat matmul.F90
program main
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  integer, parameter :: l = 1024, m = 1024, n = 1024
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: i, j, k, t1, t2
  real(sp) :: tic
  !
  call system_clock( t1, tic)
  !
  allocate( a(l,m), b(m,n), c(l,n) )
  !
  call random_number(a)
  call random_number(b)
  c = 0._sp
  !
  !$acc data copyin(a,b) copyout(c)
  !$acc parallel loop collapse(3)
  !$omp target teams distribute collapse(3) map( to:a,b ) map( tofrom:c)
  do j = 1, n
    do k = 1, m
      do i = 1, l
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !$acc end data
  !
  call system_clock(t2)
  print*, (t2-t1)/tic
  !
end program main

This program compiles successfully with both OpenMP and OpenACC, but with
OpenMP it crashes after running for a short time, printing this error message:

$ gfortran-10 -fopenmp -foffload=nvptx-none="-lm -lgfortran" matmul.F90 -o test.x
$ ./test.x

libgomp: cuCtxSynchronize error: the launch timed out and was terminated

The same message appears with gfortran-9.


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: burnus at gcc dot gnu.org @ 2020-05-15 14:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

--- Comment #1 from Tobias Burnus <burnus at gcc dot gnu.org> ---
* Your compilation uses "-O0" – I do not know whether that's intended.

* I did not see any timeout message although it did take a while to run
  with offloading. (See timing results below.)
  I wonder what causes the problem you are seeing.

  You could try whether setting the environment variable
    GOMP_DEBUG=1
  shows some useful details for the launch.

* The OpenACC test case is wrong as "c" has to be "copy" not "copyout"
  as the initial value is used (→ NaN)
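
A minimal sketch of the "copy" fix from the last point, assuming a small fixed
size (n = 64 is an arbitrary choice made here) so that it runs quickly. Unlike
the reported program, it collapses only the j and i loops and keeps the k loop
sequential, so that each c(i,j) is accumulated by a single iteration; that
restructuring and the program name are choices made for this illustration, not
something proposed in this report. Without -fopenacc the directives are plain
comments and the loop nest simply runs on the host.

```fortran
program copy_clause_sketch
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  integer, parameter :: n = 64          ! arbitrary small size (assumption)
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: i, j, k
  !
  allocate( a(n,n), b(n,n), c(n,n) )
  call random_number(a)
  call random_number(b)
  c = 0._sp
  !
  ! c is read before it is written (c = ... + c), so its host value has to be
  ! transferred to the device: "copy" rather than "copyout".
  !$acc data copyin(a,b) copy(c)
  !$acc parallel loop collapse(2)
  do j = 1, n
    do i = 1, n
      do k = 1, n                        ! sequential accumulation into c(i,j)
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !$acc end data
  !
  ! With "copyout", c would start uninitialized on the device and the sum
  ! could come out as NaN, as in the OpenACC rows of the timings.
  print *, sum(c)
end program copy_clause_sketch
```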

On the technical side, to start the kernel, libgomp calls:
  cuLaunchKernel
and when that has succeeded, it calls
  cuCtxSynchronize
and if that fails, the error message is printed with
  cuda_error
which shows the time-out message:
  libgomp: cuCtxSynchronize error: the launch timed out and was terminated


I added a ", sum(c)" to the print output and did some tests:

On AMDGCN:
Flags                                     Time (s)       sum(c)
== -O0 ==                                 3.56800008     268048112.
== -Ofast ==                              0.109999999    268698816.
== -fopenmp -O0 ==                        193.227997     268186448.
== -fopenmp -Ofast ==                     43.1559982     268455872.
== -fopenacc -O0 ==                       186.399002     268531136.
== -fopenacc -Ofast ==                    43.4970016     268206464.
== -fopenmp -foffload=disable -O0 ==      7.27299976     268241776.
== -fopenmp -foffload=disable -Ofast ==   1.49000001     268171680.


On NVidia:
Flags                                     Time (s)       sum(c)
== -O0 ==                                 8.00599957     268253520.
== -Ofast ==                              0.254999995    268399056.
== -fopenmp -O0 ==                        64.2089996     268092608.
== -fopenmp -Ofast ==                     33.6360016     268359952.
== -fopenacc -O0 ==                       0.861999989    NaN (see the copyout note above)
== -fopenacc -Ofast ==                    0.300000012    NaN (see the copyout note above)
== -fopenmp -foffload=disable -O0 ==      15.2220001     268511968.
== -fopenmp -foffload=disable -Ofast ==   3.52900004     268573568.
== -fopenacc -foffload=disable -O0 ==     14.5790005     268442496.
== -fopenacc -foffload=disable -Ofast ==  4.41099977     268511968.
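
The sum(c) column is only a coarse check of the result. A stricter check, as a
sketch, compares the offloaded result elementwise with the host intrinsic
matmul. The program name, the size n = 128 and the c_ref array are assumptions
made for this illustration; the k loop is kept sequential here so that each
c(i,j) is written by one iteration, which differs from the reported
collapse(3) nest. Without -fopenmp the directive is a plain comment.

```fortran
program verify_offload
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  integer, parameter :: n = 128          ! arbitrary small size (assumption)
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:), c_ref(:,:)
  integer :: i, j, k
  !
  allocate( a(n,n), b(n,n), c(n,n), c_ref(n,n) )
  call random_number(a)
  call random_number(b)
  c = 0._sp
  !
  !$omp target teams distribute parallel do collapse(2) map( to:a,b ) map( tofrom:c )
  do j = 1, n
    do i = 1, n
      do k = 1, n                        ! sequential accumulation into c(i,j)
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !
  c_ref = matmul(a, b)                   ! host reference result
  ! A largest difference at rounding level indicates the offloaded loop
  ! really computed the product; a large value or NaN indicates it did not.
  print *, sum(c), maxval(abs(c - c_ref))
end program verify_offload
```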


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: chinoune.mehdi at hotmail dot com @ 2020-05-15 15:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

--- Comment #2 from Chinoune <chinoune.mehdi at hotmail dot com> ---
Created attachment 48546
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48546&action=edit
debug output


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: chinoune.mehdi at hotmail dot com @ 2020-05-15 15:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

--- Comment #3 from Chinoune <chinoune.mehdi at hotmail dot com> ---
(In reply to Tobias Burnus from comment #1)
> * Your compilation uses "-O0" – I do not know whether that's intended.
I didn't set any optimization flag; maybe the compiler defaults to "-O0".

>  
> * I did not see any timeout message although it did take a while to run
>   with offloading. (See timing results below.)
>   I wonder what causes the problem you are seeing.
> 
>   You could try whether setting the environment variable
>     GOMP_DEBUG=1
>   shows some useful details for the launch.
> 
I have attached the output with GOMP_DEBUG=1

> * The OpenACC test case is wrong as "c" has to be "copy" not "copyout"
>   as the initial value is used (→ NaN)
Thanks, I noticed that after I reported the bug.

I am using a Kepler (sm_35) graphics card, if this helps.


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: chinoune.mehdi at hotmail dot com @ 2020-05-21  7:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

--- Comment #4 from Chinoune <chinoune.mehdi at hotmail dot com> ---
After some tests, it looks like it fails only with small sizes.
The program doesn't crash when the matrix size is increased, and it takes a
shorter time to execute!


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-10-30  9:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune <mehdi.chinoune at hotmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #5 from Chinoune <mehdi.chinoune at hotmail dot com> ---
Won't fix.


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-12 20:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune <mehdi.chinoune at hotmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |---
             Status|RESOLVED                    |UNCONFIRMED

--- Comment #6 from Chinoune <mehdi.chinoune at hotmail dot com> ---
Reopening, as I have reproduced the same crash with another GPU.


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-13  3:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune <mehdi.chinoune at hotmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|10.1.0                      |10.2.0
           Keywords|                            |openacc
            Version|10.1.0                      |10.2.0

--- Comment #7 from Chinoune <mehdi.chinoune at hotmail dot com> ---
With OpenACC, I got a similar message:

libgomp: cuStreamSynchronize error: the launch timed out and was terminated


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-13  4:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune <mehdi.chinoune at hotmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #8 from Chinoune <mehdi.chinoune at hotmail dot com> ---
Adding "parallel do" to the OpenMP directive solves the problem.
The crash reappears with "collapse(2)", with both OpenMP and OpenACC.

program main
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:)
  character( len=5 ) :: val
  integer :: n, l, m
  integer :: i, j, k
  integer :: t1, t2
  real(sp) :: tic
  !
  call get_command_argument( 1, val )
  read( val, *) n
  l = n
  m = n
  !
  call system_clock( t1, tic)
  !
  allocate( a(l,m), b(m,n), c(l,n) )
  !
  call random_number(a)
  call random_number(b)
  c = 0._sp
  !
  !$acc data copyin(a,b) copy(c)
  !$acc parallel loop collapse(3)
  !$omp target teams distribute parallel do collapse(3) map( to:a,b ) map( tofrom:c )
  do j = 1, n
    do k = 1, m
      do i = 1, l
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !$acc end data
  !
  call system_clock(t2)
  print*, n, (t2-t1)/tic, sum(c)
  !
end program main

$ gfortran -O3 -fopenmp -foffload=nvptx-none matmul.f90 -o test.x
$ for i in {1..5}; do ./test.x $((512*2**$i)); done
        1024  0.287999988       268377424.
        2048   7.40000010E-02   0.00000000
        4096  0.170000002       0.00000000
        8192  0.574000001       0.00000000
       16384   2.10400009       0.00000000
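
As a rough plausibility check on the last column: a and b are filled by
random_number, i.e. with uniform values in [0,1), so each product
a(i,k)*b(k,j) averages about 1/4 and sum(c) should be near n**3/4. For
n = 1024 that is 268435456, close to the 268377424 printed above, whereas the
0.00000000 reported for the larger sizes is far from that estimate. The sketch
below only evaluates the estimate for the sizes used in the shell loop; the
program name and the use of 64-bit integers are choices made here.

```fortran
program expected_sums
  use iso_fortran_env, only: int64
  implicit none
  integer :: i
  integer(int64) :: n
  !
  ! Same sizes as the shell loop above: 512*2**i for i = 1..5
  do i = 1, 5
    n = 512_int64 * 2_int64**i
    ! n**3 products, each averaging about 1/4 for uniform [0,1) inputs;
    ! 64-bit integers are needed because n**3 exceeds 2**31 from n = 2048 on.
    print *, n, n**3 / 4_int64
  end do
end program expected_sums
```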


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-19 12:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune <mehdi.chinoune at hotmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|WONTFIX                     |---

--- Comment #9 from Chinoune <mehdi.chinoune at hotmail dot com> ---
I get it with more examples.


* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2021-07-28  8:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune <mehdi.chinoune at hotmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #10 from Chinoune <mehdi.chinoune at hotmail dot com> ---
No one has the intention to fix it.

