public inbox for gcc-bugs@sourceware.org
* [Bug libgomp/95150] New: Some offloaded programs crash with openmp
@ 2020-05-15 11:36 chinoune.mehdi at hotmail dot com
2020-05-15 14:10 ` [Bug libgomp/95150] " burnus at gcc dot gnu.org
` (9 more replies)
0 siblings, 10 replies; 11+ messages in thread
From: chinoune.mehdi at hotmail dot com @ 2020-05-15 11:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
Bug ID: 95150
Summary: Some offloaded programs crash with openmp
Product: gcc
Version: 10.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: chinoune.mehdi at hotmail dot com
CC: jakub at gcc dot gnu.org
Target Milestone: ---
This is the reduced program:
$ cat matmul.F90
program main
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  integer, parameter :: l = 1024, m = 1024, n = 1024
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: i, j, k, t1, t2
  real(sp) :: tic
  !
  call system_clock( t1, tic)
  !
  allocate( a(l,m), b(m,n), c(l,n) )
  !
  call random_number(a)
  call random_number(b)
  c = 0._sp
  !
  !$acc data copyin(a,b) copyout(c)
  !$acc parallel loop collapse(3)
  !$omp target teams distribute collapse(3) map( to:a,b ) map( tofrom:c)
  do j = 1, n
    do k = 1, m
      do i = 1, l
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !$acc end data
  !
  call system_clock(t2)
  print*, (t2-t1)/tic
  !
end program main
This program compiles successfully with both OpenMP and OpenACC, but with
OpenMP it crashes after running for a short time, with this error message:
$ gfortran-10 -fopenmp -foffload=nvptx-none="-lm -lgfortran" matmul.F90 -o test.x
$ ./test.x
libgomp: cuCtxSynchronize error: the launch timed out and was terminated
The same message appears with gfortran-9.
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: burnus at gcc dot gnu.org @ 2020-05-15 14:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
--- Comment #1 from Tobias Burnus <burnus at gcc dot gnu.org> ---
* Your compilation uses "-O0" – I do not know whether that's intended.
* I did not see any timeout message although it did take a while to run
with offloading. (See timing results below.)
I wonder what causes the problem you are seeing.
You could try whether setting the environment variable
GOMP_DEBUG=1
shows some useful details for the launch.
* The OpenACC test case is wrong as "c" has to be "copy" not "copyout"
as the initial value is used (→ NaN)
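
To illustrate the point about "copy" versus "copyout": the loop body reads c(i,j) before writing it, so the device copy of "c" needs the host's initial values. The following is a minimal sketch, not the reporter's test case. The program name, the small size n = 64, the tolerance, and the choice to keep the k loop sequential inside each (i,j) iteration are assumptions made only for this illustration. Without -fopenacc the directives are plain comments and the program runs on the host.

```fortran
program copy_vs_copyout
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  integer, parameter :: n = 64        ! small size, chosen only for this sketch
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: i, j, k
  !
  allocate( a(n,n), b(n,n), c(n,n) )
  call random_number(a)
  call random_number(b)
  c = 0._sp                           ! initial value that the loop reads
  !
  ! copy(c): c is transferred to the device on entry and back on exit.
  ! With copyout(c) the device copy would start with undefined contents.
  !$acc data copyin(a,b) copy(c)
  !$acc parallel loop collapse(2)
  do j = 1, n
    do i = 1, n
      ! k stays sequential here, so each c(i,j) is updated by one iteration
      do k = 1, n
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !$acc end data
  !
  ! Compare against the intrinsic matmul evaluated on the host.
  if ( maxval( abs( c - matmul(a,b) ) ) > 1.e-3_sp ) error stop "mismatch"
  print *, "ok"
end program copy_vs_copyout
```

The OpenMP directive in the report already maps "c" in both directions with map( tofrom:c ), which is the counterpart of "copy" here.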
On the technical side, at startup, one calls:
cuLaunchKernel
and when that has succeeded, one calls
cuCtxSynchronize
and if that fails, the error message is printed with
cuda_error
which shows the time-out message:
libgomp: cuCtxSynchronize error: the launch timed out and was terminated
I added a ", sum(c)" to the print output and did some tests
(columns: flags, time in seconds, sum(c)):
On AMDGCN:
  -O0                                   3.56800008     268048112.
  -Ofast                                0.109999999    268698816.
  -fopenmp -O0                          193.227997     268186448.
  -fopenmp -Ofast                       43.1559982     268455872.
  -fopenacc -O0                         186.399002     268531136.
  -fopenacc -Ofast                      43.4970016     268206464.
  -fopenmp -foffload=disable -O0        7.27299976     268241776.
  -fopenmp -foffload=disable -Ofast     1.49000001     268171680.
On NVidia:
  -O0                                   8.00599957     268253520.
  -Ofast                                0.254999995    268399056.
  -fopenmp -O0                          64.2089996     268092608.
  -fopenmp -Ofast                       33.6360016     268359952.
  -fopenacc -O0                         0.861999989    NaN (see note)
  -fopenacc -Ofast                      0.300000012    NaN (see note)
  -fopenmp -foffload=disable -O0        15.2220001     268511968.
  -fopenmp -foffload=disable -Ofast     3.52900004     268573568.
  -fopenacc -foffload=disable -O0       14.5790005     268442496.
  -fopenacc -foffload=disable -Ofast    4.41099977     268511968.
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: chinoune.mehdi at hotmail dot com @ 2020-05-15 15:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
--- Comment #2 from Chinoune <chinoune.mehdi at hotmail dot com> ---
Created attachment 48546
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48546&action=edit
debug output
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: chinoune.mehdi at hotmail dot com @ 2020-05-15 15:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
--- Comment #3 from Chinoune <chinoune.mehdi at hotmail dot com> ---
(In reply to Tobias Burnus from comment #1)
> * Your compilation uses "-O0" – I do not know whether that's intended.
I didn't set any optimization flag; maybe the compiler defaults to "-O0".
>
> * I did not see any timeout message although it did take a while to run
> with offloading. (See timing results below.)
> I wonder what causes the problem you are seeing.
>
> You could try whether setting the environment variable
> GOMP_DEBUG=1
> shows some useful details for the launch.
>
I have attached the output with GOMP_DEBUG=1.
> * The OpenACC test case is wrong as "c" has to be "copy" not "copyout"
> as the initial value is used (→ NaN)
Thanks, I noticed that after I reported the bug.
I am using a Kepler (sm_35) graphics card, if this helps.
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: chinoune.mehdi at hotmail dot com @ 2020-05-21 7:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
--- Comment #4 from Chinoune <chinoune.mehdi at hotmail dot com> ---
After some tests, it looks like it fails only with small sizes.
The program doesn't crash when the matrix size is increased, and it takes
less time to execute!
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-10-30 9:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
Chinoune <mehdi.chinoune at hotmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |WONTFIX
--- Comment #5 from Chinoune <mehdi.chinoune at hotmail dot com> ---
Won't fix.
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-12 20:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
Chinoune <mehdi.chinoune at hotmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|WONTFIX |---
Status|RESOLVED |UNCONFIRMED
--- Comment #6 from Chinoune <mehdi.chinoune at hotmail dot com> ---
Reopen, as I have reproduced the same crash with another GPU.
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-13 3:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
Chinoune <mehdi.chinoune at hotmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail|10.1.0 |10.2.0
Keywords| |openacc
Version|10.1.0 |10.2.0
--- Comment #7 from Chinoune <mehdi.chinoune at hotmail dot com> ---
With OpenACC, I got a similar message:
libgomp: cuStreamSynchronize error: the launch timed out and was terminated
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-13 4:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
Chinoune <mehdi.chinoune at hotmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |WONTFIX
--- Comment #8 from Chinoune <mehdi.chinoune at hotmail dot com> ---
Adding "parallel do" to the OpenMP directive solves the problem.
The crash reappears with "collapse(2)", with both OpenMP and OpenACC.
program main
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:)
  character( len=5 ) :: val
  integer :: n, l, m
  integer :: i, j, k
  integer :: t1, t2
  real(sp) :: tic
  !
  call get_command_argument( 1, val )
  read( val, *) n
  l = n
  m = n
  !
  call system_clock( t1, tic)
  !
  allocate( a(l,m), b(m,n), c(l,n) )
  !
  call random_number(a)
  call random_number(b)
  c = 0._sp
  !
  !$acc data copyin(a,b) copy(c)
  !$acc parallel loop collapse(3)
  !$omp target teams distribute parallel do collapse(3) map( to:a,b ) map( tofrom:c )
  do j = 1, n
    do k = 1, m
      do i = 1, l
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !$acc end data
  !
  call system_clock(t2)
  print*, n, (t2-t1)/tic, sum(c)
  !
end program main
$ gfortran -O3 -fopenmp -foffload=nvptx-none matmul.f90 -o test.x
$ for i in {1..5}; do ./test.x $((512*2**$i)); done
1024 0.287999988 268377424.
2048 7.40000010E-02 0.00000000
4096 0.170000002 0.00000000
8192 0.574000001 0.00000000
16384 2.10400009 0.00000000
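
One piece of arithmetic that may be worth checking against these numbers: with collapse(3) the three loops form a single iteration space of n**3 iterations. The sketch below only computes that count for the sizes used above and compares it with the largest default 32-bit integer. Whether the compiler or runtime uses a 32-bit counter for the collapsed loop is an assumption that nothing in this report confirms; the program name and variable names are made up for the illustration.

```fortran
program collapsed_trip_count
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  integer(int64) :: n, trips
  integer :: p
  !
  do p = 1, 5
    n = 512_int64 * 2_int64**p        ! same sizes as the shell loop above
    trips = n**3                      ! iterations of the collapsed j,k,i nest
    ! last column: does the count fit in a 32-bit signed integer?
    print '(i6, i16, 2x, l1)', n, trips, trips <= int( huge(0_int32), int64 )
  end do
end program collapsed_trip_count
```

Only n = 1024 (2**30 iterations) fits in 32 bits; from n = 2048 on (2**33 and more) the count does not, which is also where sum(c) drops to zero in the output above. This is a coincidence to investigate, not a diagnosis.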
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2020-12-19 12:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
Chinoune <mehdi.chinoune at hotmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |UNCONFIRMED
Resolution|WONTFIX |---
--- Comment #9 from Chinoune <mehdi.chinoune at hotmail dot com> ---
I get it with more examples.
* [Bug libgomp/95150] Some offloaded programs crash with openmp
From: mehdi.chinoune at hotmail dot com @ 2021-07-28 8:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150
Chinoune <mehdi.chinoune at hotmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |WONTFIX
Status|UNCONFIRMED |RESOLVED
--- Comment #10 from Chinoune <mehdi.chinoune at hotmail dot com> ---
No one has the intention to fix it.