public inbox for gcc-bugs@sourceware.org
* [Bug fortran/51591] New: Strange output from STOP statement in OpenMP region
@ 2011-12-16 22:31 longb at cray dot com
2011-12-17 10:21 ` [Bug fortran/51591] " burnus at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: longb at cray dot com @ 2011-12-16 22:31 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
Bug #: 51591
Summary: Strange output from STOP statement in OpenMP region
Classification: Unclassified
Product: gcc
Version: 4.6.2
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: fortran
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: longb@cray.com
> cat testc.c
#include <unistd.h>
/* extern unsigned int sleep (unsigned int __seconds); */
int sleepc_ (unsigned int *sec)
{
  sleep(*sec);
  return 0;
}
> cat test.f90
use omp_lib
implicit none
integer i
print *,"Hello World"
call omp_set_num_threads(5)
!$omp parallel
!$omp do schedule(static,1)
do i=1,omp_get_num_threads()
   !$omp critical
   print *, "I am",omp_get_thread_num()," of",omp_get_num_threads()
   !$omp end critical
   select case (omp_get_thread_num())
   case (0)
      call sleep (1)
      stop 0
   case (1)
      stop 1
   case (2)
      stop 2
   case (3)
      stop 3
   case default
      stop
   end select
enddo
!$omp end do
!$omp barrier
!$omp end parallel
end
> cc -c testc.c
> ftn -fopenmp test.f90 testc.o
Sometimes output looks OK:
> aprun -n1 -d5 ./a.out
Hello World
STOP 1
I am 1 of 5
I am 2 of 5
Application 5777837 exit codes: 1
Application 5777837 resources: utime ~0s, stime ~0s
But more often there is some garbled text output:
> aprun -n1 -d5 ./a.out
Hello World
STOP 1
I am 1 of 5
0im `m5 <<<<----- What's this?
Application 5777838 exit codes: 1
Application 5777838 resources: utime ~0s, stime ~0s
<Nice to see that the STOP 1 results in an exit code of 1, though - new F08
feature.>
* [Bug fortran/51591] Strange output from STOP statement in OpenMP region
From: burnus at gcc dot gnu.org @ 2011-12-17 10:21 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
Tobias Burnus <burnus at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jb at gcc dot gnu.org
--- Comment #1 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-17 09:39:05 UTC ---
(In reply to comment #0)
> > cat testc.c
> int sleepc_ (unsigned int *sec)
This function is actually not used; gfortran's intrinsic "sleep" is called
instead. (The Fortran program calls "sleep", not "sleepc".)
> Sometimes output looks OK:
> But more often there is some garbled text output:
> 0im `m5 <<<<----- What's this?
I can reproduce this - though for me the output is more often OK than garbled
(60% vs. 40% of the output of:
for((I=0;$I < 20; I++)); do ./a.out ; done
)
That's with GCC 4.6. In GCC 4.7, it works much more often (I have to run the
line above about 20 times, i.e. roughly every 400th run fails).
Additionally, in 4.7 I do not see garbled output but a segfault.
A backtrace of the core dump shows:
Program terminated with signal 11, Segmentation fault.
#0 _gfortrani_fbuf_flush (u=0x6055d0, mode=<optimized out>)
at /home/tob/projects/gcc-git/gcc/libgfortran/io/fbuf.c:166
166 if (u->fbuf->act > u->fbuf->pos && u->fbuf->pos > 0)
(gdb) bt
#1 0x00002b37379836bd in _gfortrani_next_record (dtp=0x2b373926dc50, done=1)
at /home/tob/projects/gcc-git/gcc/libgfortran/io/transfer.c:3397
#2 0x00002b3737983f79 in _gfortran_st_write_done (dtp=0x2b373926dc50)
at /home/tob/projects/gcc-git/gcc/libgfortran/io/transfer.c:3592
(gdb) p u->fbuf
$3 = (struct fbuf *) 0x7e7e7e7e7e7e7e7e
The value matches:
$ echo $MALLOC_PERTURB_
126
Thus, "fbuf" points to malloced memory, which has never been initialized.
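The link between the environment variable and the pointer value can be checked with a few lines of C. This is only a sketch of the byte arithmetic; the helper name perturb_pattern is hypothetical and not part of glibc or libgfortran. It assumes the behavior documented for M_PERTURB in mallopt(3): glibc fills freed blocks with the low byte of the value and freshly allocated blocks with its complement, so a run of 0x7e bytes may equally indicate memory that has already been freed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the 64-bit value seen when a pointer-sized
   field has every byte set to `byte`. */
uint64_t perturb_pattern(unsigned char byte)
{
    uint64_t v;
    assert(sizeof v == 8);
    memset(&v, byte, sizeof v);
    return v;
}

/* Per mallopt(3), fresh allocations are filled with the complement
   of the low byte of MALLOC_PERTURB_, freed blocks with the byte itself. */
unsigned char perturb_alloc_byte(unsigned char byte)
{
    return (unsigned char) (byte ^ 0xff);
}
```

With MALLOC_PERTURB_=126, perturb_pattern(126) gives the 0x7e7e7e7e7e7e7e7e value shown by gdb above, while the fill for a fresh allocation would be 0x81 bytes.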
> <Nice to see that the STOP 1 results in an exit code of 1, though - new F08
> feature.>
I think gfortran (like several other compilers) has done so for years; what is
new (since 4.6) is the support for constant character and integer expressions
for (error) stop.
* [Bug fortran/51591] Strange output from STOP statement in OpenMP region
From: jb at gcc dot gnu.org @ 2011-12-17 11:32 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
--- Comment #2 from Janne Blomqvist <jb at gcc dot gnu.org> 2011-12-17 11:27:40 UTC ---
Looks like some kind of race condition.
E.g. what about: STOP calls exit(), which leads to the library destructor being
called, which calls close_units(), which closes each open unit in the tree. But
somehow the print statement from another thread also thinks it has access to
the unit, and then tries to print something, which segfaults because the other
thread is in the process of shutting down the same unit?
Hmm, now that I quickly looked at the code, the above looks likely. So
close_units() acquires unit_lock (the global lock protecting the unit tree),
then closes each unit without acquiring the unit's own lock (u->lock).
For comparison, in normal IO statements, first we acquire unit_lock, find the
unit in the tree, acquire u->lock, then release unit_lock. Then we do the IO
with u->lock held, and finally release u->lock.
So it seems that it would be possible for the print statement to acquire the
u->lock before the close_units gets to lock unit_lock, and thus we have a race?
Of course, this is based on a very quick scan of the code, and I could be all
wrong. Perhaps Jakub knows better, as he designed the libgfortran locking
scheme?
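The lock ordering described above can be modelled with plain pthread mutexes. This is a minimal sketch under stated assumptions, not libgfortran code: struct unit, unit_init, io_begin, io_end and close_unit_tree_lock_only are hypothetical names, and the trylock is only a probe added here to make the window observable. It shows that holding unit_lock alone does not exclude an I/O statement that already holds u->lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a unit; not libgfortran's real structure. */
struct unit {
    pthread_mutex_t lock;   /* plays the role of u->lock */
    bool closed;
};

/* Plays the role of the global unit_lock protecting the unit tree. */
static pthread_mutex_t unit_lock = PTHREAD_MUTEX_INITIALIZER;

void unit_init(struct unit *u)
{
    int rc = pthread_mutex_init(&u->lock, NULL);
    assert(rc == 0);
    (void) rc;
    u->closed = false;
}

/* Start of an I/O statement: unit_lock, then u->lock, then drop unit_lock. */
void io_begin(struct unit *u)
{
    pthread_mutex_lock(&unit_lock);
    pthread_mutex_lock(&u->lock);
    pthread_mutex_unlock(&unit_lock);
}

/* End of an I/O statement: release u->lock. */
void io_end(struct unit *u)
{
    pthread_mutex_unlock(&u->lock);
}

/* Shutdown path as described: only unit_lock is held while closing.
   Returns true if u->lock was busy, i.e. I/O was still in flight. */
bool close_unit_tree_lock_only(struct unit *u)
{
    pthread_mutex_lock(&unit_lock);
    int rc = pthread_mutex_trylock(&u->lock);   /* probe only */
    assert(rc == 0 || rc == EBUSY);
    bool io_in_flight = (rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&u->lock);
    u->closed = true;                           /* closes regardless */
    pthread_mutex_unlock(&unit_lock);
    return io_in_flight;
}
```

If io_begin has run and io_end has not, close_unit_tree_lock_only still marks the unit closed and reports that I/O was in flight, which is the window suspected above.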
* [Bug fortran/51591] Strange output from STOP statement in OpenMP region
From: bdavis at gcc dot gnu.org @ 2012-02-03 22:09 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
Bud Davis <bdavis at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |bdavis at gcc dot gnu.org
--- Comment #3 from Bud Davis <bdavis at gcc dot gnu.org> 2012-02-03 22:08:10 UTC ---
Index: gcc/libgfortran/io/unit.c
===================================================================
--- gcc/libgfortran/io/unit.c (revision 183873)
+++ gcc/libgfortran/io/unit.c (working copy)
@@ -637,6 +637,7 @@
if (u->previous_nonadvancing_write)
finish_last_advance_record (u);
+ __gthread_mutex_lock (&u->lock);
rc = (u->s == NULL) ? 0 : sclose (u->s) == -1;
u->closed = 1;
As theorized, the above patch does seem to correct the problem with no
regressions in the testsuite.
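The idea behind the patch, taking the unit's own lock before closing it, can be illustrated outside libgfortran. The following is a sketch under stated assumptions: struct unit, writer, close_unit_locked and run_locked_close_demo are hypothetical names, the semaphore exists only to force the interesting interleaving, and unlike the hunk shown above the sketch releases the lock after closing.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical stand-in for a unit; not libgfortran's real structure. */
struct unit {
    pthread_mutex_t lock;    /* plays the role of u->lock */
    sem_t io_started;        /* writer tells the closer it holds the lock */
    bool closed;
    bool wrote_while_open;   /* recorded by the writer for inspection */
};

/* Writer: takes the unit lock, announces it, then does its "I/O". */
static void *writer(void *arg)
{
    struct unit *u = arg;
    pthread_mutex_lock(&u->lock);
    sem_post(&u->io_started);
    u->wrote_while_open = !u->closed;   /* I/O happens under u->lock */
    pthread_mutex_unlock(&u->lock);
    return NULL;
}

/* Closer: waits for in-flight I/O by taking u->lock before closing. */
static void close_unit_locked(struct unit *u)
{
    pthread_mutex_lock(&u->lock);
    u->closed = true;
    pthread_mutex_unlock(&u->lock);
}

/* Returns true if the writer finished before the unit was closed. */
bool run_locked_close_demo(void)
{
    struct unit u = { .closed = false, .wrote_while_open = false };
    pthread_t t;
    int rc;

    rc = pthread_mutex_init(&u.lock, NULL);
    assert(rc == 0);
    rc = sem_init(&u.io_started, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, writer, &u);
    assert(rc == 0);
    (void) rc;

    sem_wait(&u.io_started);    /* writer acquired u.lock before this returns */
    close_unit_locked(&u);      /* blocks until the writer releases it */
    pthread_join(t, NULL);

    sem_destroy(&u.io_started);
    pthread_mutex_destroy(&u.lock);
    return u.wrote_while_open && u.closed;
}
```

Because the closer cannot set closed until the writer has released the lock, the writer always observes an open unit in this model. The sketch does not cover threads that start an I/O statement after the unit has been closed.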
* [Bug fortran/51591] Strange output from STOP statement in OpenMP region
From: bdavis at gcc dot gnu.org @ 2013-05-11 17:09 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
--- Comment #4 from Bud Davis <bdavis at gcc dot gnu.org> ---
Upon closer reflection, the underlying problem is the OpenMP threads doing I/O
while the units are being closed.
So the STOP message shows in the output, followed by output from threads whose
units have already been destroyed while the exit() handler has not yet
terminated.
* [Bug fortran/51591] Strange output from STOP statement in OpenMP region
From: dominiq at lps dot ens.fr @ 2015-10-20 14:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
Dominique d'Humieres <dominiq at lps dot ens.fr> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
Last reconfirmed| |2015-10-20
Ever confirmed|0 |1
--- Comment #5 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> Upon closer reflection, the underlying problem is the OpenMP threads doing
> I/O while the units are being closed.
> So, stop shows in the output, followed by output from threads whose units
> have been destroyed, but the call to exit() handler has not yet terminated.
Any progress after more than two years?
* [Bug fortran/51591] Strange output from STOP statement in OpenMP region
From: dominiq at lps dot ens.fr @ 2020-07-30 15:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
Dominique d'Humieres <dominiq at lps dot ens.fr> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |RESOLVED
Resolution|--- |WORKSFORME
--- Comment #8 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> Jakub, do you know what the OMP standard has to say on this?
> Is "STOP 1" in an OMP region defined behavior?
No answer after more than one year. Closing.
Open a new PR if the problem is still there.
* [Bug fortran/51591] Strange output from STOP statement in OpenMP region
From: jakub at gcc dot gnu.org @ 2020-07-30 15:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51591
--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
OpenMP just says that a structured block
"may contain STOP or ERROR STOP statements."
and nothing else; the particular behavior of STOP is either covered by the
base language or left to the implementation.