* [Bug fortran/56981] Slow I/O: Unformatted 5x slower, large sys component; formatted slow as well
From: burnus at gcc dot gnu.org @ 2013-04-16 15:20 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
Tobias Burnus <burnus at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org
--- Comment #1 from Tobias Burnus <burnus at gcc dot gnu.org> 2013-04-16 15:20:20 UTC ---
For unformatted, pathf95 and g95 have (top 10 entries of strace):
      3 brk               4 close
      3 getrlimit         4 ioctl
      8 read              4 rt_sigaction
     10 close             6 fstat
     13 fstat             6 mprotect
     20 mprotect          7 brk
     21 stat             10 mmap
     22 mmap             17 stat
     46 open             23 open
    367 write           734 write
gfortran has the following syscalls, which are invoked at least twice:
      3 brk
      5 read
      7 close
     10 rt_sigaction
     11 access
     11 fstat
     12 mprotect
     16 stat
     18 mmap
     26 open
2000000 lseek
4000000 write
From: jvdelisle at gcc dot gnu.org @ 2013-04-17 0:58 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #2 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> 2013-04-17 00:58:02 UTC ---
There is a seek inside next_record_w_unf. That function is used for DIRECT I/O.
Looks conceptually wrong to me for sequential unformatted. I won't have time
for a few days to look at this further.
From: jb at gcc dot gnu.org @ 2013-04-17 10:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #4 from Janne Blomqvist <jb at gcc dot gnu.org> 2013-04-17 10:50:07 UTC ---
The reason why gfortran is slow here is that for non-regular files we use
unbuffered I/O. If you write to a regular file instead of /dev/null, you'll see
us doing ~8 KB writes at a time. On my system, timing writing to /dev/null
gives
real 0m0.727s
user 0m0.272s
sys 0m0.452s
whereas writing to a file gives
real 0m0.202s
user 0m0.180s
sys 0m0.020s
The reason for this is that non-regular files (a.k.a. special files) are
special in many ways wrt seeking. Some allow seeking just fine, some always
return 0, some return an error (and which special files behave in which way is
to some extent different on different OS'es). As the buffered IO keeps track of
the logical file pointer position, it can easily get out of sync with the
physical position if it doesn't behave as for a regular file.
Also, for special files users often expect non-buffered IO, e.g. they want
output on the terminal directly instead of waiting until the 8 KB buffer fills
up, programs communicating via pipes can deadlock if data sits in the buffers,
etc. One could of course make "unbuffered" I/O in gfortran really mean "flush
the buffer at the end of each I/O statement" rather than not using a buffer at
all and instead using the raw POSIX I/O syscalls. This would perhaps not be a
bad idea per se, but would require making the buffered I/O code handle special
files in some sensible way.
Another reason for gfortran slowness is that we do quite a lot of checking in
data_transfer_init(), which means that there's quite a lot of per-record
overhead. Writing a single element unformatted is thus the worst case. One way
to speed up data_transfer_init, I think, is that instead of checking each flag
bit (which says which I/O specifiers are present) separately, create a variable
with forbidden flags for each I/O type (unformatted/formatted,
sequential/direct/stream => 6x), and check the entire flag variable once
((flag & forbidden_flags) == 0). Only if there is an error, do the bit-by-bit checking
in order to generate the error message.
From: burnus at gcc dot gnu.org @ 2013-04-17 14:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #5 from Tobias Burnus <burnus at gcc dot gnu.org> 2013-04-17 14:50:16 UTC ---
(In reply to comment #4)
> The reason why gfortran is slow here is that for non-regular files we use
> unbuffered I/O. If you write to a regular file instead of /dev/null, you'll
> see us doing ~8 KB writes at a time.
>
> The reason for this is that non-regular files (a.k.a. special files) are
> special in many ways wrt seeking. Some allow seeking just fine, some always
> return 0, some return an error (and which special files behave in which way is
> to some extent different on different OS'es).
I do not understand the argument regarding seek. If seek doesn't work - why
should there be a problem with buffering but not without? At least with
SEQUENTIAL one cannot do without (buffer exceeded or no buffering) and with
STREAM no seek should be required.
> Also, for special files users often expect non-buffered IO, e.g. they want
> output on the terminal directly instead of waiting until the 8 KB buffer fills
> up, programs communicating via pipes can deadlock if data sits in the buffers,
> etc.
But the code should be able to wait until a complete record has been written?
That should be rather quick, unless one writes a 2GB array. I am not talking
about flushing the data only when 8kB are filled or when the file is closed.
And doing buffering within a record avoids seeks.
> One could of course make "unbuffered" I/O in gfortran really mean "flush
> the buffer at the end of each I/O statement" rather than not using a buffer at
> all.
We should consider this.
* * *
I have now updated timings with writing to a file.
Results for the example in comment 0, but writing to a file ("test.dat",
tmpfs). Unformatted is much faster with a normal file, but some other
compilers are still significantly faster. And for formatted, all other
compilers are significantly faster.
---- Timing in sec ----------------------------------------------------
Unformatted   Formatted
real / user   real / user   Compiler
-----------   -----------   -----------------------------------------
0.378/0.352   2.815/2.804   GCC 4.8.0 (-Ofast, 20130308, Rev. 196547)
0.307/0.296   1.303/1.288   g95 4.0.3 (g95 0.93!) Aug 17 2010 (-O3)
0.210/0.196   0.555/0.532   Sun Fortran 95 8.3 Linux_i386 2007/05/03
0.208/0.184   0.920/0.888   PathScale 3.2.99
0.176/0.152   2.185/2.168   NAGWare Fortran 5.1
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.127/0.125   1.091/1.080   GCC 4.9 (trunk, -Ofast)
0.120/0.118   0.465/0.459   g95 4.0.3 (g95 0.94!) Dec 17 2012
0.136/0.131   0.527/0.524   PathScale EKOPath 4.9.0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0.335/0.316   2.866/2.860   GCC 4.7.2 20120920 (Cray Inc.)
0.204/0.188   0.659/0.628   Cray Fortran : Version 8.1.6
0.881/0.328   1.281/0.672   Intel 64, Version 13.1.1.163
0.444/0.432   0.884/0.864   pgf90 12.10-0
-----------------------------------------------------------------------
From: jvdelisle at gcc dot gnu.org @ 2013-04-18 1:21 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #6 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> 2013-04-18 01:21:42 UTC ---
I like Janne's idea with the flags. Also, it seems that at the time we open a
file we know it is /dev/null or /dev/nul in some cases by the file name. It
would be very low overhead in a few cases to disable some or all checks and
even disable the writing completely. We would not get all situations, but the
low hanging fruit we could. It could be done by setting a "NULL" bit.
One could consider doing this at compile time in some cases where the frontend
could have more elaborate configuration checks that determine the name of the
null device on the target system and look for its use. (Probably not really
worth it for NULL I/O.)
The other idea to consider is a compiler flag, say -fast-IO or similar that
also disables the extra error checking that is not critical to runtime after a
program has been debugged.
From: jb at gcc dot gnu.org @ 2013-04-19 10:34 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
Janne Blomqvist <jb at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01127.html
--- Comment #7 from Janne Blomqvist <jb at gcc dot gnu.org> 2013-04-19 10:34:20 UTC ---
Patch implementing the "unbuffered really means buffered but flush after every
write" idea: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01127.html
From: burnus at gcc dot gnu.org @ 2013-04-29 9:35 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #8 from Tobias Burnus <burnus at gcc dot gnu.org> 2013-04-29 09:34:58 UTC ---
Author: jb
Date: Mon Apr 29 08:42:00 2013
New Revision: 198390
URL: http://gcc.gnu.org/viewcvs?rev=198390&root=gcc&view=rev
Log:
PR 56981 Improve unbuffered performance on special files.
2013-04-29  Janne Blomqvist  <jb@gcc.gnu.org>

    PR fortran/56981
    * io/transfer.c (next_record_w_unf): First fix head marker, then
    write tail.
    (next_record): Call flush_if_unbuffered.
    * io/unix.c (struct unix_stream): Add field unbuffered.
    (flush_if_unbuffered): New function.
    (fd_to_stream): New argument.
    (open_external): Fix fd_to_stream call.
    (input_stream): Likewise.
    (output_stream): Likewise.
    (error_stream): Likewise.
    * io/unix.h (flush_if_unbuffered): New prototype.

Modified:
trunk/libgfortran/ChangeLog
trunk/libgfortran/io/transfer.c
trunk/libgfortran/io/unix.c
trunk/libgfortran/io/unix.h
From: burnus at gcc dot gnu.org @ 2013-04-29 9:36 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #9 from Tobias Burnus <burnus at gcc dot gnu.org> 2013-04-29 09:36:04 UTC ---
Follow-up idea regarding flushing when the buffer is full, to avoid
unnecessary seeks: http://gcc.gnu.org/ml/fortran/2013-04/msg00258.html
From: dominiq at lps dot ens.fr @ 2013-12-21 20:15 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
Dominique d'Humieres <dominiq at lps dot ens.fr> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-21
     Ever confirmed|0                           |1
From: jvdelisle at gcc dot gnu.org @ 2014-06-08 18:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #11 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
Janne, can you post some new benchmarks for comparison?
From: jvdelisle at gcc dot gnu.org @ 2014-06-08 23:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56981
--- Comment #12 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
$ strace -c ./a.out
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.97    0.078667           0   1000000           write
  0.03    0.000022           2        11           fstat
  0.00    0.000000           0         5           read
  0.00    0.000000           0        17        10 open
  0.00    0.000000           0         7           close
  0.00    0.000000           0         8         6 stat
  0.00    0.000000           0        17           mmap
  0.00    0.000000           0        12           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0        10           rt_sigaction
  0.00    0.000000           0        10         9 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           readlink
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.078689               1000104        25 total
This is looking pretty good now!