public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch, libfortran] Adjust block size for libgfortran for unformatted reads
@ 2019-07-08  4:38 Thomas Koenig
  2019-07-08  8:19 ` Manfred Schwarb
  2019-07-08 14:00 ` Janne Blomqvist
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Koenig @ 2019-07-08  4:38 UTC (permalink / raw)
  To: fortran, gcc-patches; +Cc: David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 1813 bytes --]

Hello world,

the attached patch sets the I/O block size for unformatted files to
2**17 and makes this, and the block size for formatted files,
adjustable via environment variables.

The main reason is that -fconvert=big-endian was quite slow on
some HPC systems. A bigger buffer should eliminate that.  Also,
People who use unformatted files are likely to write large amounts
of data, so this seems like a good fit.  Finally, some benchmarking
showed that 131072 seemed like a good value to use. Thanks to Jerry
for support.

I didn't change the value for formatted files because, frankly, we are
using a lot of CPU for converting numbers there, so any gain
negligible (unless somebody comes up with a benchmark which says
otherwise).

As this is a change in behavior / new feature, I don't think that
backporting is indicated, but if somebody feels otherwise, please
speak up.

Regression-tested. OK for trunk?

Regards

	Thomas

2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
	(GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.

2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* io/unix.c (BUFFER_SIZE): Delete.
	(BUFFER_SIZE_FORMATTED_DEFAULT): New variable.
	(BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable.
	(unix_stream): Add buffer_size.
	(buf_read): Use s->buffer_size instead of BUFFER_SIZE.
	(buf_write): Likewise.
	(buf_init): Add argument unformatted.  Handle block sizes
	for unformatted vs. formatted, using defaults if provided.
	(fd_to_stream): Add argument unformatted in call to buf_init.
	* libgfortran.h (options_t): Add buffer_size_formatted and
	buffer_size_unformatted.
	* runtime/environ.c (variable_table): Add
	GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED.


[-- Attachment #2: p2.diff --]
[-- Type: text/x-patch, Size: 6715 bytes --]

Index: gcc/fortran/gfortran.texi
===================================================================
--- gcc/fortran/gfortran.texi	(Revision 273183)
+++ gcc/fortran/gfortran.texi	(Arbeitskopie)
@@ -611,6 +611,8 @@ Malformed environment variables are silently ignor
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
+* GFORTRAN_BUFFER_SIZE_FORMATTED:: Buffer size for formatted files.
+* GFORTRAN_BUFFER_SIZE_UNFORMATTED:: Buffer size for unformatted files.
 @end menu
 
 @node TMPDIR
@@ -782,6 +784,20 @@ the backtracing, set the variable to @samp{n}, @sa
 Default is to print a backtrace unless the @option{-fno-backtrace}
 compile option was used.
 
+@node GFORTRAN_BUFFER_SIZE_FORMATTED
+@section @env{GFORTRAN_BUFFER_SIZE_FORMATTED}---Set buffer size for formatted I/O
+
+The @env{GFORTRAN_BUFFER_SIZE_FORMATTED} environment variable
+specifies buffer size in bytes to be used for formatted output.
+The default value is 8192.
+
+@node GFORTRAN_BUFFER_SIZE_UNFORMATTED
+@section @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED}---Set buffer size for unformatted I/O
+
+The @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED} environment variable
+specifies buffer size in bytes to be used for unformatted output.
+The default value is 131072.
+
 @c =====================================================================
 @c PART II: LANGUAGE REFERENCE
 @c =====================================================================
Index: libgfortran/io/unix.c
===================================================================
--- libgfortran/io/unix.c	(Revision 273183)
+++ libgfortran/io/unix.c	(Arbeitskopie)
@@ -193,7 +193,8 @@ fallback_access (const char *path, int mode)
 
 /* Unix and internal stream I/O module */
 
-static const int BUFFER_SIZE = 8192;
+static const int BUFFER_SIZE_FORMATTED_DEFAULT = 8192;
+static const int BUFFER_SIZE_UNFORMATTED_DEFAULT = 128*1024;
 
 typedef struct
 {
@@ -205,6 +206,7 @@ typedef struct
   gfc_offset file_length;	/* Length of the file. */
 
   char *buffer;                 /* Pointer to the buffer.  */
+  ssize_t buffer_size;           /* Length of the buffer.  */
   int fd;                       /* The POSIX file descriptor.  */
 
   int active;			/* Length of valid bytes in the buffer */
@@ -592,9 +594,9 @@ buf_read (unix_stream *s, void *buf, ssize_t nbyte
           && raw_seek (s, new_logical, SEEK_SET) < 0)
         return -1;
       s->buffer_offset = s->physical_offset = new_logical;
-      if (to_read <= BUFFER_SIZE/2)
+      if (to_read <= s->buffer_size/2)
         {
-          did_read = raw_read (s, s->buffer, BUFFER_SIZE);
+          did_read = raw_read (s, s->buffer, s->buffer_size);
 	  if (likely (did_read >= 0))
 	    {
 	      s->physical_offset += did_read;
@@ -632,11 +634,11 @@ buf_write (unix_stream *s, const void *buf, ssize_
     s->buffer_offset = s->logical_offset;
 
   /* Does the data fit into the buffer?  As a special case, if the
-     buffer is empty and the request is bigger than BUFFER_SIZE/2,
+     buffer is empty and the request is bigger than s->buffer_size/2,
      write directly. This avoids the case where the buffer would have
      to be flushed at every write.  */
-  if (!(s->ndirty == 0 && nbyte > BUFFER_SIZE/2)
-      && s->logical_offset + nbyte <= s->buffer_offset + BUFFER_SIZE
+  if (!(s->ndirty == 0 && nbyte > s->buffer_size/2)
+      && s->logical_offset + nbyte <= s->buffer_offset + s->buffer_size
       && s->buffer_offset <= s->logical_offset
       && s->buffer_offset + s->ndirty >= s->logical_offset)
     {
@@ -651,7 +653,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
          the request is bigger than the buffer size, write directly
          bypassing the buffer.  */
       buf_flush (s);
-      if (nbyte <= BUFFER_SIZE/2)
+      if (nbyte <= s->buffer_size/2)
         {
           memcpy (s->buffer, buf, nbyte);
           s->buffer_offset = s->logical_offset;
@@ -688,7 +690,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
 static int
 buf_markeor (unix_stream *s)
 {
-  if (s->unbuffered || s->ndirty >= BUFFER_SIZE / 2)
+  if (s->unbuffered || s->ndirty >= s->buffer_size / 2)
     return buf_flush (s);
   return 0;
 }
@@ -765,11 +767,32 @@ static const struct stream_vtable buf_vtable = {
 };
 
 static int
-buf_init (unix_stream *s)
+buf_init (unix_stream *s, bool unformatted)
 {
   s->st.vptr = &buf_vtable;
 
-  s->buffer = xmalloc (BUFFER_SIZE);
+  /* Try to guess a good value for the buffer size.  For formatted
+     I/O, we use so many CPU cycles converting the data that there is
+     more sense in converving memory and especially cache.  For
+     unformatted, a bigger block can have a large impact in some
+     environments.  */
+
+  if (unformatted)
+    {
+      if (options.buffer_size_unformatted > 0)
+	s->buffer_size = options.buffer_size_unformatted;
+      else
+	s->buffer_size = BUFFER_SIZE_UNFORMATTED_DEFAULT;
+    }
+  else
+    {
+      if (options.buffer_size_formatted > 0)
+	s->buffer_size = options.buffer_size_formatted;
+      else
+	s->buffer_size = BUFFER_SIZE_FORMATTED_DEFAULT;
+    }
+
+  s->buffer = xmalloc (s->buffer_size);
   return 0;
 }
 
@@ -1120,13 +1143,13 @@ fd_to_stream (int fd, bool unformatted)
 	   (s->fd == STDIN_FILENO 
 	    || s->fd == STDOUT_FILENO 
 	    || s->fd == STDERR_FILENO)))
-    buf_init (s);
+    buf_init (s, unformatted);
   else
     {
       if (unformatted)
 	{
 	  s->unbuffered = true;
-	  buf_init (s);
+	  buf_init (s, unformatted);
 	}
       else
 	raw_init (s);
Index: libgfortran/libgfortran.h
===================================================================
--- libgfortran/libgfortran.h	(Revision 273183)
+++ libgfortran/libgfortran.h	(Arbeitskopie)
@@ -540,6 +540,7 @@ typedef struct
 
   int all_unbuffered, unbuffered_preconnected;
   int fpe, backtrace;
+  int buffer_size_unformatted, buffer_size_formatted;
 }
 options_t;
 
Index: libgfortran/runtime/environ.c
===================================================================
--- libgfortran/runtime/environ.c	(Revision 273183)
+++ libgfortran/runtime/environ.c	(Arbeitskopie)
@@ -198,6 +198,14 @@ static variable variable_table[] = {
   /* Print out a backtrace if possible on runtime error */
   { "GFORTRAN_ERROR_BACKTRACE", -1, &options.backtrace, init_boolean },
 
+  /* Buffer size for unformatted files.  */
+  { "GFORTRAN_BUFFER_SIZE_UNFORMATTED", 0, &options.buffer_size_unformatted,
+    init_integer },
+
+  /* Buffer size for formatted files.  */
+  { "GFORTRAN_BUFFER_SIZE_FORMATTED", 0, &options.buffer_size_formatted,
+    init_integer },
+
   { NULL, 0, NULL, NULL }
 };
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-08  4:38 [patch, libfortran] Adjust block size for libgfortran for unformatted reads Thomas Koenig
@ 2019-07-08  8:19 ` Manfred Schwarb
  2019-07-08 13:26   ` Janne Blomqvist
  2019-07-08 14:00 ` Janne Blomqvist
  1 sibling, 1 reply; 10+ messages in thread
From: Manfred Schwarb @ 2019-07-08  8:19 UTC (permalink / raw)
  To: Thomas Koenig, fortran, gcc-patches; +Cc: David Edelsohn

Am 07.07.19 um 22:13 schrieb Thomas Koenig:
> Hello world,
>
> the attached patch sets the I/O block size for unformatted files to
> 2**17 and makes this, and the block size for formatted files,
> adjustable via environment variables.
>
> The main reason is that -fconvert=big-endian was quite slow on
> some HPC systems. A bigger buffer should eliminate that.  Also,
> People who use unformatted files are likely to write large amounts
> of data, so this seems like a good fit.  Finally, some benchmarking
> showed that 131072 seemed like a good value to use. Thanks to Jerry
> for support.
>
> I didn't change the value for formatted files because, frankly, we are
> using a lot of CPU for converting numbers there, so any gain
> negligible (unless somebody comes up with a benchmark which says
> otherwise).

formatted write: Did you try writing to an USB stick or similar? I guess
for flash based devices anything below 64k will slow down operation.
Computers like Raspberry Pi and the like often have flash based storage,
and it is not extremely unrealistic to run fortran programs on them.


Cheers.
Manfred

>
> As this is a change in behavior / new feature, I don't think that
> backporting is indicated, but if somebody feels otherwise, please
> speak up.
>
> Regression-tested. OK for trunk?
>
> Regards
>
>     Thomas
>
> 2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>
>
>     PR libfortran/91030
>     * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
>     (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.
>
> 2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>
>
>     PR libfortran/91030
>     * io/unix.c (BUFFER_SIZE): Delete.
>     (BUFFER_SIZE_FORMATTED_DEFAULT): New variable.
>     (BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable.
>     (unix_stream): Add buffer_size.
>     (buf_read): Use s->buffer_size instead of BUFFER_SIZE.
>     (buf_write): Likewise.
>     (buf_init): Add argument unformatted.  Handle block sizes
>     for unformatted vs. formatted, using defaults if provided.
>     (fd_to_stream): Add argument unformatted in call to buf_init.
>     * libgfortran.h (options_t): Add buffer_size_formatted and
>     buffer_size_unformatted.
>     * runtime/environ.c (variable_table): Add
>     GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED.
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-08  8:19 ` Manfred Schwarb
@ 2019-07-08 13:26   ` Janne Blomqvist
  2019-07-08 16:31     ` Steve Kargl
  0 siblings, 1 reply; 10+ messages in thread
From: Janne Blomqvist @ 2019-07-08 13:26 UTC (permalink / raw)
  To: Manfred Schwarb; +Cc: Thomas Koenig, fortran, gcc-patches, David Edelsohn

On Mon, Jul 8, 2019 at 11:18 AM Manfred Schwarb <manfred99@gmx.ch> wrote:
>
> Am 07.07.19 um 22:13 schrieb Thomas Koenig:
> > Hello world,
> >
> > the attached patch sets the I/O block size for unformatted files to
> > 2**17 and makes this, and the block size for formatted files,
> > adjustable via environment variables.
> >
> > The main reason is that -fconvert=big-endian was quite slow on
> > some HPC systems. A bigger buffer should eliminate that.  Also,
> > People who use unformatted files are likely to write large amounts
> > of data, so this seems like a good fit.  Finally, some benchmarking
> > showed that 131072 seemed like a good value to use. Thanks to Jerry
> > for support.
> >
> > I didn't change the value for formatted files because, frankly, we are
> > using a lot of CPU for converting numbers there, so any gain
> > negligible (unless somebody comes up with a benchmark which says
> > otherwise).
>
> formatted write: Did you try writing to an USB stick or similar? I guess
> for flash based devices anything below 64k will slow down operation.
> Computers like Raspberry Pi and the like often have flash based storage,
> and it is not extremely unrealistic to run fortran programs on them.

Good point. If you happen to have a USB stick handy, can you try the
simple C benchmark program at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?

(the kernel will coalesce IO's by itself, so the granularity of IO
syscalls is not necessarily the same as the actual IO to devices.
Network filesystems like NFS/Lustre/GPFS may have less latitude here
due to coherency requirements etc.)


--
Janne Blomqvist

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-08  4:38 [patch, libfortran] Adjust block size for libgfortran for unformatted reads Thomas Koenig
  2019-07-08  8:19 ` Manfred Schwarb
@ 2019-07-08 14:00 ` Janne Blomqvist
  1 sibling, 0 replies; 10+ messages in thread
From: Janne Blomqvist @ 2019-07-08 14:00 UTC (permalink / raw)
  To: Thomas Koenig; +Cc: fortran, gcc-patches, David Edelsohn

On Sun, Jul 7, 2019 at 11:13 PM Thomas Koenig <tkoenig@netcologne.de> wrote:
>
> Hello world,
>
> the attached patch sets the I/O block size for unformatted files to
> 2**17 and makes this, and the block size for formatted files,
> adjustable via environment variables.

Should the default be affected by the page size
(sysconf(_SC_PAGESIZE)) and/or the IO blocksize (we already fstat() a
file when we open it, so we could get st_blksize for free)?

Though one could of course argue those aren't particularly useful;
except for power, everything uses a 4k page size (?), and the IO
blocksize varies from too small (4k, ext4) to too large (1 MB, NFS (or
whatever rsize/wsize is set to), or 4 MB for Lustre).

> 2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>
>
>         PR libfortran/91030
>         * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
>         (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.

One of these should be _UNFORMATTED?

-- 
Janne Blomqvist

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-08 13:26   ` Janne Blomqvist
@ 2019-07-08 16:31     ` Steve Kargl
  2019-07-09 19:22       ` Bernhard Reutner-Fischer
  0 siblings, 1 reply; 10+ messages in thread
From: Steve Kargl @ 2019-07-08 16:31 UTC (permalink / raw)
  To: Janne Blomqvist
  Cc: Manfred Schwarb, Thomas Koenig, fortran, gcc-patches, David Edelsohn

On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote:
> 
> Good point. If you happen to have a USB stick handy, can you try the
> simple C benchmark program at
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?
> 
> (the kernel will coalesce IO's by itself, so the granularity of IO
> syscalls is not necessarily the same as the actual IO to devices.
> Network filesystems like NFS/Lustre/GPFS may have less latitude here
> due to coherency requirements etc.)
> 

Writing to USB is constrained be the speed of the bus.

kernel: ugen6.3: <Kingston DataTraveler 3.0> at usbus6
kernel: umass1 on uhub0
kernel: umass1: <Kingston DataTraveler 3.0, class 0/0, rev 2.10/0.01, addr 3> on usbus6
kernel: da1 at umass-sim1 bus 1 scbus5 target 0 lun 0
kernel: da1: <Kingston DataTraveler 3.0 > Removable Direct Access SPC-4 SCSI device
kernel: da1: Serial Number 0C9D927F0A77F330A89D1210
kernel: da1: 40.000MB/s transfers
kernel: da1: 29510MB (60437492 512 byte sectors)
kernel: da1: quirks=0x2<NO_6_BYTE>

Mounting the thumb drive as a MSDOSFS on FreeBSD.

Test using 100000000 bytes
Block size of file system: 16384
bs =       1024, 8.86 MiB/s
bs =       2048, 7.52 MiB/s
bs =       4096, 7.32 MiB/s
bs =       8192, 7.32 MiB/s
bs =      16384, 7.40 MiB/s
bs =      32768, 7.40 MiB/s
bs =      65536, 5.57 MiB/s
bs =     131072, 7.43 MiB/s
bs =     262144, 5.55 MiB/s
bs =     524288, 7.43 MiB/s
bs =    1048576, 5.56 MiB/s
bs =    2097152, 7.39 MiB/s
bs =    4194304, 7.62 MiB/s
bs =    8388608, 5.58 MiB/s
bs =   16777216, 7.42 MiB/s
bs =   33554432, 5.59 MiB/s
bs =   67108864, 4.77 MiB/s

For the same binary, writing to a UFS2 on a SATAII SSD

Test using 100000000 bytes
Block size of file system: 32768
bs =       1024, 123.09 MiB/s
bs =       2048, 210.30 MiB/s
bs =       4096, 184.75 MiB/s
bs =       8192, 237.28 MiB/s
bs =      16384, 244.42 MiB/s
bs =      32768, 243.20 MiB/s
bs =      65536, 253.77 MiB/s
bs =     131072, 243.32 MiB/s
bs =     262144, 238.81 MiB/s
bs =     524288, 243.87 MiB/s
bs =    1048576, 246.55 MiB/s
bs =    2097152, 242.39 MiB/s
bs =    4194304, 243.68 MiB/s
bs =    8388608, 243.88 MiB/s
bs =   16777216, 242.21 MiB/s
bs =   33554432, 253.19 MiB/s
bs =   67108864, 178.29 MiB/s

-- 
Steve

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-08 16:31     ` Steve Kargl
@ 2019-07-09 19:22       ` Bernhard Reutner-Fischer
  2019-07-14 10:09         ` Thomas Koenig
  0 siblings, 1 reply; 10+ messages in thread
From: Bernhard Reutner-Fischer @ 2019-07-09 19:22 UTC (permalink / raw)
  To: Steve Kargl
  Cc: Janne Blomqvist, Manfred Schwarb, Thomas Koenig, fortran,
	gcc-patches, David Edelsohn, Bernhard Reutner-Fischer

On Mon, 8 Jul 2019 09:05:04 -0700
Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:

> On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote:
> > 
> > Good point. If you happen to have a USB stick handy, can you try the
> > simple C benchmark program at
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?
> > 
> > (the kernel will coalesce IO's by itself, so the granularity of IO
> > syscalls is not necessarily the same as the actual IO to devices.
> > Network filesystems like NFS/Lustre/GPFS may have less latitude here
> > due to coherency requirements etc.)

GFORTRAN_BUFFER_SIZE_FORMATTED sounds a bit odd, maybe
GFORTRAN_(UN)FORMATTED_BUFFER_SIZE would read more natural.

And let me note 2 flaws in the benchmark:

>       left = N;
>       w = p;
>       t1 = walltime();
>       while (left > 0)
>         {
>           if (left >= blocksize)
>             to_write = blocksize;
>           else
>             to_write = left;
> 
>           write (fd, w, blocksize);

1) this should write to_write, not blocksize i assume.
2) you don't catch short writes

>           w += to_write;
>           left -= to_write;
>         }

So, short of using iozone, it should probably be more like (modulo
typos):
      left = N;
      w = p;
      t1 = walltime();
      while (left > 0)
        {
          if (left >= blocksize)
            to_write = blocksize;
          else
            to_write = left;
          while (to_write > 0) {
            errno = 0;
            ssize_t wrote = write (fd, w, to_write);
            if (wrote < 0 && errno != EINTR) /* retry EINTR or bail */
              break;
            w += wrote;
            left -= wrote;
            to_write -= wrote;
          }
        }
thanks,

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-09 19:22       ` Bernhard Reutner-Fischer
@ 2019-07-14 10:09         ` Thomas Koenig
  2019-07-14 10:50           ` Thomas Koenig
  2019-07-14 18:55           ` Steve Kargl
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Koenig @ 2019-07-14 10:09 UTC (permalink / raw)
  To: Bernhard Reutner-Fischer, Steve Kargl
  Cc: Janne Blomqvist, Manfred Schwarb, fortran, gcc-patches, David Edelsohn

OK, so here is a new version.

I think the discussion has shown that enlaring the buffer makes sense,
and that the buffer size for unformatted seems to be too bad.

I've reversed the names of the environment variables according to
Behnard's suggestion.

So, OK for trunk?

Also, what should we do about gcc-9?  I have now come to think
that we should add the environment variables to set the buffer lengths,
but leave the old default (8192).

What do you think?

Regards

	Thomas

2019-07-14  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* gfortran.texi (GFORTRAN_FORMATTED_BUFFER_SIZE): Document
	(GFORTRAN_UNFORMATTED_BUFFER_SIZE): Likewise.

2019-07-14  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* io/unix.c (BUFFER_SIZE): Delete.
	(BUFFER_FORMATTED_SIZE_DEFAULT): New variable.
	(BUFFER_UNFORMATTED_SIZE_DEFAULT): New variable.
	(unix_stream): Add buffer_size.
	(buf_read): Use s->buffer_size instead of BUFFER_SIZE.
	(buf_write): Likewise.
	(buf_init): Add argument unformatted.  Handle block sizes
	for unformatted vs. formatted, using defaults if provided.
	(fd_to_stream): Add argument unformatted in call to buf_init.
	* libgfortran.h (options_t): Add buffer_size_formatted and
	buffer_size_unformatted.
	* runtime/environ.c (variable_table): Add
	GFORTRAN_UNFORMATTED_BUFFER_SIZE and
	GFORTRAN_FORMATTED_BUFFER_SIZE.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-14 10:09         ` Thomas Koenig
@ 2019-07-14 10:50           ` Thomas Koenig
  2019-07-14 18:55           ` Steve Kargl
  1 sibling, 0 replies; 10+ messages in thread
From: Thomas Koenig @ 2019-07-14 10:50 UTC (permalink / raw)
  To: Bernhard Reutner-Fischer, Steve Kargl
  Cc: Janne Blomqvist, Manfred Schwarb, fortran, gcc-patches, David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 46 bytes --]

... of course, better with the actual patch.


[-- Attachment #2: p3.diff --]
[-- Type: text/x-patch, Size: 6715 bytes --]

Index: gcc/fortran/gfortran.texi
===================================================================
--- gcc/fortran/gfortran.texi	(Revision 273183)
+++ gcc/fortran/gfortran.texi	(Arbeitskopie)
@@ -611,6 +611,8 @@ Malformed environment variables are silently ignor
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
+* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files.
+* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files.
 @end menu
 
 @node TMPDIR
@@ -782,6 +784,20 @@ the backtracing, set the variable to @samp{n}, @sa
 Default is to print a backtrace unless the @option{-fno-backtrace}
 compile option was used.
 
+@node GFORTRAN_FORMATTED_BUFFER_SIZE
+@section @env{GFORTRAN_FORMATTED_BUFFER_SIZE}---Set buffer size for formatted I/O
+
+The @env{GFORTRAN_FORMATTED_BUFFER_SIZE} environment variable
+specifies buffer size in bytes to be used for formatted output.
+The default value is 8192.
+
+@node GFORTRAN_UNFORMATTED_BUFFER_SIZE
+@section @env{GFORTRAN_UNFORMATTED_BUFFER_SIZE}---Set buffer size for unformatted I/O
+
+The @env{GFORTRAN_UNFORMATTED_BUFFER_SIZE} environment variable
+specifies buffer size in bytes to be used for unformatted output.
+The default value is 131072.
+
 @c =====================================================================
 @c PART II: LANGUAGE REFERENCE
 @c =====================================================================
Index: libgfortran/io/unix.c
===================================================================
--- libgfortran/io/unix.c	(Revision 273183)
+++ libgfortran/io/unix.c	(Arbeitskopie)
@@ -193,7 +193,8 @@ fallback_access (const char *path, int mode)
 
 /* Unix and internal stream I/O module */
 
-static const int BUFFER_SIZE = 8192;
+static const int FORMATTED_BUFFER_SIZE_DEFAULT = 8192;
+static const int UNFORMATTED_BUFFER_SIZE_DEFAULT = 128*1024;
 
 typedef struct
 {
@@ -205,6 +206,7 @@ typedef struct
   gfc_offset file_length;	/* Length of the file. */
 
   char *buffer;                 /* Pointer to the buffer.  */
+  ssize_t buffer_size;           /* Length of the buffer.  */
   int fd;                       /* The POSIX file descriptor.  */
 
   int active;			/* Length of valid bytes in the buffer */
@@ -592,9 +594,9 @@ buf_read (unix_stream *s, void *buf, ssize_t nbyte
           && raw_seek (s, new_logical, SEEK_SET) < 0)
         return -1;
       s->buffer_offset = s->physical_offset = new_logical;
-      if (to_read <= BUFFER_SIZE/2)
+      if (to_read <= s->buffer_size/2)
         {
-          did_read = raw_read (s, s->buffer, BUFFER_SIZE);
+          did_read = raw_read (s, s->buffer, s->buffer_size);
 	  if (likely (did_read >= 0))
 	    {
 	      s->physical_offset += did_read;
@@ -632,11 +634,11 @@ buf_write (unix_stream *s, const void *buf, ssize_
     s->buffer_offset = s->logical_offset;
 
   /* Does the data fit into the buffer?  As a special case, if the
-     buffer is empty and the request is bigger than BUFFER_SIZE/2,
+     buffer is empty and the request is bigger than s->buffer_size/2,
      write directly. This avoids the case where the buffer would have
      to be flushed at every write.  */
-  if (!(s->ndirty == 0 && nbyte > BUFFER_SIZE/2)
-      && s->logical_offset + nbyte <= s->buffer_offset + BUFFER_SIZE
+  if (!(s->ndirty == 0 && nbyte > s->buffer_size/2)
+      && s->logical_offset + nbyte <= s->buffer_offset + s->buffer_size
       && s->buffer_offset <= s->logical_offset
       && s->buffer_offset + s->ndirty >= s->logical_offset)
     {
@@ -651,7 +653,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
          the request is bigger than the buffer size, write directly
          bypassing the buffer.  */
       buf_flush (s);
-      if (nbyte <= BUFFER_SIZE/2)
+      if (nbyte <= s->buffer_size/2)
         {
           memcpy (s->buffer, buf, nbyte);
           s->buffer_offset = s->logical_offset;
@@ -688,7 +690,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
 static int
 buf_markeor (unix_stream *s)
 {
-  if (s->unbuffered || s->ndirty >= BUFFER_SIZE / 2)
+  if (s->unbuffered || s->ndirty >= s->buffer_size / 2)
     return buf_flush (s);
   return 0;
 }
@@ -765,11 +767,32 @@ static const struct stream_vtable buf_vtable = {
 };
 
 static int
-buf_init (unix_stream *s)
+buf_init (unix_stream *s, bool unformatted)
 {
   s->st.vptr = &buf_vtable;
 
-  s->buffer = xmalloc (BUFFER_SIZE);
+  /* Try to guess a good value for the buffer size.  For formatted
+     I/O, we use so many CPU cycles converting the data that there is
+     more sense in converving memory and especially cache.  For
+     unformatted, a bigger block can have a large impact in some
+     environments.  */
+
+  if (unformatted)
+    {
+      if (options.unformatted_buffer_size > 0)
+	s->buffer_size = options.unformatted_buffer_size;
+      else
+	s->buffer_size = UNFORMATTED_BUFFER_SIZE_DEFAULT;
+    }
+  else
+    {
+      if (options.formatted_buffer_size > 0)
+	s->buffer_size = options.formatted_buffer_size;
+      else
+	s->buffer_size = FORMATTED_BUFFER_SIZE_DEFAULT;
+    }
+
+  s->buffer = xmalloc (s->buffer_size);
   return 0;
 }
 
@@ -1120,13 +1143,13 @@ fd_to_stream (int fd, bool unformatted)
 	   (s->fd == STDIN_FILENO 
 	    || s->fd == STDOUT_FILENO 
 	    || s->fd == STDERR_FILENO)))
-    buf_init (s);
+    buf_init (s, unformatted);
   else
     {
       if (unformatted)
 	{
 	  s->unbuffered = true;
-	  buf_init (s);
+	  buf_init (s, unformatted);
 	}
       else
 	raw_init (s);
Index: libgfortran/libgfortran.h
===================================================================
--- libgfortran/libgfortran.h	(Revision 273183)
+++ libgfortran/libgfortran.h	(Arbeitskopie)
@@ -540,6 +540,7 @@ typedef struct
 
   int all_unbuffered, unbuffered_preconnected;
   int fpe, backtrace;
+  int unformatted_buffer_size, formatted_buffer_size;
 }
 options_t;
 
Index: libgfortran/runtime/environ.c
===================================================================
--- libgfortran/runtime/environ.c	(Revision 273183)
+++ libgfortran/runtime/environ.c	(Arbeitskopie)
@@ -198,6 +198,14 @@ static variable variable_table[] = {
   /* Print out a backtrace if possible on runtime error */
   { "GFORTRAN_ERROR_BACKTRACE", -1, &options.backtrace, init_boolean },
 
+  /* Buffer size for unformatted files.  */
+  { "GFORTRAN_UNFORMATTED_BUFFER_SIZE", 0, &options.unformatted_buffer_size,
+    init_integer },
+
+  /* Buffer size for formatted files.  */
+  { "GFORTRAN_FORMATTED_BUFFER_SIZE", 0, &options.formatted_buffer_size,
+    init_integer },
+
   { NULL, 0, NULL, NULL }
 };
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-14 10:09         ` Thomas Koenig
  2019-07-14 10:50           ` Thomas Koenig
@ 2019-07-14 18:55           ` Steve Kargl
  2019-07-19 21:23             ` Thomas Koenig
  1 sibling, 1 reply; 10+ messages in thread
From: Steve Kargl @ 2019-07-14 18:55 UTC (permalink / raw)
  To: Thomas Koenig
  Cc: Bernhard Reutner-Fischer, Janne Blomqvist, Manfred Schwarb,
	fortran, gcc-patches, David Edelsohn

On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:
> OK, so here is a new version.
> 
> I think the discussion has shown that enlaring the buffer makes sense,
> and that the buffer size for unformatted seems to be too bad.
> 
> I've reversed the names of the environment variables according to
> Behnard's suggestion.
> 
> So, OK for trunk?
> 
> Also, what should we do about gcc-9?  I have now come to think
> that we should add the environment variables to set the buffer lengths,
> but leave the old default (8192).
> 
> What do you think?
> 

If you are inclined to back port a portion of the patch to 9-branch,
then bumping up the old default would seem to be the most important
part.  As dje noted, users seem to have an aversion to reading the
documentation, so finding the environment variables may not happen.

Isn't 8192 an internal implementation detail for libgfortran?  Can
bumping it to larger value in 9-branch cause an issue for a normal
user?

-- 
Steve

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads
  2019-07-14 18:55           ` Steve Kargl
@ 2019-07-19 21:23             ` Thomas Koenig
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Koenig @ 2019-07-19 21:23 UTC (permalink / raw)
  To: sgk
  Cc: Bernhard Reutner-Fischer, Janne Blomqvist, Manfred Schwarb,
	fortran, gcc-patches, David Edelsohn

Hi Steve,

> On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:
>> OK, so here is a new version.
>>
>> I think the discussion has shown that enlaring the buffer makes sense,
>> and that the buffer size for unformatted seems to be too bad.
>>
>> I've reversed the names of the environment variables according to
>> Behnard's suggestion.
>>
>> So, OK for trunk?
>>
>> Also, what should we do about gcc-9?  I have now come to think
>> that we should add the environment variables to set the buffer lengths,
>> but leave the old default (8192).
>>
>> What do you think?
>>
> 
> If you are inclined to back port a portion of the patch to 9-branch,
> then bumping up the old default would seem to be the most important
> part.  As dje noted, users seem to have an aversion to reading the
> documentation, so finding the environment variables may not happen.
> 
> Isn't 8192 an internal implementation detail for libgfortran?  Can
> bumping it to larger value in 9-branch cause an issue for a normal
> user?

Well, it allocates a bigger memory block, that's all.

Upon reconsideration, I think your point about people not reading
the docs is valid :-|

So, I will commit the patch to trunk over the weekend and to 9.2
a few days afterwards, unless somebody objects.

Regards

	Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-07-19 20:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-08  4:38 [patch, libfortran] Adjust block size for libgfortran for unformatted reads Thomas Koenig
2019-07-08  8:19 ` Manfred Schwarb
2019-07-08 13:26   ` Janne Blomqvist
2019-07-08 16:31     ` Steve Kargl
2019-07-09 19:22       ` Bernhard Reutner-Fischer
2019-07-14 10:09         ` Thomas Koenig
2019-07-14 10:50           ` Thomas Koenig
2019-07-14 18:55           ` Steve Kargl
2019-07-19 21:23             ` Thomas Koenig
2019-07-08 14:00 ` Janne Blomqvist

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).