public inbox for gcc-patches@gcc.gnu.org
* [patch, libfortran] Adjust block size for libgfortran for unformatted reads
@ 2019-07-08  4:38 Thomas Koenig
  2019-07-08  8:19 ` Manfred Schwarb
  2019-07-08 14:00 ` Janne Blomqvist
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Koenig @ 2019-07-08  4:38 UTC
  To: fortran, gcc-patches; +Cc: David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 1813 bytes --]

Hello world,

the attached patch sets the I/O block size for unformatted files to
2**17 and makes this, and the block size for formatted files,
adjustable via environment variables.
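
As a rough illustration of the intended behavior (a standalone sketch, not the code in the patch): the helper names `parse_buffer_size` and `effective_buffer_size` are made up for this example, and the parsing rules are an assumption; libgfortran reads the variables through its own `init_integer` routine in runtime/environ.c. The sketch treats a missing, malformed, or non-positive value as "use the default".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Defaults as given in the patch.  */
enum { FORMATTED_DEFAULT = 8192, UNFORMATTED_DEFAULT = 128 * 1024 };

/* Hypothetical helper: parse a positive byte count from TEXT.
   Returns 0 when TEXT is NULL, empty, malformed, or not positive,
   which the caller interprets as "no override".  */
static long
parse_buffer_size (const char *text)
{
  if (text == NULL || *text == '\0')
    return 0;

  char *end;
  errno = 0;
  long value = strtol (text, &end, 10);
  if (errno != 0 || *end != '\0' || value <= 0)
    return 0;
  return value;
}

/* Hypothetical helper mirroring the selection done in buf_init:
   a positive override wins, otherwise the per-kind default is used.
   ENV_TEXT is the raw value of the matching environment variable.  */
static long
effective_buffer_size (bool unformatted, const char *env_text)
{
  long size = parse_buffer_size (env_text);
  if (size <= 0)
    size = unformatted ? UNFORMATTED_DEFAULT : FORMATTED_DEFAULT;
  assert (size > 0);
  return size;
}
```

A caller would pass `getenv ("GFORTRAN_BUFFER_SIZE_UNFORMATTED")` as the text, so running a program with that variable set to 262144 would give a 256 KiB buffer under these assumptions.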

The main reason is that -fconvert=big-endian was quite slow on
some HPC systems. A bigger buffer should eliminate that.  Also,
people who use unformatted files are likely to write large amounts
of data, so this seems like a good fit.  Finally, some benchmarking
showed that 131072 seemed like a good value to use. Thanks to Jerry
for support.

I didn't change the value for formatted files because, frankly, we are
using a lot of CPU for converting numbers there, so any gain would be
negligible (unless somebody comes up with a benchmark which says
otherwise).

As this is a change in behavior / new feature, I don't think that
backporting is indicated, but if somebody feels otherwise, please
speak up.

Regression-tested. OK for trunk?

Regards

	Thomas

2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document.
	(GFORTRAN_BUFFER_SIZE_UNFORMATTED): Likewise.

2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* io/unix.c (BUFFER_SIZE): Delete.
	(BUFFER_SIZE_FORMATTED_DEFAULT): New variable.
	(BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable.
	(unix_stream): Add buffer_size.
	(buf_read): Use s->buffer_size instead of BUFFER_SIZE.
	(buf_write): Likewise.
	(buf_init): Add argument unformatted.  Handle block sizes
	for unformatted vs. formatted, using defaults if provided.
	(fd_to_stream): Add argument unformatted in call to buf_init.
	* libgfortran.h (options_t): Add buffer_size_formatted and
	buffer_size_unformatted.
	* runtime/environ.c (variable_table): Add
	GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED.


[-- Attachment #2: p2.diff --]
[-- Type: text/x-patch, Size: 6715 bytes --]

Index: gcc/fortran/gfortran.texi
===================================================================
--- gcc/fortran/gfortran.texi	(Revision 273183)
+++ gcc/fortran/gfortran.texi	(Arbeitskopie)
@@ -611,6 +611,8 @@ Malformed environment variables are silently ignor
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
+* GFORTRAN_BUFFER_SIZE_FORMATTED:: Buffer size for formatted files.
+* GFORTRAN_BUFFER_SIZE_UNFORMATTED:: Buffer size for unformatted files.
 @end menu
 
 @node TMPDIR
@@ -782,6 +784,20 @@ the backtracing, set the variable to @samp{n}, @sa
 Default is to print a backtrace unless the @option{-fno-backtrace}
 compile option was used.
 
+@node GFORTRAN_BUFFER_SIZE_FORMATTED
+@section @env{GFORTRAN_BUFFER_SIZE_FORMATTED}---Set buffer size for formatted I/O
+
+The @env{GFORTRAN_BUFFER_SIZE_FORMATTED} environment variable
+specifies the buffer size in bytes to be used for formatted I/O.
+The default value is 8192.
+
+@node GFORTRAN_BUFFER_SIZE_UNFORMATTED
+@section @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED}---Set buffer size for unformatted I/O
+
+The @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED} environment variable
+specifies the buffer size in bytes to be used for unformatted I/O.
+The default value is 131072.
+
 @c =====================================================================
 @c PART II: LANGUAGE REFERENCE
 @c =====================================================================
Index: libgfortran/io/unix.c
===================================================================
--- libgfortran/io/unix.c	(Revision 273183)
+++ libgfortran/io/unix.c	(Arbeitskopie)
@@ -193,7 +193,8 @@ fallback_access (const char *path, int mode)
 
 /* Unix and internal stream I/O module */
 
-static const int BUFFER_SIZE = 8192;
+static const int BUFFER_SIZE_FORMATTED_DEFAULT = 8192;
+static const int BUFFER_SIZE_UNFORMATTED_DEFAULT = 128*1024;
 
 typedef struct
 {
@@ -205,6 +206,7 @@ typedef struct
   gfc_offset file_length;	/* Length of the file. */
 
   char *buffer;                 /* Pointer to the buffer.  */
+  ssize_t buffer_size;           /* Length of the buffer.  */
   int fd;                       /* The POSIX file descriptor.  */
 
   int active;			/* Length of valid bytes in the buffer */
@@ -592,9 +594,9 @@ buf_read (unix_stream *s, void *buf, ssize_t nbyte
           && raw_seek (s, new_logical, SEEK_SET) < 0)
         return -1;
       s->buffer_offset = s->physical_offset = new_logical;
-      if (to_read <= BUFFER_SIZE/2)
+      if (to_read <= s->buffer_size/2)
         {
-          did_read = raw_read (s, s->buffer, BUFFER_SIZE);
+          did_read = raw_read (s, s->buffer, s->buffer_size);
 	  if (likely (did_read >= 0))
 	    {
 	      s->physical_offset += did_read;
@@ -632,11 +634,11 @@ buf_write (unix_stream *s, const void *buf, ssize_
     s->buffer_offset = s->logical_offset;
 
   /* Does the data fit into the buffer?  As a special case, if the
-     buffer is empty and the request is bigger than BUFFER_SIZE/2,
+     buffer is empty and the request is bigger than s->buffer_size/2,
      write directly. This avoids the case where the buffer would have
      to be flushed at every write.  */
-  if (!(s->ndirty == 0 && nbyte > BUFFER_SIZE/2)
-      && s->logical_offset + nbyte <= s->buffer_offset + BUFFER_SIZE
+  if (!(s->ndirty == 0 && nbyte > s->buffer_size/2)
+      && s->logical_offset + nbyte <= s->buffer_offset + s->buffer_size
       && s->buffer_offset <= s->logical_offset
       && s->buffer_offset + s->ndirty >= s->logical_offset)
     {
@@ -651,7 +653,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
          the request is bigger than the buffer size, write directly
          bypassing the buffer.  */
       buf_flush (s);
-      if (nbyte <= BUFFER_SIZE/2)
+      if (nbyte <= s->buffer_size/2)
         {
           memcpy (s->buffer, buf, nbyte);
           s->buffer_offset = s->logical_offset;
@@ -688,7 +690,7 @@ buf_write (unix_stream *s, const void *buf, ssize_
 static int
 buf_markeor (unix_stream *s)
 {
-  if (s->unbuffered || s->ndirty >= BUFFER_SIZE / 2)
+  if (s->unbuffered || s->ndirty >= s->buffer_size / 2)
     return buf_flush (s);
   return 0;
 }
@@ -765,11 +767,32 @@ static const struct stream_vtable buf_vtable = {
 };
 
 static int
-buf_init (unix_stream *s)
+buf_init (unix_stream *s, bool unformatted)
 {
   s->st.vptr = &buf_vtable;
 
-  s->buffer = xmalloc (BUFFER_SIZE);
+  /* Try to guess a good value for the buffer size.  For formatted
+     I/O, we use so many CPU cycles converting the data that there is
+     more sense in conserving memory and especially cache.  For
+     unformatted, a bigger block can have a large impact in some
+     environments.  */
+
+  if (unformatted)
+    {
+      if (options.buffer_size_unformatted > 0)
+	s->buffer_size = options.buffer_size_unformatted;
+      else
+	s->buffer_size = BUFFER_SIZE_UNFORMATTED_DEFAULT;
+    }
+  else
+    {
+      if (options.buffer_size_formatted > 0)
+	s->buffer_size = options.buffer_size_formatted;
+      else
+	s->buffer_size = BUFFER_SIZE_FORMATTED_DEFAULT;
+    }
+
+  s->buffer = xmalloc (s->buffer_size);
   return 0;
 }
 
@@ -1120,13 +1143,13 @@ fd_to_stream (int fd, bool unformatted)
 	   (s->fd == STDIN_FILENO 
 	    || s->fd == STDOUT_FILENO 
 	    || s->fd == STDERR_FILENO)))
-    buf_init (s);
+    buf_init (s, unformatted);
   else
     {
       if (unformatted)
 	{
 	  s->unbuffered = true;
-	  buf_init (s);
+	  buf_init (s, unformatted);
 	}
       else
 	raw_init (s);
Index: libgfortran/libgfortran.h
===================================================================
--- libgfortran/libgfortran.h	(Revision 273183)
+++ libgfortran/libgfortran.h	(Arbeitskopie)
@@ -540,6 +540,7 @@ typedef struct
 
   int all_unbuffered, unbuffered_preconnected;
   int fpe, backtrace;
+  int buffer_size_unformatted, buffer_size_formatted;
 }
 options_t;
 
Index: libgfortran/runtime/environ.c
===================================================================
--- libgfortran/runtime/environ.c	(Revision 273183)
+++ libgfortran/runtime/environ.c	(Arbeitskopie)
@@ -198,6 +198,14 @@ static variable variable_table[] = {
   /* Print out a backtrace if possible on runtime error */
   { "GFORTRAN_ERROR_BACKTRACE", -1, &options.backtrace, init_boolean },
 
+  /* Buffer size for unformatted files.  */
+  { "GFORTRAN_BUFFER_SIZE_UNFORMATTED", 0, &options.buffer_size_unformatted,
+    init_integer },
+
+  /* Buffer size for formatted files.  */
+  { "GFORTRAN_BUFFER_SIZE_FORMATTED", 0, &options.buffer_size_formatted,
+    init_integer },
+
   { NULL, 0, NULL, NULL }
 };
 

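To see what the larger default changes in practice, the decision shown in the buf_write hunks can be modeled in isolation. This is a simplified sketch under stated assumptions: `stream_model`, `write_path`, and `classify_write` are hypothetical names, only the fields consulted by the condition in the patch are modeled, the state is taken as it stands when that condition is evaluated, and no I/O is performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Which branch of the buffered write logic a request would take.  */
typedef enum
{
  WRITE_INTO_BUFFER,   /* Data is appended to the existing buffer.  */
  FLUSH_THEN_BUFFER,   /* Buffer is flushed, data starts a new buffer.  */
  FLUSH_THEN_DIRECT    /* Buffer is flushed, data bypasses the buffer.  */
} write_path;

/* Hypothetical stand-in for the unix_stream fields used by the test.  */
typedef struct
{
  long buffer_offset;   /* File offset of the start of the buffer.  */
  long logical_offset;  /* Current position in the file.  */
  long ndirty;          /* Number of dirty bytes in the buffer.  */
  long buffer_size;     /* Length of the buffer.  */
} stream_model;

static write_path
classify_write (const stream_model *s, long nbyte)
{
  assert (s->buffer_size > 0 && nbyte >= 0);

  /* Same shape as the condition in buf_write: an empty buffer and a
     request larger than half the buffer never goes into the buffer.  */
  bool fits = !(s->ndirty == 0 && nbyte > s->buffer_size / 2)
    && s->logical_offset + nbyte <= s->buffer_offset + s->buffer_size
    && s->buffer_offset <= s->logical_offset
    && s->buffer_offset + s->ndirty >= s->logical_offset;

  if (fits)
    return WRITE_INTO_BUFFER;

  /* After the flush, small requests are buffered, large ones are not.  */
  return nbyte <= s->buffer_size / 2 ? FLUSH_THEN_BUFFER : FLUSH_THEN_DIRECT;
}
```

With an empty buffer at offset 0, a 16384-byte write classifies as FLUSH_THEN_DIRECT when buffer_size is 8192 and as WRITE_INTO_BUFFER when it is 131072, which is the kind of request the new unformatted default is meant to absorb.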

Thread overview: 10+ messages
2019-07-08  4:38 [patch, libfortran] Adjust block size for libgfortran for unformatted reads Thomas Koenig
2019-07-08  8:19 ` Manfred Schwarb
2019-07-08 13:26   ` Janne Blomqvist
2019-07-08 16:31     ` Steve Kargl
2019-07-09 19:22       ` Bernhard Reutner-Fischer
2019-07-14 10:09         ` Thomas Koenig
2019-07-14 10:50           ` Thomas Koenig
2019-07-14 18:55           ` Steve Kargl
2019-07-19 21:23             ` Thomas Koenig
2019-07-08 14:00 ` Janne Blomqvist
