public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
@ 2016-03-10 13:54 Florian Weimer
  2016-03-11 21:52 ` Roland McGrath
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2016-03-10 13:54 UTC (permalink / raw)
  To: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 65 bytes --]

Separating this change, as requested by Roland.

Thanks,
Florian

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-libio-Always-use-_IO_BUFSIZE-for-stream-buffers-BZ-4.patch --]
[-- Type: text/x-patch; name="0001-libio-Always-use-_IO_BUFSIZE-for-stream-buffers-BZ-4.patch", Size: 1638 bytes --]

2016-03-08  Florian Weimer  <fweimer@redhat.com>

	[BZ #4099]
	* libio/filedoalloc.c (_IO_file_doallocate): Always use _IO_BUFSIZ
	as the buffer size.

diff --git a/libio/filedoalloc.c b/libio/filedoalloc.c
index 4f9d738..74ff79b 100644
--- a/libio/filedoalloc.c
+++ b/libio/filedoalloc.c
@@ -56,8 +56,6 @@
 /* Modified for GNU iostream by Per Bothner 1991, 1992. */
 
 #include "libioP.h"
-#include <device-nrs.h>
-#include <sys/stat.h>
 #include <stdlib.h>
 #include <unistd.h>
 
@@ -72,36 +70,17 @@ local_isatty (int fd)
 }
 
 /* Allocate a file buffer, or switch to unbuffered I/O.  Streams for
-   TTY devices default to line buffered.  */
+ * TTY devices default to line buffered.  */
 int
 _IO_file_doallocate (_IO_FILE *fp)
 {
-  _IO_size_t size;
-  char *p;
-  struct stat64 st;
-
-  size = _IO_BUFSIZ;
-  if (fp->_fileno >= 0 && __builtin_expect (_IO_SYSSTAT (fp, &st), 0) >= 0)
-    {
-      if (S_ISCHR (st.st_mode))
-	{
-	  /* Possibly a tty.  */
-	  if (
-#ifdef DEV_TTY_P
-	      DEV_TTY_P (&st) ||
-#endif
-	      local_isatty (fp->_fileno))
-	    fp->_flags |= _IO_LINE_BUF;
-	}
-#if _IO_HAVE_ST_BLKSIZE
-      if (st.st_blksize > 0)
-	size = st.st_blksize;
-#endif
-    }
-  p = malloc (size);
+  /* Switch to line buffering for TTYs.  */
+  if (fp->_fileno >= 0 && local_isatty (fp->_fileno))
+    fp->_flags |= _IO_LINE_BUF;
+  char *p = malloc (_IO_BUFSIZ);
   if (__glibc_unlikely (p == NULL))
     return EOF;
-  _IO_setb (fp, p, p + size, 1);
+  _IO_setb (fp, p, p + _IO_BUFSIZ, 1);
   return 1;
 }
 libc_hidden_def (_IO_file_doallocate)
-- 
2.4.3


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-03-10 13:54 [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099] Florian Weimer
@ 2016-03-11 21:52 ` Roland McGrath
  2016-03-14 11:08   ` Florian Weimer
  0 siblings, 1 reply; 16+ messages in thread
From: Roland McGrath @ 2016-03-11 21:52 UTC (permalink / raw)
  To: Florian Weimer; +Cc: GNU C Library

Justify with clear rationale.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-03-11 21:52 ` Roland McGrath
@ 2016-03-14 11:08   ` Florian Weimer
  2016-03-14 19:31     ` Carlos O'Donell
  2016-03-18 22:53     ` Roland McGrath
  0 siblings, 2 replies; 16+ messages in thread
From: Florian Weimer @ 2016-03-14 11:08 UTC (permalink / raw)
  To: Roland McGrath; +Cc: GNU C Library

On 03/11/2016 10:52 PM, Roland McGrath wrote:
> Justify with clear rationale.

It fixes bug 4099.  We need an arbitrary limit for that.

The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
more consistent across the system.

The PostgreSQL people did extensive benchmarks to determine their
block/page size, and settled for a 8192 (but they do not use stdio
streams, for obvious reasons).

<stdio.h> documents BUFSIZ as the default buffer size. The new
implementation matches that.

Additional memory consumption is limited because file descriptors are a
scarce resource.

I can do some benchmarking, but I don't expect any compelling results.

Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-03-14 11:08   ` Florian Weimer
@ 2016-03-14 19:31     ` Carlos O'Donell
  2016-03-18 22:53     ` Roland McGrath
  1 sibling, 0 replies; 16+ messages in thread
From: Carlos O'Donell @ 2016-03-14 19:31 UTC (permalink / raw)
  To: Florian Weimer, Roland McGrath; +Cc: GNU C Library

On 03/14/2016 07:08 AM, Florian Weimer wrote:
> On 03/11/2016 10:52 PM, Roland McGrath wrote:
>> Justify with clear rationale.
> 
> It fixes bug 4099.  We need an arbitrary limit for that.
> 
> The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
> more consistent across the system.
> 
> The PostgreSQL people did extensive benchmarks to determine their
> block/page size, and settled for a 8192 (but they do not use stdio
> streams, for obvious reasons).
> 
> <stdio.h> documents BUFSIZ as the default buffer size. The new
> implementation matches that.
> 
> Additional memory consumption is limited because file descriptors are a
> scarce resource.
> 
> I can do some benchmarking, but I don't expect any compelling results.

I don't know that benchmarking is required, Roland just asked for clear
rationale.

However, it would be wonderful if you added a microbenchmark just to make
sure we don't actually cause any unforseen problems. This way people can
run such a benchmark again on their remote filesystems and give us results.

Your answer seems clear enough to me. I agree with it too. The advertised
st_blksize is useful only in the abstract. The runtime has to pick something
which works well with the current implementation as a whole.

The only objection I might see is that this is actually a Linux-specific
tuning that you've done. Nobody knows if this tuning has any impact on
Hurd or not.

I would consider this OK to checkin only if you provide a detailed comment
that talks about the tradeoffs being made here and why _IO_BUFSIZE was
chosen.

In summary:
- Add comment just above setting _IO_BUFSIZE about tradeoff [Required]
- Add microbenchmark to avoid surprises [Optional]

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-03-14 11:08   ` Florian Weimer
  2016-03-14 19:31     ` Carlos O'Donell
@ 2016-03-18 22:53     ` Roland McGrath
  2016-03-31 10:14       ` Florian Weimer
  2016-04-01 18:20       ` [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099] Rich Felker
  1 sibling, 2 replies; 16+ messages in thread
From: Roland McGrath @ 2016-03-18 22:53 UTC (permalink / raw)
  To: Florian Weimer; +Cc: GNU C Library

> On 03/11/2016 10:52 PM, Roland McGrath wrote:
> > Justify with clear rationale.
> 
> It fixes bug 4099.  We need an arbitrary limit for that.

That is justification for imposing an arbitrary maximum on the
automatically-chosen size.  Similar logic on the other side of the coin is
justification for imposing an arbitrary minimum on the automatically-chosen
size.  Neither is justification for always using a single fixed size.

> The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
> more consistent across the system.

That's an internal implementation choice in libstdc++.  There is no reason
to expect it to stay the same, nor special reason to think that just
because libstdc++ chose it that it's ideal.

> The PostgreSQL people did extensive benchmarks to determine their
> block/page size, and settled for a 8192 (but they do not use stdio
> streams, for obvious reasons).

That's lovely.  They can inform the implementors of whatever filesystem(s)
they were using in their benchmarks that st_blksize=8192 is what they
should be reporting.

> <stdio.h> documents BUFSIZ as the default buffer size. The new
> implementation matches that.

It's the default in the sense that it's what setbuf uses.  So it's a
permanent part of the ABI and therefore can't be changed easily regardless
of whether it's a desireable value.  If the comments or other documentation
are unclear as to the true (very tiny) significance of BUFSIZ, they should
be fixed.

> Additional memory consumption is limited because file descriptors are a
> scarce resource.

There is no reason to consider file descriptors scarce.
The per-process limit is fungible.

> I can do some benchmarking, but I don't expect any compelling results.

Whatever the results, they would not IMHO be relevant here.

POSIX specifies that st_blksize is the "preferred I/O block size for this
object".  It's the kernel's responsibility to give userland good advice
through this channel.  If there are common buggy kernels that give bad
advice, that is a reason to apply upper and lower limits to the advice from
the kernel.  But the expectation should be that the kernel gets fixed to
give good advice, and the optimal thing to do with a good kernel is to
follow its advice.  

Since the recommended use of st_blksize in this way is a standard user
feature and not just what stdio's implementation happens to do, there is an
argument to be made that the limiting of the value should be done in the
*stat functions reported st_blksize values rather than in stdio's use of
them.  (I'm ambivalent about this point.)


Thanks,
Roland

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-03-18 22:53     ` Roland McGrath
@ 2016-03-31 10:14       ` Florian Weimer
  2016-05-19 15:02         ` Florian Weimer
  2016-04-01 18:20       ` [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099] Rich Felker
  1 sibling, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2016-03-31 10:14 UTC (permalink / raw)
  To: Roland McGrath; +Cc: GNU C Library

On 03/18/2016 11:52 PM, Roland McGrath wrote:

> Whatever the results, they would not IMHO be relevant here.
> 
> POSIX specifies that st_blksize is the "preferred I/O block size for this
> object".  It's the kernel's responsibility to give userland good advice
> through this channel.  If there are common buggy kernels that give bad
> advice, that is a reason to apply upper and lower limits to the advice from
> the kernel.  But the expectation should be that the kernel gets fixed to
> give good advice, and the optimal thing to do with a good kernel is to
> follow its advice.  
> 
> Since the recommended use of st_blksize in this way is a standard user
> feature and not just what stdio's implementation happens to do, there is an
> argument to be made that the limiting of the value should be done in the
> *stat functions reported st_blksize values rather than in stdio's use of
> them.  (I'm ambivalent about this point.)

That's a good point.  I'll try to get feedback from kernel file system
developers on this matter.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-03-18 22:53     ` Roland McGrath
  2016-03-31 10:14       ` Florian Weimer
@ 2016-04-01 18:20       ` Rich Felker
  2016-04-02  1:15         ` Carlos O'Donell
  1 sibling, 1 reply; 16+ messages in thread
From: Rich Felker @ 2016-04-01 18:20 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Florian Weimer, GNU C Library

On Fri, Mar 18, 2016 at 03:52:58PM -0700, Roland McGrath wrote:
> > I can do some benchmarking, but I don't expect any compelling results.
> 
> Whatever the results, they would not IMHO be relevant here.
> 
> POSIX specifies that st_blksize is the "preferred I/O block size for this
> object".  It's the kernel's responsibility to give userland good advice
> through this channel.  If there are common buggy kernels that give bad
> advice, that is a reason to apply upper and lower limits to the advice from
> the kernel.  But the expectation should be that the kernel gets fixed to
> give good advice, and the optimal thing to do with a good kernel is to
> follow its advice.  
> 
> Since the recommended use of st_blksize in this way is a standard user
> feature and not just what stdio's implementation happens to do, there is an
> argument to be made that the limiting of the value should be done in the
> *stat functions reported st_blksize values rather than in stdio's use of
> them.  (I'm ambivalent about this point.)

Regardless of st_blksize being "the preferred size", it's not suitable
for stdio, at least not for read purposes, because sparse/random
access reads are a valid application usage for stdio. Reading an
unboundedly large "optimal" block size, only to use one byte and throw
the rest away, is unacceptably pessimistic behavior and is the whole
point behind bug 4099.

If you insist on keeping unboundedly large buffers honoring
st_blksize, one option would be to only use the full buffer for
writing, and limit it to 4k or 8k for reading. But I think it's best
to just ignore st_blksize and use a reasonable buffer size all the
time.

Rich

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-04-01 18:20       ` [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099] Rich Felker
@ 2016-04-02  1:15         ` Carlos O'Donell
  2016-04-02  1:59           ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Carlos O'Donell @ 2016-04-02  1:15 UTC (permalink / raw)
  To: Rich Felker, Roland McGrath; +Cc: Florian Weimer, GNU C Library

On 04/01/2016 02:19 PM, Rich Felker wrote:
> On Fri, Mar 18, 2016 at 03:52:58PM -0700, Roland McGrath wrote:
>>> I can do some benchmarking, but I don't expect any compelling results.
>>
>> Whatever the results, they would not IMHO be relevant here.
>>
>> POSIX specifies that st_blksize is the "preferred I/O block size for this
>> object".  It's the kernel's responsibility to give userland good advice
>> through this channel.  If there are common buggy kernels that give bad
>> advice, that is a reason to apply upper and lower limits to the advice from
>> the kernel.  But the expectation should be that the kernel gets fixed to
>> give good advice, and the optimal thing to do with a good kernel is to
>> follow its advice.  
>>
>> Since the recommended use of st_blksize in this way is a standard user
>> feature and not just what stdio's implementation happens to do, there is an
>> argument to be made that the limiting of the value should be done in the
>> *stat functions reported st_blksize values rather than in stdio's use of
>> them.  (I'm ambivalent about this point.)
> 
> Regardless of st_blksize being "the preferred size", it's not suitable
> for stdio, at least not for read purposes, because sparse/random
> access reads are a valid application usage for stdio. Reading an
> unboundedly large "optimal" block size, only to use one byte and throw
> the rest away, is unacceptably pessimistic behavior and is the whole
> point behind bug 4099.
> 
> If you insist on keeping unboundedly large buffers honoring
> st_blksize, one option would be to only use the full buffer for
> writing, and limit it to 4k or 8k for reading. But I think it's best
> to just ignore st_blksize and use a reasonable buffer size all the
> time.

I think Roland has a good point though, if the kernel is going to buffer
for you using filesystem-based knowledge, then why doesn't it just report
an st_blksize that's small, say 8192 bytes, given the implementation?
What purpose does it serve to set st_blksize to 2MB?

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-04-02  1:15         ` Carlos O'Donell
@ 2016-04-02  1:59           ` Rich Felker
  2016-04-02  2:13             ` Roland McGrath
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2016-04-02  1:59 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Roland McGrath, Florian Weimer, GNU C Library

On Fri, Apr 01, 2016 at 09:15:21PM -0400, Carlos O'Donell wrote:
> On 04/01/2016 02:19 PM, Rich Felker wrote:
> > On Fri, Mar 18, 2016 at 03:52:58PM -0700, Roland McGrath wrote:
> >>> I can do some benchmarking, but I don't expect any compelling results.
> >>
> >> Whatever the results, they would not IMHO be relevant here.
> >>
> >> POSIX specifies that st_blksize is the "preferred I/O block size for this
> >> object".  It's the kernel's responsibility to give userland good advice
> >> through this channel.  If there are common buggy kernels that give bad
> >> advice, that is a reason to apply upper and lower limits to the advice from
> >> the kernel.  But the expectation should be that the kernel gets fixed to
> >> give good advice, and the optimal thing to do with a good kernel is to
> >> follow its advice.  
> >>
> >> Since the recommended use of st_blksize in this way is a standard user
> >> feature and not just what stdio's implementation happens to do, there is an
> >> argument to be made that the limiting of the value should be done in the
> >> *stat functions reported st_blksize values rather than in stdio's use of
> >> them.  (I'm ambivalent about this point.)
> > 
> > Regardless of st_blksize being "the preferred size", it's not suitable
> > for stdio, at least not for read purposes, because sparse/random
> > access reads are a valid application usage for stdio. Reading an
> > unboundedly large "optimal" block size, only to use one byte and throw
> > the rest away, is unacceptably pessimistic behavior and is the whole
> > point behind bug 4099.
> > 
> > If you insist on keeping unboundedly large buffers honoring
> > st_blksize, one option would be to only use the full buffer for
> > writing, and limit it to 4k or 8k for reading. But I think it's best
> > to just ignore st_blksize and use a reasonable buffer size all the
> > time.
> 
> I think Roland has a good point though, if the kernel is going to buffer
> for you using filesystem-based knowledge, then why doesn't it just report
> an st_blksize that's small, say 8192 bytes, given the implementation?
> What purpose does it serve to set st_blksize to 2MB?

Oh, I totally agree that the kernel is being stupid here. But it's
also stupid for glibc to honor values that obviously do not make sense
for the usage case at hand (reading when you can't know whether you'll
throw most of the result away).

Rich

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-04-02  1:59           ` Rich Felker
@ 2016-04-02  2:13             ` Roland McGrath
  0 siblings, 0 replies; 16+ messages in thread
From: Roland McGrath @ 2016-04-02  2:13 UTC (permalink / raw)
  To: Rich Felker; +Cc: Carlos O'Donell, Florian Weimer, GNU C Library

> Oh, I totally agree that the kernel is being stupid here. But it's
> also stupid for glibc to honor values that obviously do not make sense
> for the usage case at hand (reading when you can't know whether you'll
> throw most of the result away).

Hence the only actual suggestion I made: apply fixed lower and upper bounds
to the st_blksize value.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
  2016-03-31 10:14       ` Florian Weimer
@ 2016-05-19 15:02         ` Florian Weimer
  2016-06-24 15:35           ` [PING] Avoid excessive buffer size in libio (was: Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]) Florian Weimer
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2016-05-19 15:02 UTC (permalink / raw)
  To: Roland McGrath; +Cc: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 1297 bytes --]

On 03/31/2016 12:14 PM, Florian Weimer wrote:
> On 03/18/2016 11:52 PM, Roland McGrath wrote:
>
>> Whatever the results, they would not IMHO be relevant here.
>>
>> POSIX specifies that st_blksize is the "preferred I/O block size for this
>> object".  It's the kernel's responsibility to give userland good advice
>> through this channel.  If there are common buggy kernels that give bad
>> advice, that is a reason to apply upper and lower limits to the advice from
>> the kernel.  But the expectation should be that the kernel gets fixed to
>> give good advice, and the optimal thing to do with a good kernel is to
>> follow its advice.
>>
>> Since the recommended use of st_blksize in this way is a standard user
>> feature and not just what stdio's implementation happens to do, there is an
>> argument to be made that the limiting of the value should be done in the
>> *stat functions reported st_blksize values rather than in stdio's use of
>> them.  (I'm ambivalent about this point.)
>
> That's a good point.  I'll try to get feedback from kernel file system
> developers on this matter.

I wasn't able to get any feedback.  Based on Rich's point about random 
I/O and Roland's earlier suggestion, I'm just capping the reported 
buffer size to 8192 in the attached patch.

Thanks,
Florian

[-- Attachment #2: bufsize.patch --]
[-- Type: text/x-patch, Size: 1311 bytes --]

libio: Limit buffer size to 8192 bytes [BZ #4099]

This avoids overly large buffers with network file systems which report
very large block sizes.

2016-05-19  Florian Weimer  <fweimer@redhat.com>

	[BZ #4099]
	* libio/filedoalloc.c (_IO_file_doallocate): Limit buffer size to
	_IO_BUFSIZ (8192).

diff --git a/NEWS b/NEWS
index b3fd3cc..dec9757 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,12 @@ Version 2.24
   group: files [SUCCESS=merge] nis
   Implemented by Stephen Gallagher (Red Hat).
 
+* The buffer size for byte-oriented stdio streams is now limited to 8192
+  bytes by default.  Previously, on Linux, the default buffer size on most
+  file systems was 4096 bytes (and thus remains unchanged), except on
+  network file systems, where the buffer size was unpredictable and could be
+  as large as several megabytes.
+
 Security related changes:
 
 * An unnecessary stack copy in _nss_dns_getnetbyname_r was removed.  It
diff --git a/libio/filedoalloc.c b/libio/filedoalloc.c
index 4f9d738..ded0725 100644
--- a/libio/filedoalloc.c
+++ b/libio/filedoalloc.c
@@ -94,7 +94,7 @@ _IO_file_doallocate (_IO_FILE *fp)
 	    fp->_flags |= _IO_LINE_BUF;
 	}
 #if _IO_HAVE_ST_BLKSIZE
-      if (st.st_blksize > 0)
+      if (st.st_blksize > 0 && st.st_blksize < _IO_BUFSIZ)
 	size = st.st_blksize;
 #endif
     }

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PING] Avoid excessive buffer size in libio (was: Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099])
  2016-05-19 15:02         ` Florian Weimer
@ 2016-06-24 15:35           ` Florian Weimer
  2016-11-29 16:50             ` [PING] Avoid excessive buffer size in libio Florian Weimer
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2016-06-24 15:35 UTC (permalink / raw)
  To: GNU C Library

On 05/19/2016 04:57 PM, Florian Weimer wrote:
> On 03/31/2016 12:14 PM, Florian Weimer wrote:
>> On 03/18/2016 11:52 PM, Roland McGrath wrote:
>>
>>> Whatever the results, they would not IMHO be relevant here.
>>>
>>> POSIX specifies that st_blksize is the "preferred I/O block size for
>>> this
>>> object".  It's the kernel's responsibility to give userland good advice
>>> through this channel.  If there are common buggy kernels that give bad
>>> advice, that is a reason to apply upper and lower limits to the
>>> advice from
>>> the kernel.  But the expectation should be that the kernel gets fixed to
>>> give good advice, and the optimal thing to do with a good kernel is to
>>> follow its advice.
>>>
>>> Since the recommended use of st_blksize in this way is a standard user
>>> feature and not just what stdio's implementation happens to do, there
>>> is an
>>> argument to be made that the limiting of the value should be done in the
>>> *stat functions reported st_blksize values rather than in stdio's use of
>>> them.  (I'm ambivalent about this point.)
>>
>> That's a good point.  I'll try to get feedback from kernel file system
>> developers on this matter.
>
> I wasn't able to get any feedback.  Based on Rich's point about random
> I/O and Roland's earlier suggestion, I'm just capping the reported
> buffer size to 8192 in the attached patch.

Ping?

   <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html>

Thanks,
Florian


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PING] Avoid excessive buffer size in libio
  2016-06-24 15:35           ` [PING] Avoid excessive buffer size in libio (was: Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]) Florian Weimer
@ 2016-11-29 16:50             ` Florian Weimer
  2016-11-30 13:15               ` Carlos O'Donell
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2016-11-29 16:50 UTC (permalink / raw)
  To: GNU C Library

On 06/24/2016 05:35 PM, Florian Weimer wrote:
> On 05/19/2016 04:57 PM, Florian Weimer wrote:
>> On 03/31/2016 12:14 PM, Florian Weimer wrote:
>>> On 03/18/2016 11:52 PM, Roland McGrath wrote:
>>>
>>>> Whatever the results, they would not IMHO be relevant here.
>>>>
>>>> POSIX specifies that st_blksize is the "preferred I/O block size for
>>>> this
>>>> object".  It's the kernel's responsibility to give userland good advice
>>>> through this channel.  If there are common buggy kernels that give bad
>>>> advice, that is a reason to apply upper and lower limits to the
>>>> advice from
>>>> the kernel.  But the expectation should be that the kernel gets
>>>> fixed to
>>>> give good advice, and the optimal thing to do with a good kernel is to
>>>> follow its advice.
>>>>
>>>> Since the recommended use of st_blksize in this way is a standard user
>>>> feature and not just what stdio's implementation happens to do, there
>>>> is an
>>>> argument to be made that the limiting of the value should be done in
>>>> the
>>>> *stat functions reported st_blksize values rather than in stdio's
>>>> use of
>>>> them.  (I'm ambivalent about this point.)
>>>
>>> That's a good point.  I'll try to get feedback from kernel file system
>>> developers on this matter.
>>
>> I wasn't able to get any feedback.  Based on Rich's point about random
>> I/O and Roland's earlier suggestion, I'm just capping the reported
>> buffer size to 8192 in the attached patch.
>
> Ping?
>
>   <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html>

Ping?

Thanks,
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PING] Avoid excessive buffer size in libio
  2016-11-29 16:50             ` [PING] Avoid excessive buffer size in libio Florian Weimer
@ 2016-11-30 13:15               ` Carlos O'Donell
  2016-11-30 13:17                 ` Florian Weimer
  0 siblings, 1 reply; 16+ messages in thread
From: Carlos O'Donell @ 2016-11-30 13:15 UTC (permalink / raw)
  To: Florian Weimer, GNU C Library

On 11/29/2016 11:50 AM, Florian Weimer wrote:
> On 06/24/2016 05:35 PM, Florian Weimer wrote:
>> On 05/19/2016 04:57 PM, Florian Weimer wrote:
>>> On 03/31/2016 12:14 PM, Florian Weimer wrote:
>>>> On 03/18/2016 11:52 PM, Roland McGrath wrote:
>>>>
>>>>> Whatever the results, they would not IMHO be relevant here.
>>>>>
>>>>> POSIX specifies that st_blksize is the "preferred I/O block size for
>>>>> this
>>>>> object".  It's the kernel's responsibility to give userland good advice
>>>>> through this channel.  If there are common buggy kernels that give bad
>>>>> advice, that is a reason to apply upper and lower limits to the
>>>>> advice from
>>>>> the kernel.  But the expectation should be that the kernel gets
>>>>> fixed to
>>>>> give good advice, and the optimal thing to do with a good kernel is to
>>>>> follow its advice.
>>>>>
>>>>> Since the recommended use of st_blksize in this way is a standard user
>>>>> feature and not just what stdio's implementation happens to do, there
>>>>> is an
>>>>> argument to be made that the limiting of the value should be done in
>>>>> the
>>>>> *stat functions reported st_blksize values rather than in stdio's
>>>>> use of
>>>>> them.  (I'm ambivalent about this point.)
>>>>
>>>> That's a good point.  I'll try to get feedback from kernel file system
>>>> developers on this matter.
>>>
>>> I wasn't able to get any feedback.  Based on Rich's point about random
>>> I/O and Roland's earlier suggestion, I'm just capping the reported
>>> buffer size to 8192 in the attached patch.
>>
>> Ping?
>>
>>   <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html>
> 
> Ping?

Just to confirm the user can still call stat themselves and adjust the
buffer size up if they want?

If that's the case then this looks good to me and users have a way
to re-enable the old behaviour.

This kind of tuning can be changed at will from release to release
or patched by the distributions if they have problems with the change.

I don't see any real issue here, the kernel is returning "optimal" values
but this conflicts with what the user themselves want to be doing. Thus
providing a fixed cap for the default is fine (as long as the user can undo
the choice).

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PING] Avoid excessive buffer size in libio
  2016-11-30 13:15               ` Carlos O'Donell
@ 2016-11-30 13:17                 ` Florian Weimer
  2016-11-30 21:14                   ` Carlos O'Donell
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2016-11-30 13:17 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU C Library

On 11/30/2016 02:15 PM, Carlos O'Donell wrote:
> On 11/29/2016 11:50 AM, Florian Weimer wrote:
>> On 06/24/2016 05:35 PM, Florian Weimer wrote:
>>> On 05/19/2016 04:57 PM, Florian Weimer wrote:
>>>> On 03/31/2016 12:14 PM, Florian Weimer wrote:
>>>>> On 03/18/2016 11:52 PM, Roland McGrath wrote:
>>>>>
>>>>>> Whatever the results, they would not IMHO be relevant here.
>>>>>>
>>>>>> POSIX specifies that st_blksize is the "preferred I/O block size for
>>>>>> this
>>>>>> object".  It's the kernel's responsibility to give userland good advice
>>>>>> through this channel.  If there are common buggy kernels that give bad
>>>>>> advice, that is a reason to apply upper and lower limits to the
>>>>>> advice from
>>>>>> the kernel.  But the expectation should be that the kernel gets
>>>>>> fixed to
>>>>>> give good advice, and the optimal thing to do with a good kernel is to
>>>>>> follow its advice.
>>>>>>
>>>>>> Since the recommended use of st_blksize in this way is a standard user
>>>>>> feature and not just what stdio's implementation happens to do, there
>>>>>> is an
>>>>>> argument to be made that the limiting of the value should be done in
>>>>>> the
>>>>>> *stat functions reported st_blksize values rather than in stdio's
>>>>>> use of
>>>>>> them.  (I'm ambivalent about this point.)
>>>>>
>>>>> That's a good point.  I'll try to get feedback from kernel file system
>>>>> developers on this matter.
>>>>
>>>> I wasn't able to get any feedback.  Based on Rich's point about random
>>>> I/O and Roland's earlier suggestion, I'm just capping the reported
>>>> buffer size to 8192 in the attached patch.
>>>
>>> Ping?
>>>
>>>   <https://sourceware.org/ml/libc-alpha/2016-05/msg00428.html>
>>
>> Ping?
>
> Just to confirm the user can still call stat themselves and adjust the
> buffer size up if they want?

They can still install their own buffer, yes.

> If that's the case then this looks good to me and users have a way
> to re-enable the old behaviour.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PING] Avoid excessive buffer size in libio
  2016-11-30 13:17                 ` Florian Weimer
@ 2016-11-30 21:14                   ` Carlos O'Donell
  0 siblings, 0 replies; 16+ messages in thread
From: Carlos O'Donell @ 2016-11-30 21:14 UTC (permalink / raw)
  To: Florian Weimer; +Cc: GNU C Library

On 11/30/2016 08:17 AM, Florian Weimer wrote:
> They can still install their own buffer, yes.

The only tradeoff I see is that you're making wasteful
write syscalls while the kernel waits for enough data
to optimally send to the target device.

You really have to know a lot about your device to have
enough information to do this optimally.

xstat()
https://lwn.net/Articles/686106/
~~~
Howells noted that Dave Chinner wanted more I/O 
parameters (e.g. preferred read and write sizes, 
erase block size). There were five to seven 
different numbers that Chinner wanted, but those 
could always be added later, he said. 
~~~

And the sysfs information is useless for my SATA disks...

[carlos@athas ~]$ cat /sys/block/sda/queue/optimal_io_size 
0

So we aren't yet at any kind of utopia where there is
enough information to make the right choice. Even then
the "right" choice depends on what you're trying to do
e.g. small write, large write, throughput, low-latency etc.

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-11-30 21:14 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-10 13:54 [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099] Florian Weimer
2016-03-11 21:52 ` Roland McGrath
2016-03-14 11:08   ` Florian Weimer
2016-03-14 19:31     ` Carlos O'Donell
2016-03-18 22:53     ` Roland McGrath
2016-03-31 10:14       ` Florian Weimer
2016-05-19 15:02         ` Florian Weimer
2016-06-24 15:35           ` [PING] Avoid excessive buffer size in libio (was: Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]) Florian Weimer
2016-11-29 16:50             ` [PING] Avoid excessive buffer size in libio Florian Weimer
2016-11-30 13:15               ` Carlos O'Donell
2016-11-30 13:17                 ` Florian Weimer
2016-11-30 21:14                   ` Carlos O'Donell
2016-04-01 18:20       ` [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099] Rich Felker
2016-04-02  1:15         ` Carlos O'Donell
2016-04-02  1:59           ` Rich Felker
2016-04-02  2:13             ` Roland McGrath

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).