public inbox for glibc-bugs@sourceware.org
* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: jsm28 at gcc dot gnu.org @ 2012-02-21  1:35 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|libc                        |stdio

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: schwab@linux-m68k.org @ 2012-12-19 10:40 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|drepper.fsp at gmail dot    |unassigned at sourceware
                   |com                         |dot org


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: ondra at iuuk dot mff.cuni.cz @ 2013-05-20 19:49 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

Ondrej Bilka <ondra at iuuk dot mff.cuni.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ondra at iuuk dot
                   |                            |mff.cuni.cz
           Severity|normal                      |enhancement

--- Comment #1 from Ondrej Bilka <ondra at iuuk dot mff.cuni.cz> 2013-05-20 19:49:24 UTC ---
This is an old issue. In the meantime, an "m" mode that uses mmap has been
added to fopen.

If you can show a speedup on 64-bit hosts, we could consider using the m mode
by default.
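
For reference, the m flag mentioned here is a glibc extension to the fopen
mode string: it asks the library to try mmap for a stream opened for reading.
A minimal sketch, assuming glibc; count_bytes_mmap_mode is a hypothetical
helper name, and other C libraries may ignore or reject the flag:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the bytes in `path` through a stream opened
   with the "m" flag.  Returns -1 if the file cannot be opened. */
static long count_bytes_mmap_mode(const char *path)
{
    assert(path != NULL);

    /* "rm": read-only, and ask the library to try mmap instead of read(). */
    FILE *fp = fopen(path, "rm");
    if (fp == NULL)
        return -1;

    long n = 0;
    while (fgetc(fp) != EOF)
        n++;

    fclose(fp);
    return n;
}
```

The flag is only a request: the library is free to fall back to ordinary
read-based I/O, so the caller sees the same stream semantics either way.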


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: bugdal at aerifal dot cx @ 2013-05-25 12:55 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #2 from Rich Felker <bugdal at aerifal dot cx> ---
mmap is not the solution; mmap-by-default is not an option because it will
SIGBUS under circumstances you cannot control. The solution is just decoupling
cache size from st_blksize, either never using st_blksize at all (and avoiding
the expensive fstat syscall at open time if possible) or only using it when
it's less than a reasonable upper bound like 8-64k.

As far as I can tell, the better solution is NEVER to use st_blksize for stdio.
The only time it might make sense is for files opened in O_SYNC mode,
unbuffered block devices, etc. But for normal files the way stdio uses them,
the kernel already does its own caching, the efficiency of which has little or
nothing to do with the size of read/write units. The purpose of stdio buffering
is not to match the underlying storage device's transfer units, just to
overcome the syscall overhead of read/write. For this purpose, a fixed buffer
size somewhere between 1k and 8k seems to be plenty large, from an empirical
standpoint.
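
One reading of the second option above (use st_blksize only up to an upper
bound) can be sketched as a simple clamp. This is an illustration, not glibc's
implementation: capped_bufsize and bufsize_for_fd are hypothetical names, and
the cap value is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: clamp a reported block size to `cap` bytes.
   A non-positive block size falls back to BUFSIZ (also clamped). */
static size_t capped_bufsize(long blksize, size_t cap)
{
    assert(cap > 0);
    size_t want = blksize > 0 ? (size_t)blksize : (size_t)BUFSIZ;
    return want < cap ? want : cap;
}

/* Hypothetical helper: the buffer size a stream on `fd` would get under
   the capped policy.  If fstat fails, treat st_blksize as unknown. */
static size_t bufsize_for_fd(int fd, size_t cap)
{
    struct stat st;
    long blksize = 0;

    if (fstat(fd, &st) == 0)
        blksize = (long)st.st_blksize;
    return capped_bufsize(blksize, cap);
}
```

With a 64 KiB cap, a filesystem reporting a 4 MiB st_blksize would get a
64 KiB buffer, while one reporting 4 KiB would keep its 4 KiB buffer. The
first option (never using st_blksize) would drop the fstat call entirely.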


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: neleai at seznam dot cz @ 2013-05-25 19:20 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

--- Comment #3 from Ondrej Bilka <neleai at seznam dot cz> ---
On Sat, May 25, 2013 at 12:55:48PM +0000, bugdal at aerifal dot cx wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=4099
> 
> Rich Felker <bugdal at aerifal dot cx> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |bugdal at aerifal dot cx
> 
> --- Comment #2 from Rich Felker <bugdal at aerifal dot cx> ---
> mmap is not the solution; mmap-by-default is not an option because it will
> SIGBUS under circumstances you cannot control. The solution is just decoupling
> cache size from st_blksize, either never using st_blksize at all (and avoiding
> the expensive fstat syscall at open time if possible) or only using it when
> it's less than a reasonable upper bound like 8-64k.
>
For reading, my comment was relevant unless you can show how to produce
SIGBUS for fopen(foo,"rm").

Writing is a different issue; a good upper bound looks OK (but I would
prefer to flush once per 100ms or so).


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: bugdal at aerifal dot cx @ 2013-05-25 19:54 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

--- Comment #4 from Rich Felker <bugdal at aerifal dot cx> ---
> For reading, my comment was relevant unless you can show how to produce
> SIGBUS for fopen(foo,"rm").

Piece of cake. truncate(foo, 0);
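
The hazard can be reproduced without stdio at all. The sketch below, assuming
Linux and a writable /tmp, maps one page of a temporary file, truncates the
file to zero length, and then reads through the mapping. It calls mmap
directly rather than going through fopen, and read_mapping_after_truncate is
a hypothetical name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_env;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(sigbus_env, 1);
}

/* Returns 1 if reading the mapping after truncation raised SIGBUS,
   0 if the read succeeded, -1 if the setup failed. */
static int read_mapping_after_truncate(void)
{
    char path[] = "/tmp/bug4099-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    void *map = MAP_FAILED;
    if (ftruncate(fd, page) == 0)
        map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);

    volatile int result = -1;
    if (sigaction(SIGBUS, &sa, &old) == 0) {
        if (sigsetjmp(sigbus_env, 1) == 0) {
            /* Another process could do this to the file at any time. */
            if (ftruncate(fd, 0) == 0) {
                volatile const char *p = map;
                char c = p[0];      /* page is now beyond end of file */
                (void)c;
                result = 0;
            }
        } else {
            result = 1;             /* arrived here via the SIGBUS handler */
        }
        sigaction(SIGBUS, &old, NULL);
    }

    munmap(map, (size_t)page);
    close(fd);
    return result;
}
```

A library that mapped files behind the caller's back would expose every
reader to this signal whenever some other process truncated the file.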


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: neleai at seznam dot cz @ 2013-05-26 11:53 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

--- Comment #5 from Ondrej Bilka <neleai at seznam dot cz> ---
On Sat, May 25, 2013 at 07:54:40PM +0000, bugdal at aerifal dot cx wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=4099
> 
> --- Comment #4 from Rich Felker <bugdal at aerifal dot cx> ---
> > For reading, my comment was relevant unless you can show how to produce
> > SIGBUS for fopen(foo,"rm").
> 
> Piece of cake. truncate(foo, 0);
>
OK, but this possibility should be documented in the man page.


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: green at linuxhacker dot ru @ 2013-05-26 16:31 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

--- Comment #6 from Oleg Drokin <green at linuxhacker dot ru> ---
I mostly agree with Rich that it does not make much sense to do a lot of
caching for reading. The bulk of the benefit is realized once 4K or so is read.
Seeks then don't need to "pre-buffer" as much data either, which is also a plus
(and a lot of people consider seek to be a free operation). This should not
even be used when doing O_SYNC (that's where I disagree).

On the other hand, using the full st_blksize for write caching makes total
sense, and I see little to be gained from removing it. Filesystems that
advertise a bigger st_blksize would definitely get a big benefit out of it,
allowing more optimal write placement or whatever other benefits there might
be from their perspective.


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: bugdal at aerifal dot cx @ 2013-05-26 17:40 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

--- Comment #7 from Rich Felker <bugdal at aerifal dot cx> ---
> On the other hand, using full st_blksize for write caching makes total sense
> and I see little to be gained from removing this, filesystems that advertise
> bigger st_blksize would definitely get a big benefit out of it, allowing more
> optimal write placing or whatever other benefits there might be from its
> perspective.

This only makes sense if write() actually writes something to disk. It doesn't.
It (conceptually) memcpy's from a userspace buffer to the kernel's cache
buffers. So there's no reason to think the optimal write() size has anything to
do with the filesystem's block size.


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: green at linuxhacker dot ru @ 2013-05-27  1:09 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4099

--- Comment #8 from Oleg Drokin <green at linuxhacker dot ru> ---
(In reply to Rich Felker from comment #7)

> This only makes sense if write() actually writes something to disk. It
> doesn't. It (conceptually) memcpy's from a userspace buffer to the kernel's
> cache buffers. So there's no reason to think the optimal write() size has
> anything to do with the filesystem's block size.

Even if there is no immediate write, there may be other considerations.
Network filesystems (like Lustre) might need to do extra locking gymnastics on
every syscall (syscalls are somewhat expensive overall, as we already touched
on). If conflicting accesses to the file force lock revocations, doing writes
in large chunks means that cache writeout happens in larger chunks, which
helps RPC sizes and disk backend load.

"Legacy" filesystems that don't have delayed block allocation might use the
write size as a gauge of how big a block of free space to find on disk, so
that the file is less fragmented (and to reduce block allocator and metadata
update overhead). We certainly did this back in the reiserfs days.


* [Bug stdio/4099] Overly agressive caching by stream i/o functions.
From: carlos at redhat dot com @ 2014-11-28 16:15 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=4099

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #9 from Carlos O'Donell <carlos at redhat dot com> ---
For now I think that using custom streams or setvbuf is the only workaround to
this problem.

The immediate solution that springs to mind is to use tunables to set a
maximum block size that streams will use in the process, and default it to
st_blksize to get the old behaviour by default.

I've added this to the list of tunables:
https://sourceware.org/glibc/wiki/TuningLibraryRuntimeBehavior
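
The setvbuf workaround mentioned above might look like the following sketch.
use_fixed_buffer is a hypothetical helper name and the 4096-byte size in the
usage note is only an example. The buffer is supplied explicitly because,
with a NULL buffer, the library may ignore the requested size.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: switch `fp` to full buffering with a caller-owned
   buffer of `len` bytes.  Call it before any other operation on the stream;
   `buf` must stay valid until the stream is closed.  Returns 0 on success. */
static int use_fixed_buffer(FILE *fp, char *buf, size_t len)
{
    assert(fp != NULL && buf != NULL && len > 0);
    return setvbuf(fp, buf, _IOFBF, len);
}
```

A caller would open the stream, call use_fixed_buffer(fp, buf, 4096)
immediately, and only then start reading or writing, so that the stream never
allocates an st_blksize-sized buffer of its own.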


* [Bug stdio/4099] Overly aggressive caching by stream i/o functions
From: ldv at sourceware dot org @ 2020-11-02  7:56 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=4099

Dmitry V. Levin <ldv at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Overly agressive caching by |Overly aggressive caching
                   |stream i/o functions.       |by stream i/o functions



Thread overview: 12+ messages
     [not found] <bug-4099-131@http.sourceware.org/bugzilla/>
2012-02-21  1:35 ` [Bug stdio/4099] Overly agressive caching by stream i/o functions jsm28 at gcc dot gnu.org
2012-12-19 10:40 ` schwab@linux-m68k.org
2013-05-20 19:49 ` ondra at iuuk dot mff.cuni.cz
2013-05-25 12:55 ` bugdal at aerifal dot cx
2013-05-25 19:20 ` neleai at seznam dot cz
2013-05-25 19:54 ` bugdal at aerifal dot cx
2013-05-26 11:53 ` neleai at seznam dot cz
2013-05-26 16:31 ` green at linuxhacker dot ru
2013-05-26 17:40 ` bugdal at aerifal dot cx
2013-05-27  1:09 ` green at linuxhacker dot ru
2014-11-28 16:15 ` carlos at redhat dot com
2020-11-02  7:56 ` [Bug stdio/4099] Overly aggressive " ldv at sourceware dot org
