public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Paul Eggert <eggert@cs.ucla.edu>
To: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH] stdio-common: Add the fgetln function
Date: Thu, 9 Jun 2022 13:08:58 -0700	[thread overview]
Message-ID: <fabe5f7f-d56b-b9bc-a718-4f509b4fced0@cs.ucla.edu> (raw)
In-Reply-To: <87r13y5lth.fsf@oldenburg.str.redhat.com>

On 6/9/22 00:37, Florian Weimer wrote:
> * Paul Eggert:
> 
>> If the stream is not already oriented, FreeBSD getln sets the stream
>> to byte-orientation. Should glibc getln do the same?
> 
> Our getdelim doesn't do that explicitly.

I raised the issue because one motivation for adding fgetln is to be 
compatible with FreeBSD. Although the orientation issue is secondary and 
can be detached from the main issue of adding fgetln, it might be 
helpful to address it while fgetln is being added (assuming it's added) 
rather than later.

Perhaps we'll decide that neither fgetln nor getdelim should change 
orientation, i.e., we're deliberately incompatible with FreeBSD. That's 
OK too.


>> On 5/3/22 00:36, Florian Weimer via Libc-alpha wrote:
>>> +  /* Discard the old buffer.  This optimizes for a buffered stream,
>>> +     with multiple lines in each buffer.  */
>>> +  if (fp->_fgetln_buf != NULL)
>>> +    {
>>> +      free (fp->_fgetln_buf);
>>> +      fp->_fgetln_buf = NULL;
>>> +    }
>>
>> Hope you don't mind a bit of bikeshedding here....
>>
>> Why free the fgetln buffer eagerly? Instead, free it only when
>> closing. That would lessen pressure on the memory allocator and would
>> save a few insns in fgetln's usual case.
> 
> The assumption is that very few lines cross buffer boundaries.

Yes, but the above test is run every time fgetln is called. It's faster 
to free only when closing, even if no lines cross buffer boundaries. The 
code must free when closing anyway, so this doesn't slow down closing. 
(But see below, as we may not need that buffer at all.)


>> Come to think of it, how about if we restrict fgetln to streams for
>> which either (1) the user has not called setvbuf with a nonnull
>> buffer, or (2) the input line fits in the user-supplied setvbuf
>> buffer.
> 
> It would require a layering violation as far as libio is concerned: a
> high-level function such as fgetln cannot reallocate the read buffer.
> You mention setvbuf, but there are probably other cases (and of course
> GCC 2.95 C++ classes, but we don't need to worry about compatibility
> with those, I think).

It's OK to violate layering if the performance improvement is worth it, 
which I would guess it would be for apps reading long lines.


> I'm not sure if it's more efficient.  The I/O block granularity would
> change depending on where lines end.

Can't we arrange for I/O blocking to be respected as the buffer grows? 
fgetln shouldn't need to stop reading the instant it sees a newline; it 
can read with the same blocksize it always does.

My sense is that a one-buffer solution is more efficient than two 
buffers, where data are copied from one into the other. Of course I 
haven't measured this though.

  reply	other threads:[~2022-06-09 20:09 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-03  7:36 Florian Weimer
2022-05-03  8:06 ` Andreas Schwab
2022-05-03  8:31   ` Florian Weimer
2022-05-03  8:46     ` Andreas Schwab
2022-05-03  9:01       ` Florian Weimer
2022-05-03  9:10         ` Andreas Schwab
2022-05-03 10:45         ` Cristian Rodríguez
2022-05-04  0:40 ` Paul Eggert
2022-06-09  7:37   ` Florian Weimer
2022-06-09 20:08     ` Paul Eggert [this message]
2022-06-24 11:01       ` Florian Weimer
2022-06-24 20:35         ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fabe5f7f-d56b-b9bc-a718-4f509b4fced0@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).