From: Florian Weimer <fweimer@redhat.com>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH] stdio-common: Add the fgetln function
Date: Fri, 24 Jun 2022 13:01:35 +0200
Message-ID: <87fsjujpf4.fsf@oldenburg.str.redhat.com>
In-Reply-To: <fabe5f7f-d56b-b9bc-a718-4f509b4fced0@cs.ucla.edu> (Paul Eggert's message of "Thu, 9 Jun 2022 13:08:58 -0700")

* Paul Eggert:

> On 6/9/22 00:37, Florian Weimer wrote:
>> * Paul Eggert:
>> 
>>> If the stream is not already oriented, FreeBSD fgetln sets the stream
>>> to byte-orientation. Should glibc fgetln do the same?
>> Our getdelim doesn't do that explicitly.
>
> I raised the issue because one motivation for adding fgetln is to be
> compatible with FreeBSD. Although the orientation issue is secondary
> and can be detached from the main issue of adding fgetln, it might be 
> helpful to address it while fgetln is being added (assuming it's
> added) rather than later.
>
> Perhaps we'll decide that neither fgetln nor getdelim should change
> orientation, i.e., we're deliberately incompatible with
> FreeBSD. That's OK too.

I will think about it.

>> I'm not sure if it's more efficient.  The I/O block granularity would
>> change depending on where lines end.
>
> Can't we arrange for I/O blocking to be respected as the buffer grows?
> fgetln shouldn't need to stop reading the instant it sees a newline;
> it can read with the same blocksize it always does.
>
> My sense is that a one-buffer solution is more efficient than two
> buffers, where data are copied from one into the other. Of course I 
> haven't measured this though.

I'm not sure if there is a good allocation scheme for this that is
obviously superior to a separate allocation.  If the end of the line
crosses the buffer boundary for the first time, moving the line to the
start of the buffer does not give us sufficient room for a full block,
so we have to grow the buffer by at least the number of bytes in the
line prefix read so far.  Not sure if we need some exponential resizing
policy there.  Assuming that the line is reasonably long, we will then
find a line terminator in the newly read block, and can return a pointer
to the start of the buffer from fgetln.
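
As a rough sketch of the size arithmetic in that paragraph (the helper
name and the 4096-byte block size used with it are hypothetical, and
none of this is libio code): once the partial line has been moved to
the start of the buffer, the remaining room is short of a full block by
exactly the prefix length when the buffer is one block in size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libio.  After a partial line of
   PREFIX_LEN bytes has been moved to the start of a buffer of BUF_SIZE
   bytes, return the buffer size needed so that one more full block of
   BLOCK_SIZE bytes fits behind the prefix.  */
static size_t
size_after_compaction (size_t buf_size, size_t prefix_len, size_t block_size)
{
  assert (prefix_len <= buf_size);
  size_t room = buf_size - prefix_len;
  if (room >= block_size)
    return buf_size;                      /* A full block already fits.  */
  return buf_size + (block_size - room);  /* Grow by the shortfall.  */
}
```

For a 4096-byte buffer holding a 100-byte prefix, this asks for 4196
bytes, i.e. growth by the prefix length.  Whether an exponential policy
should replace this minimal growth is left open here.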

But we would have to teach the rest of libio to avoid the extra buffer
space at the end during future read operations.  We could avoid these
changes if we resized the buffer to twice the original buffer size.
Then we'd still maintain buffer read alignment, just with a larger
buffer.  But that runs counter to the goal of avoiding extra
allocations.
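
To make the trade-off concrete, here is a small sketch under the
assumption that "read alignment" means the buffer size stays a whole
multiple of the block size.  The helper names are hypothetical and this
is not how libio tracks its buffers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not libio code.  */

/* Bytes at the end of the buffer that do not form a whole block.
   Block-sized reads would have to skip this tail.  */
static size_t
tail_slack (size_t buf_size, size_t block_size)
{
  assert (block_size > 0);
  return buf_size % block_size;
}

/* Buffer size after doubling the original buffer.  */
static size_t
doubled_size (size_t orig_size)
{
  assert (orig_size <= (size_t) -1 / 2);  /* Guard against overflow.  */
  return 2 * orig_size;
}
```

Growing a 4096-byte buffer by a 100-byte prefix leaves 100 bytes of
tail slack, while doubling it to 8192 bytes leaves none, at the cost of
a larger allocation.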

Thanks,
Florian