From: Florian Weimer
To: Paul Eggert
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH] stdio-common: Add the fgetln function
References: <871qxbxe2i.fsf@oldenburg.str.redhat.com>
 <577d0656-8b38-07d8-7b48-01870d3730c7@cs.ucla.edu>
 <87r13y5lth.fsf@oldenburg.str.redhat.com>
Date: Fri, 24 Jun 2022 13:01:35 +0200
In-Reply-To: (Paul Eggert's message of "Thu, 9 Jun 2022 13:08:58 -0700")
Message-ID: <87fsjujpf4.fsf@oldenburg.str.redhat.com>

* Paul Eggert:

> On 6/9/22 00:37, Florian Weimer wrote:
>> * Paul Eggert:
>>
>>> If the stream is not already oriented, FreeBSD getln sets the stream
>>> to byte-orientation. Should glibc getln do the same?
>> Our getdelim doesn't do that explicitly.
>
> I raised the issue because one motivation for adding fgetln is to be
> compatible with FreeBSD. Although the orientation issue is secondary
> and can be detached from the main issue of adding fgetln, it might be
> helpful to address it while fgetln is being added (assuming it's
> added) rather than later.
>
> Perhaps we'll decide that neither fgetln nor getdelim should change
> orientation, i.e., we're deliberately incompatible with
> FreeBSD. That's OK too.

I will think about it.

>> I'm not sure if it's more efficient. The I/O block granularity would
>> change depending on where lines end.
>
> Can't we arrange for I/O blocking to be respected as the buffer grows?
> fgetln shouldn't need to stop reading the instant it sees a newline;
> it can read with the same blocksize it always does.
>
> My sense is that a one-buffer solution is more efficient than two
> buffers, where data are copied from one into the other. Of course I
> haven't measured this though.

I'm not sure if there is a good allocation scheme for this that is
obviously superior to a separate allocation.  If the end of the line
crosses the buffer boundary for the first time, moving the line to the
start of the buffer does not give us sufficient room for a full block,
so we have to grow the buffer by at least the number of bytes in the
line prefix read so far.  Not sure if we need some exponential resizing
policy there.
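
To make the arithmetic of that step concrete, here is a minimal sketch
of the move-and-grow operation.  It is only an illustration under
stated assumptions: struct line_buffer and make_room_for_block are
hypothetical names, not libio internals, and the sketch grows the
buffer to exactly the prefix length plus one block, without any
exponential policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a stream buffer; this is not glibc's
   struct _IO_FILE.  BASE is assumed to be a non-null malloc'ed block.  */
struct line_buffer
{
  char *base;        /* Start of the allocation.  */
  size_t capacity;   /* Allocated size in bytes.  */
  size_t line_start; /* Offset of the first byte of the partial line.  */
  size_t data_end;   /* Offset one past the last byte read so far.  */
};

/* Move the partial line (the prefix read so far) to the start of the
   buffer, then ensure that a full block of BLOCK_SIZE bytes fits after
   it.  Returns 0 on success and -1 on overflow or allocation failure;
   on failure the prefix has still been moved to the start.  */
int
make_room_for_block (struct line_buffer *buf, size_t block_size)
{
  assert (buf->line_start <= buf->data_end);
  assert (buf->data_end <= buf->capacity);

  size_t prefix_len = buf->data_end - buf->line_start;
  memmove (buf->base, buf->base + buf->line_start, prefix_len);
  buf->line_start = 0;
  buf->data_end = prefix_len;

  /* Room needed so that the next read can still be a whole block.  */
  size_t needed = prefix_len + block_size;
  if (needed < prefix_len)
    return -1;                  /* size_t overflow.  */

  if (needed > buf->capacity)
    {
      char *p = realloc (buf->base, needed);
      if (p == NULL)
        return -1;
      buf->base = p;
      buf->capacity = needed;
    }
  return 0;
}
```

If the buffer started out with a capacity of exactly one block, the
new capacity exceeds the old one by exactly the prefix length, which
is the minimum growth mentioned above.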

Assuming that the line is reasonably long, we will then find a line
terminator in the newly read block, and can return a pointer to the
start of the buffer from fgetln.  But we would have to teach the rest of
libio to avoid the extra buffer space at the end during future read
operations.

We could avoid these changes if we resized the buffer to twice the
original buffer size.  Then we'd still maintain buffer read alignment,
just with a larger buffer.  But that runs counter to the goal of
avoiding extra allocations.

Thanks,
Florian