public inbox for libc-alpha@sourceware.org
* LINE_MAX
@ 2024-05-20 21:49 Alejandro Colomar
  2024-05-20 22:26 ` LINE_MAX Vincent Lefevre
  2024-05-20 22:35 ` LINE_MAX Lennart Jablonka
  0 siblings, 2 replies; 6+ messages in thread
From: Alejandro Colomar @ 2024-05-20 21:49 UTC (permalink / raw)
  To: libc-alpha, Eric Blake; +Cc: linux-man

Hi Eric!

I think I found a bug in POSIX.1-2017 (and probably, previous ones too,
but didn't check).

<https://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html>:
     {LINE_MAX}
           Unless otherwise noted, the maximum length, in bytes, of a
           utility’s input line (either standard input or another
           file), when the utility is described as processing text
           files. The length includes room for the trailing <newline>.
           Minimum Acceptable Value: {_POSIX2_LINE_MAX}

It doesn't say anything about the trailing null byte for the buffer that
holds it, but I assume it doesn't include it, from the context.

However:
<https://pubs.opengroup.org/onlinepubs/009695399/functions/fgets.html>:
The following sections are informative.
EXAMPLES

    Reading Input

    The following example uses fgets() to read each line of input. {LINE_MAX}, which defines the maximum size of the input line, is defined in the <limits.h> header.

    #include <stdio.h>
    ...
    char line[LINE_MAX];
    ...
    while (fgets(line, LINE_MAX, fp) != NULL) {
    ...
    }
    ...


This example seems to contradict my understanding of what limits.h says.

So, either limits.h should be explicit that the trailing null byte is
also included in LINE_MAX, or the example is bogus and should be fixed.
I guess it's the latter, although I wish it was the former, so we can
avoid a +1 in the code.
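
A minimal sketch of how the fgets() example might look with the +1,
assuming the reading above (LINE_MAX counts the trailing <newline> but
not the terminating null byte). The helper name read_line_len() and the
2048 fallback for systems that leave LINE_MAX undefined are illustrative
assumptions, not text from POSIX:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef LINE_MAX
#define LINE_MAX 2048  /* assumption: fall back to the _POSIX2_LINE_MAX value */
#endif

/*
 * Hypothetical helper: read one line from fp and return its length in
 * bytes, including the trailing newline if present. Returns 0 at end of
 * file or on a read error.
 */
static size_t read_line_len(FILE *fp)
{
    /* LINE_MAX bytes of line data plus one byte for the null terminator. */
    char line[LINE_MAX + 1];

    assert(fp != NULL);
    if (fgets(line, (int)sizeof line, fp) == NULL)
        return 0;
    /* strlen() stops at the first null byte, so this assumes text input. */
    return strlen(line);
}
```

With a buffer of only LINE_MAX bytes, fgets() would store at most
LINE_MAX - 1 bytes per call, so a maximum-length line would come back in
two pieces.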

In any case, could you please forward this to the Austin group?

Have a lovely night!
Alex

-- 
<https://www.alejandro-colomar.es/>

* Re: LINE_MAX
  2024-05-20 21:49 LINE_MAX Alejandro Colomar
@ 2024-05-20 22:26 ` Vincent Lefevre
  2024-05-21 10:08   ` LINE_MAX Alejandro Colomar
  2024-05-20 22:35 ` LINE_MAX Lennart Jablonka
  1 sibling, 1 reply; 6+ messages in thread
From: Vincent Lefevre @ 2024-05-20 22:26 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: libc-alpha, Eric Blake, linux-man

On 2024-05-20 23:49:13 +0200, Alejandro Colomar wrote:
> I think I found a bug in POSIX.1-2017 (and probably, previous ones too,
> but didn't check).

I already reported the issue in 2009 about the example:

> However:
> <https://pubs.opengroup.org/onlinepubs/009695399/functions/fgets.html>:
> The following sections are informative.
> EXAMPLES
> 
>     Reading Input
> 
>     The following example uses fgets() to read each line of input. {LINE_MAX}, which defines the maximum size of the input line, is defined in the <limits.h> header.
> 
>     #include <stdio.h>
>     ...
>     char line[LINE_MAX];
>     ...
>     while (fgets(line, LINE_MAX, fp) != NULL) {
>     ...
>     }
>     ...

See thread "fgets/strtok and LINE_MAX" I started on 2009-09-21
in the Austin Group mailing-list. It is available on gmane:

Path: news.gmane.org!not-for-mail
From: Vincent Lefevre <vincent-opgr-opTGSl+ZDNkdnm+yROfE0A@public.gmane.org>
Newsgroups: gmane.comp.standards.posix.austin.general
Subject: fgets/strtok and LINE_MAX
Date: Mon, 21 Sep 2009 01:03:13 +0200
Lines: 31
Approved: news@gmane.org
Message-ID: <20090920230313.GV657@prunille.vinc17.org>
[...]

There's the issue with the missing "+1", but also whether
LINE_MAX < INT_MAX.

See also
  https://www.austingroupbugs.net/view.php?id=182

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: LINE_MAX
  2024-05-20 21:49 LINE_MAX Alejandro Colomar
  2024-05-20 22:26 ` LINE_MAX Vincent Lefevre
@ 2024-05-20 22:35 ` Lennart Jablonka
  2024-05-21 10:14   ` LINE_MAX Alejandro Colomar
  1 sibling, 1 reply; 6+ messages in thread
From: Lennart Jablonka @ 2024-05-20 22:35 UTC (permalink / raw)
  To: Alejandro Colomar, libc-alpha, Eric Blake; +Cc: linux-man

Quoth Alejandro Colomar:
>I think I found a bug in POSIX.1-2017 (and probably, previous ones too,
>but didn't check).
>
><https://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html>:
>     {LINE_MAX}
>           Unless otherwise noted, the maximum length, in bytes, of a
>           utility’s input line (either standard input or another
>           file), when the utility is described as processing text
>           files. The length includes room for the trailing <newline>.
>           Minimum Acceptable Value: {_POSIX2_LINE_MAX}
>
>It doesn't say anything about the trailing null byte for the buffer that
>holds it, but I assume it doesn't include it, from the context.
>
>However:
><https://pubs.opengroup.org/onlinepubs/009695399/functions/fgets.html>:
>The following sections are informative.
>EXAMPLES
>
>    Reading Input
>
>    The following example uses fgets() to read each line of input. {LINE_MAX}, which defines the maximum size of the input line, is defined in the <limits.h> header.
>
>    #include <stdio.h>
>    ...
>    char line[LINE_MAX];
>    ...
>    while (fgets(line, LINE_MAX, fp) != NULL) {
>    ...
>    }
>    ...
>
>
>This example seems to contradict my understanding of what limits.h says.
>
>So, either limits.h should be explicit that the trailing null byte is
>also included in LINE_MAX, or the example is bogus and should be fixed.
>I guess it's the latter, although I wish it was the former, so we can
>avoid a +1 in the code.
>
>In any case, could you please forward this to the Austin group?

Good find.  You aren’t the first one to find it: 
https://austingroupbugs.net/view.php?id=182 discusses that 
example a little.  The desired action written there appears 
verbatim (bar formatting) in the 4.1 draft of POSIX.1-202x.

* Re: LINE_MAX
  2024-05-20 22:26 ` LINE_MAX Vincent Lefevre
@ 2024-05-21 10:08   ` Alejandro Colomar
  2024-05-21 11:40     ` LINE_MAX Vincent Lefevre
  0 siblings, 1 reply; 6+ messages in thread
From: Alejandro Colomar @ 2024-05-21 10:08 UTC (permalink / raw)
  To: Vincent Lefevre, libc-alpha, Eric Blake, linux-man

Hi Vincent,

On Tue, May 21, 2024 at 12:26:58AM GMT, Vincent Lefevre wrote:
> On 2024-05-20 23:49:13 +0200, Alejandro Colomar wrote:
> > I think I found a bug in POSIX.1-2017 (and probably, previous ones too,
> > but didn't check).
> 
> I already reported the issue in 2009 about the example:

Thanks!

> See thread "fgets/strtok and LINE_MAX" I started on 2009-09-21
> in the Austin Group mailing-list. It is available on gmane:
> 
> Path: news.gmane.org!not-for-mail
> From: Vincent Lefevre <vincent-opgr-opTGSl+ZDNkdnm+yROfE0A@public.gmane.org>
> Newsgroups: gmane.comp.standards.posix.austin.general
> Subject: fgets/strtok and LINE_MAX
> Date: Mon, 21 Sep 2009 01:03:13 +0200
> Lines: 31
> Approved: news@gmane.org
> Message-ID: <20090920230313.GV657@prunille.vinc17.org>
> [...]

Hmmm, how does that thing work?  Any http link available?

> 
> There's the issue with the missing "+1", but also whether
> LINE_MAX < INT_MAX.

I guess the LINE_MAX <? INT_MAX issue is not an actual issue as long as
implementations do the Right Thing and don't set it to >= INT_MAX.

> See also
>   https://www.austingroupbugs.net/view.php?id=182

Thanks!

Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

* Re: LINE_MAX
  2024-05-20 22:35 ` LINE_MAX Lennart Jablonka
@ 2024-05-21 10:14   ` Alejandro Colomar
  0 siblings, 0 replies; 6+ messages in thread
From: Alejandro Colomar @ 2024-05-21 10:14 UTC (permalink / raw)
  To: libc-alpha, Eric Blake, linux-man

Hi Lennart,

On Mon, May 20, 2024 at 10:35:14PM GMT, Lennart Jablonka wrote:
> Quoth Alejandro Colomar:
> > I think I found a bug in POSIX.1-2017 (and probably, previous ones too,
> > but didn't check).

[...]

> > This example seems to contradict my understanding of what limits.h says.
> > 
> > So, either limits.h should be explicit that the trailing null byte is
> > also included in LINE_MAX, or the example is bogus and should be fixed.
> > I guess it's the latter, although I wish it was the former, so we can
> > avoid a +1 in the code.
> > 
> > In any case, could you please forward this to the Austin group?
> 
> Good find.  You aren’t the first one to find it:

:-)

> https://austingroupbugs.net/view.php?id=182 discusses that example a little.
> The desired action written there appears verbatim (bar formatting) in the
> 4.1 draft of POSIX.1-202x.

Could you please paste that part of the draft?  It's quite inaccessible
to me.  And since the info is already public in the ticket, I guess it's
not a violation of anything.

Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

* Re: LINE_MAX
  2024-05-21 10:08   ` LINE_MAX Alejandro Colomar
@ 2024-05-21 11:40     ` Vincent Lefevre
  0 siblings, 0 replies; 6+ messages in thread
From: Vincent Lefevre @ 2024-05-21 11:40 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: libc-alpha, Eric Blake, linux-man

On 2024-05-21 12:08:13 +0200, Alejandro Colomar wrote:
> On Tue, May 21, 2024 at 12:26:58AM GMT, Vincent Lefevre wrote:
> > See thread "fgets/strtok and LINE_MAX" I started on 2009-09-21
> > in the Austin Group mailing-list. It is available on gmane:
> > 
> > Path: news.gmane.org!not-for-mail
> > From: Vincent Lefevre <vincent-opgr-opTGSl+ZDNkdnm+yROfE0A@public.gmane.org>
> > Newsgroups: gmane.comp.standards.posix.austin.general
> > Subject: fgets/strtok and LINE_MAX
> > Date: Mon, 21 Sep 2009 01:03:13 +0200
> > Lines: 31
> > Approved: news@gmane.org
> > Message-ID: <20090920230313.GV657@prunille.vinc17.org>
> > [...]
> 
> Hmmm, how does that thing work?

You need an NNTP client, such as "tin", and the server is currently
news.gmane.io (news.gmane.org was the one at that time, but it
changed in January 2020). If you use "tin", you may use something
like

news.gmane.io .newsrc-gmane gmane

in the .tin/newsrctable file, and run "tin -g gmane".

> Any http link available?

For Gmane, it is no longer possible to access it via http.
And I don't know any website that has archives for such old
Austin Group messages.

> > There's the issue with the missing "+1", but also whether
> > LINE_MAX < INT_MAX.
> 
> I guess the LINE_MAX <? INT_MAX issue is not an actual issue as long as
> implementations do the Right Thing and don't set it to >= INT_MAX.

Unfortunately, the int type is typically a 32-bit type, even on
64-bit platforms. This would mean a silly limit for 64-bit platforms.
2^31 is quite large, but for some particular uses (hmm... GNU MPFR
tests, for instance?), one may want to support larger text files.

Note also that some XML files have all the contents on a single line.
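
One way to sidestep the int-sized argument of fgets() is getline() from
POSIX.1-2008, which takes a size_t and grows the buffer as needed. A
minimal sketch under that assumption; the helper name longest_line() is
made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L  /* request the getline() declaration */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the length in bytes of the longest line
 * in fp, counting the trailing newline when there is one.
 */
static size_t longest_line(FILE *fp)
{
    char *line = NULL;  /* getline() allocates and resizes this buffer */
    size_t cap = 0;
    size_t max = 0;
    ssize_t len;

    assert(fp != NULL);
    while ((len = getline(&line, &cap, fp)) != -1) {
        if ((size_t)len > max)
            max = (size_t)len;
    }
    free(line);
    return max;
}
```

The return type is ssize_t, so a single line is still capped at
SSIZE_MAX bytes, but on typical 64-bit platforms that is far above
INT_MAX.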

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
