public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: "Zack Weinberg" <zack@owlfolio.org>
To: "Job Snijders" <job@fastly.com>, "Andreas Schwab" <schwab@suse.de>
Cc: "Florian Weimer" <fweimer@redhat.com>,
	"GNU libc development" <libc-alpha@sourceware.org>
Subject: Re: [PATCH] resolv: add IPv6 support to inet_net_pton()
Date: Fri, 22 Mar 2024 10:24:35 -0400	[thread overview]
Message-ID: <3f50bd26-9199-4174-9fde-8f890651f48e@app.fastmail.com> (raw)
In-Reply-To: <Zf0GHKlYs8d8PkEY@feather.sobornost.net>

On Fri, Mar 22, 2024, at 12:16 AM, Job Snijders wrote:
> On Tue, Mar 19, 2024 at 10:50:14AM +0100, Andreas Schwab wrote:
>> It's still unnecessary code that makes it harder to understand, which is
>> bad.

This seems to be the crux of the disagreement - Andreas says this code
is longer than necessary and that makes it harder to understand, Job
says the extra code makes it more idiomatic and therefore *easier* to
understand. Yes?

>     "If the correct value is outside the range of representable values,
>      LONG_MAX or LONG_MIN is returned (according to the sign of the value),
>      and errno is set to [ERANGE]."
>      https://pubs.opengroup.org/onlinepubs/7908799/xsh/strtol.html
>
> The above to me means that it is not enough to just check whether errno
> is set to ERANGE, but one also has to check whether LONG_MAX or LONG_MIN
> were returned. It's an 'AND' operation.

Not quite. POSIX specs are written as directives to the implementor.
A better way to read this is: If the correct values is outside the
representable range, strtol does two things: it sets errno to ERANGE,
and it returns the representable value farthest away from zero,
with the appropriate sign.

Note also this sentence

    "Because 0, LONG_MIN and LONG_MAX are returned on error and are
    also valid returns on success, an application wishing to check
    for error situations should set errno to 0, then call strtol(),
    then check errno."

This implicitly makes a guarantee that strtol *will not* set errno
to a nonzero value unless one of the specified error conditions
occurs.  (Most C library functions are permitted to set errno nonzero
even if they succeed.  No C library function is ever allowed to set
errno to zero.)  That means it's safe to look at errno *before*
looking at the return value, as the code under discussion does.

> Then, if errno is NOT set to ERANGE, then we have to additionally
> check whether the value is within contextual bounds.

Here I agree with Andreas.  If there are contextual bounds that are
a strict subset of the representable range (i.e. neither LONG_MAX
nor LONG_MIN is contextually valid) then it *doesn't matter* whether
LONG_MAX or LONG_MIN was actually the input value or whether it was
produced as a result of overflow; it's contextually invalid either
way, so the contextual bounds check subsumes the overflow check.

> So I don't think it is correct to skimp on checks here, to me
> idiomatic checks do not equate 'bad'.

Let's recap - the code under discussion is

+	save_errno = errno;
...
+	__set_errno (0);
+	lbits = strtol(sep, &ep, 10);
+	if (sep[0] == '\0' || *ep != '\0') {
+		__set_errno (ENOENT);
+		return (-1);
+	}
+	if ((errno == ERANGE && (lbits == LONG_MAX || lbits == LONG_MIN))
+	    || (lbits > 128 || lbits < 0)) {
+		__set_errno (EMSGSIZE);
+		return (-1);
+	}
...
+       __set_errno (save_errno);

Tangentially, "sep[0] == '\0'" is not the conventional way to check
for strtol not having advanced ep at all.  It *works*, but only
because "*ep == '\0'" catches all the cases it misses, and I didn't
see that until *after* I started writing this paragraph, which
originally would have claimed that this was an outright bug.  Please
change that to "ep == sep", even if you make no other changes.

Anyway, I would have written this as

    lbits = strtol(sep, &ep, 10);
    if (ep == sep || *ep != '\0') {
        __set_errno(ENOENT);
        return -1;
    }
    if (lbits < 0 || lbits > 128) {
        __set_errno(EMSGSIZE);
        return -1;
    }

and I do think this is an improvement over what you had, largely
because it eliminates the need to save and restore the original value
of errno.  inet_pton *is* allowed to set errno nonzero despite having
succeeded, the save/restore was only necessary because of internally
setting it to *zero*, and if we're not going to look at the errno
return from strtol then we don't need that.

It *does* mean future readers have to think for a moment to confirm
that one of the three cases where strtol might set errno is impossible
(invalid base, 10 is statically valid) and the other two are covered
by the conditions below (no characters consumed -> ep == sep, overflow
-> outside the range [0, 128]).  But in this case all the numbers
involved are small enough that the extra cognitive load is reasonable.
If it was "lbits > 4294967242" then I would probably include a comment
to the effect that 4294967242 is less than the minimum value of
LONG_MAX and therefore the input overflow check is subsumed.

zw

  reply	other threads:[~2024-03-22 14:24 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-22 17:54 Job Snijders
2022-12-22 18:21 ` Florian Weimer
2022-12-22 18:28 ` copying a string with truncation (was: [PATCH] resolv: add IPv6 support to inet_net_pton()) Alejandro Colomar
2022-12-22 20:25   ` Alejandro Colomar
2022-12-23  6:55     ` Sam James
2022-12-23  7:00       ` Sam James
2022-12-23 11:42         ` Alejandro Colomar
2022-12-23 11:45           ` Alejandro Colomar
2022-12-31 15:11           ` Sam James
2023-01-17 10:56 ` [PATCH] resolv: add IPv6 support to inet_net_pton() Job Snijders
2023-04-19 11:31   ` Job Snijders
2024-03-17  1:23 ` Job Snijders
2024-03-17  3:19   ` Job Snijders
2024-03-17 11:18     ` Florian Weimer
2024-03-18  8:59       ` Job Snijders
2024-03-18  9:23         ` Andreas Schwab
2024-03-18 23:01           ` Job Snijders
2024-03-19  8:20             ` Andreas Schwab
2024-03-19  8:29               ` Job Snijders
2024-03-19  9:50                 ` Andreas Schwab
2024-03-22  4:16                   ` Job Snijders
2024-03-22 14:24                     ` Zack Weinberg [this message]
2024-03-25  9:04                       ` Job Snijders
2024-04-14 14:56                         ` Job Snijders
2024-04-15  8:15                           ` Xi Ruoyao
2024-03-25  8:45                     ` Andreas Schwab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3f50bd26-9199-4174-9fde-8f890651f48e@app.fastmail.com \
    --to=zack@owlfolio.org \
    --cc=fweimer@redhat.com \
    --cc=job@fastly.com \
    --cc=libc-alpha@sourceware.org \
    --cc=schwab@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).