public inbox for libc-help@sourceware.org
From: Amol Surati <suratiamol@gmail.com>
To: Alejandro Colomar <alx@kernel.org>
Cc: libc-help@sourceware.org, gcc-help@gcc.gnu.org,
	 Guillem Jover <guillem@hadrons.org>,
	libbsd@lists.freedesktop.org
Subject: Re: restrictness of strtoi(3bsd) and strtol(3)
Date: Sun, 3 Dec 2023 16:29:07 +0530
Message-ID: <CA+nuEB-T9Qi8eRwVovsZau33J5o+sAQ7X-MD9wy8Up1C_-3qkA@mail.gmail.com>
In-Reply-To: <ZWskPqcvoqXq6dEN@debian>

On Sat, 2 Dec 2023 at 18:05, Alejandro Colomar via Gcc-help
<gcc-help@gcc.gnu.org> wrote:
>
> On Sat, Dec 02, 2023 at 01:29:01PM +0100, Alejandro Colomar wrote:
> > On Sat, Dec 02, 2023 at 12:50:28PM +0100, Alejandro Colomar wrote:
> > > Hi,
> > >
> > > I've been implementing my own copy of strto[iu](3bsd), to avoid the
> > > complexity of calling strtol(3) et al.  In the process, I've noticed
> > > that all of these functions use restrict for their parameters.
> > >
> > > Why do these functions use restrict?  While the second parameter is not
> > > used for accessing nptr memory (**endptr is not accessed), it can point
> > > to the same memory.  Here is an example of how these functions can have
> > > pointers to the same memory in the two arguments.
> > >
> > >     l = strtol(p, &p, 0);
> > >
> > > The use of restrict in the prototype of the function could result in
> > > compiler warnings, no?  Currently, I don't see any warnings, but I
> > > suspect the compiler could complain, since the same memory is available
> > > to the function via two different arguments (albeit with a different
> > > number of references).
> > >
> > > The use of restrict in the definition of the function doesn't help the
> > > optimizer, since it already knows that the second parameter is out-only,
> > > so even if it weren't restrict, the only way to access memory is via the
> > > first parameter.
> >
> > In the case of strto[iu](3bsd), I have even more doubts.
> >
> > Here's libbsd's version of it (omitting unimportant parts):
> >
> >       $ grepc -tfd strtoi .
> >       ./src/strtoi.c:intmax_t
> >       strtoi(const char *__restrict nptr,
> >              char **__restrict endptr, int base,
> >              intmax_t lo, intmax_t hi, int *rstatus)
> >       {
> >               ...
> >
> >               im = strtoimax(nptr, endptr, base);
> >
> >               *rstatus = errno;
> >               errno = serrno;
> >
> >               if (*rstatus == 0) {
> >                       /* No digits were found */
> >                       if (nptr == *endptr)
> >                               *rstatus = ECANCELED;
> >                       /* There are further characters after number */
> >                       else if (**endptr != '\0')
> >                               *rstatus = ENOTSUP;
> >               }
> >
> >               ...
> >
> >               return im;
> >       }
> >
> > Let's say the base is unsupported (e.g., -42), and endptr initially
> > points to nptr-1.  Imagine this call:
> >
> >       i = strtoimax(p + 1, &p, -42);
> >
> > ISO C doesn't specify what happens if the base is not between 0 and 36,
> > so the behavior is probably undefined in ISO C.
> >
> > POSIX says it returns 0 and sets errno to EINVAL, but doesn't say what
> > happens to endptr.  I expect two possible implementations:
> >
> > -  Leave endptr untouched.
> > -  Set *endptr = nptr.
> >
> > Let's suppose it leaves endptr untouched (otherwise, it would be
> > impossible to portably differentiate an EINVAL due to unsupported base
> > from an EINVAL due to no digits in the string).
> >
> > So, the test (nptr == *endptr) would be false (because p+1 != p), and
> > the code would jump into accessing **endptr without having derived
> > that pointer from nptr, which is a violation of restrict.
>
> Oops, it's within an (errno == 0) path, so *endptr is guaranteed to be
> derived from nptr here.
>
> So no bug, but still unclear to me what's the benefit of using restrict,

The section "7. Library" at [1] has some information about the 'restrict'
keyword.

I think the restrict qualifiers require the caller to keep the string
(or at least the portion of it that strtol actually accesses) and the
pointer object that endptr points to in non-overlapping memory regions.
Calling strtol(p, &p, 0) should be well-defined as long as that holds.
-------------------
[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n881.pdf
-Amol

> and also unclear why GCC doesn't warn about it at call site.
>
> > I made many assumptions here, where the standards are not clear, so I
> > may be wrong in some of them.  But it looks to me like a bug.
> >
> > CCing libbsd.
> >
> > Cheers,
> > Alex
> >
> > --
> > <https://www.alejandro-colomar.es/>
>
>
>
> --
> <https://www.alejandro-colomar.es/>

Thread overview: 8+ messages
2023-12-02 11:50 Alejandro Colomar
2023-12-02 12:29 ` Alejandro Colomar
2023-12-02 12:34   ` Alejandro Colomar
2023-12-03 10:59     ` Amol Surati [this message]
2023-12-03 11:35       ` Alejandro Colomar
2023-12-03 15:38         ` Amol Surati
2023-12-03 16:33           ` Alejandro Colomar
2023-12-03 16:46             ` Alejandro Colomar