From: Martin Uecker <uecker@tugraz.at>
To: Alejandro Colomar <alx.manpages@gmail.com>,
JeanHeyd Meneide <wg14@soasis.org>
Cc: Ingo Schwarze <schwarze@usta.de>,
linux-man@vger.kernel.org, gcc@gcc.gnu.org
Subject: Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
Date: Sat, 03 Sep 2022 14:47:00 +0200
Message-ID: <4475b350c2a4d60da540c0f3055f466640e6c409.camel@tugraz.at>
In-Reply-To: <2abccaa2-d472-4c5b-aea6-7a2dddd665da@gmail.com>
...
> >
> > Whether or not you feel like the manpages are the best place to
> > start that, I'll leave up to you!
>
> I'll try to defend the reasons to start this in the man-pages.
>
> This feature is mostly for documentation purposes, not being meaningful
> for code at all (for some meaning of meaningful), since it won't change
> the function definition in any way, nor the calls to it. At least not
> by itself; static analysis may get some benefits, though.
GCC will warn if the bound is specified inconsistently between
declarations and also emit warnings if it can see that a buffer
which is passed is too small:
https://godbolt.org/z/PsjPG1nv7
BTW: If you declare pointers to arrays (not first elements) you
can get run-time bounds checking with UBSan:
https://godbolt.org/z/TvMo89WfP
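
A minimal sketch of the two declaration styles mentioned above. It is not a copy of the linked examples; the function names are made up for illustration. The size is placed before the array so that the bound is valid standard C today. As far as I know, the relevant GCC diagnostics are -Wvla-parameter for inconsistent bounds between declarations and -fsanitize=bounds for the run-time check; the exact flags are an assumption, since the message only names GCC warnings and UBSan.

```c
#include <stddef.h>

/* Bound written on an array parameter: the type is still adjusted to
   'int *', so the bound serves documentation and compiler diagnostics. */
long sum_elems(size_t n, int a[n])
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer to a whole array: here the bound is part of the pointed-to
   type, so sizeof(*a) equals n * sizeof(int) and a sanitizer has a
   bound to check (*a)[i] against. */
long sum_array(size_t n, int (*a)[n])
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (*a)[i];
    return total;
}
```

A caller passes the array itself to the first form, sum_elems(4, v), and its address to the second, sum_array(4, &v).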
>
> Also, new code can be designed from the beginning so that sizes go
> before their corresponding arrays, so that new code won't typically be
> affected by the lack of this feature in the language.
>
> This leaves us with legacy code, especially libc, which just works, and
> doesn't have any urgent needs to change their prototypes in this regard
> (they could, to improve static analysis, but not what we'd call urgent).
It would be a useful step toward finding out-of-bounds problems in
applications using libc.
> And since most people don't go around reading libc headers searching for
> function declarations (especially since there are manual pages that show
> them nicely), it's not like the documentation of the code depends on how
> the function is _actually_ declared in code (that's why I also defended
> documenting restrict even if glibc wouldn't have cared to declare it),
> but it depends basically on what the manual pages say about the
> function. If the manual pages say a function gets 'restrict' params, it
> means it gets 'restrict' params, no matter what the code says, and if it
> doesn't, the function accepts overlapping pointers, at least for most of
> the public (modulo manual page bugs, that is).
>
> So this extension could very well be added by the manual pages, as a
> form of documentation, and then maybe picked up by compilers that have
> enough resources to implement it.
>
>
> Considering that this feature is mostly about documentation (and a bit
> of static analysis too), the documentation should be something appealing
> to the reader.
>
>
> Let's take an example:
>
>
> int getnameinfo(const struct sockaddr *restrict addr,
> socklen_t addrlen,
> char *restrict host, socklen_t hostlen,
> char *restrict serv, socklen_t servlen,
> int flags);
>
> and some transformations:
>
>
> int getnameinfo(const struct sockaddr *restrict addr,
> socklen_t addrlen,
> char host[restrict hostlen], socklen_t hostlen,
> char serv[restrict servlen], socklen_t servlen,
> int flags);
>
>
> int getnameinfo(socklen_t hostlen;
> socklen_t servlen;
> const struct sockaddr *restrict addr,
> socklen_t addrlen,
> char host[restrict hostlen], socklen_t hostlen,
> char serv[restrict servlen], socklen_t servlen,
> int flags);
>
> (I'm not sure if I used correct GNU syntax, since I never used that
> extension myself.)
>
> The first transformation above is non-ambiguous, as concise as possible,
> and its only issue is that it might complicate the implementation a bit
> too much. I don't think forward-using a parameter's size would be too
> much of a parsing problem for human readers.
I personally find the second form not terrible. Being
able to read code left-to-right, top-down is helpful in more
complicated examples.
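
For reference, a small sketch of a function definition using the GNU forward parameter declaration. The function name is hypothetical. Since the syntax is a GNU extension, the sketch assumes GCC and falls back to a plain pointer elsewhere; the guard assumes Clang does not accept the extension.

```c
#include <stddef.h>

#if defined(__GNUC__) && !defined(__clang__)
/* GNU extension: 'size_t n;' before the semicolon forward-declares the
   parameter, so the bound buf[n] can be written before n is declared. */
size_t count_zeros(size_t n; const char buf[n], size_t n)
#else
/* Fallback for compilers without the extension: no bound is expressed. */
size_t count_zeros(const char *buf, size_t n)
#endif
{
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == 0)
            zeros++;
    }
    return zeros;
}
```

Callers are unaffected either way: count_zeros(buf, len) is written the same with or without the forward declaration.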
> The second one is unnecessarily long and verbose, and semicolons are not
> very distinguishable from commas, for human readers, which may be very
> confusing.
>
> int foo(int a; int b[a], int a);
> int foo(int a, int b[a], int o);
>
> Those two are very different to the compiler, and yet very similar to
> the human eye. I don't like it. The fact that it allows for simpler
> compilers isn't enough to overcome the readability issues.
This is true, I would probably use it with a comma and/or
syntax highlighting.
> I think I'd prefer having the forward-using syntax as a non-standard
> extension --or a standard but optional language feature-- to avoid
> forcing small compilers to implement it, rather than having the GNU
> extension standardized in all compilers.
The problems with the second form are:
- it is not 100% backwards compatible (which may be OK, though), as
  the semantics of the following code change:
    int n;
    int foo(int a[n], int n); // refers to different n!
  Code written for new compilers could then be misunderstood
  by old compilers when a variable named 'n' is in scope.
- it would be fundamentally new to C to have
  backwards references, and parsers might need to be changed
  to allow this
- a compiler or tool then also has to deal with ugly
  corner cases such as mutual references:
    int foo(int (*a)[sizeof(*b)], int (*b)[sizeof(*a)]);
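
The compatibility point in the first item can be observed with current compilers. The sketch below uses a pointer-to-array parameter so that the bound is visible through sizeof; the names and the value 3 are made up for illustration.

```c
#include <stddef.h>

/* File-scope object that happens to be called n (hypothetical). */
static int n = 3;

/* Under current C rules the bound [n] is resolved when the first
   parameter is parsed, so it names the file-scope n above, not the
   parameter n that is declared afterwards. */
size_t row_bytes(int (*a)[n], int n)
{
    (void)n;            /* inside the body, n is the parameter */
    return sizeof(*a);  /* size was fixed from the file-scope n on entry */
}
```

Calling row_bytes(&row, 7) with an int row[3] yields 3 * sizeof(int), not 7 * sizeof(int). If forward references were allowed, the same source would mean something different.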
We could consider new syntax such as
int foo(char buf[.n], int n);
Personally, I would prefer the conceptual simplicity of forward
declarations and the fact that these exist already in GCC
over any alternative. I would also not mind new syntax, but
then one has to define the rules more precisely to avoid the
aforementioned problems.
Martin