From: Alex Colomar <alx.manpages@gmail.com>
To: Martin Uecker <uecker@tugraz.at>, Joseph Myers <joseph@codesourcery.com>
Cc: Ingo Schwarze <schwarze@usta.de>,
JeanHeyd Meneide <wg14@soasis.org>,
linux-man@vger.kernel.org, gcc@gcc.gnu.org
Subject: Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
Date: Tue, 29 Nov 2022 00:18:56 +0100
Message-ID: <494309ce-c8ec-5219-f83e-b8dda5b9bcd1@gmail.com>
In-Reply-To: <b78e43af88ccd2443363e88e8e2be3d1a4d75312.camel@tugraz.at>
Hi Martin,
On 11/13/22 15:58, Martin Uecker wrote:
> Am Sonntag, den 13.11.2022, 15:02 +0100 schrieb Alejandro Colomar:
>>
>> On 11/13/22 14:33, Alejandro Colomar wrote:
>>> Hi Martin,
>>>
>>> On 11/13/22 14:19, Alejandro Colomar wrote:
>>>>> But there are not only syntactical problems, because
>>>>> also the type of the parameter might become relevant
>>>>> and then you can get circular dependencies:
>>>>>
>>>>> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>>>
>>>> This seems to be a difficult stone in the road.
>
> But note that GNU forward declarations solve this nicely.
Okay, so GNU forward declarations basically work by duplicating (some of)
the declarations.
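For reference, here is a minimal sketch of what that duplication looks like today.  It relies on the GNU C parameter forward declaration extension, so it assumes GCC (other compilers may reject it), and the function name count_bytes() is made up for illustration only:

```c
#include <assert.h>
#include <stddef.h>

/*
 * GNU C extension (GCC): the "size_t n;" before the real parameter list
 * forward-declares n, so that buf's bound can name it even though the
 * real declaration of n comes later.  count_bytes() is hypothetical.
 */
size_t
count_bytes(size_t n; const char buf[n], size_t n)
{
	size_t nonzero = 0;

	assert(buf != NULL);

	/* Count the non-null bytes among the first n elements. */
	for (size_t i = 0; i < n; i++)
		nonzero += (buf[i] != '\0');

	return nonzero;
}
```

The declaration of n appears twice, once before the ';' and once in its real position, which is the duplication I mean.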
How about the compiler parsing the parameter list twice?  The first pass
would collect the declarations and their types, but would not resolve any
sizeof(), _Lengthof(), or typeof() whose operand contains .identifier (or
an expression containing it); in those cases, it would leave the type
incomplete, to be completed in the second pass.  It would be as if the
programmer had written the forward declarations, except that the compiler
derives them automatically.
I guess asking the compiler to do two passes over the parameter list isn't
as bad as asking it to do unbounded lookahead.  In this case it's bounded:
look ahead to the end of the parameter list, get as much information as
possible, and then go through the list again to complete the types.
Anything still incomplete after two passes is invalid.
So, for
    void foo(char (*a)[sizeof(*.b)], char (*b)[sizeof(*.a)]);
in the first pass, the compiler would read:
    char (*a)[sizeof(*.b)];  // sizeof .identifier; incomplete type; continue parsing
    char (*b)[sizeof(*.a)];  // sizeof .identifier; incomplete type; continue parsing
At the end of the first pass, the compiler only knows:
    char (*a)[];
    char (*b)[];
In the second pass, when evaluating sizeof(), the types of the operands
are still incomplete, so it can't be evaluated, and therefore there's an
error at the first sizeof(*.b): *.b has incomplete type.
---
Let's look at a different case:
    void foo(char (*a)[sizeof(*.b)], char (*b)[10]);
After the first pass, the compiler would know:
    char (*a)[];
    char (*b)[10];
In the second pass, sizeof(*.b) would be evaluated unambiguously as
sizeof(char[10]), and the parameter list would then be fine.
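To make the two cases above concrete, here is a toy model of the idea in plain C.  It is only a sketch, not compiler code: struct param, find_param(), and resolve_two_pass() are hypothetical names, each parameter is reduced to a pointer to an array of char whose bound is either a constant or sizeof(*.other), and I'm assuming that the second pass walks the list left to right and may only use what is already complete at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* How a parameter's array bound is written (hypothetical model). */
enum bound_kind { BOUND_CONST, BOUND_SIZEOF_PARAM };

struct param {
	const char      *name;
	enum bound_kind  kind;
	size_t           value;     /* BOUND_CONST: the written bound */
	const char      *ref;       /* BOUND_SIZEOF_PARAM: name used in sizeof(*.ref) */
	int              complete;  /* nonzero once the bound is known */
	size_t           bound;     /* resolved bound, valid if complete */
};

/* Look a parameter up by name; NULL if there is no such parameter. */
const struct param *
find_param(const struct param *p, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(p[i].name, name) == 0)
			return &p[i];
	}
	return NULL;
}

/* Return 0 if every bound could be completed, -1 otherwise. */
int
resolve_two_pass(struct param *p, size_t n)
{
	assert(p != NULL);

	/* Pass 1: record all declarations; only constant bounds complete. */
	for (size_t i = 0; i < n; i++) {
		p[i].complete = (p[i].kind == BOUND_CONST);
		p[i].bound = p[i].complete ? p[i].value : 0;
	}

	/* Pass 2: evaluate sizeof(*.ref) against what is already complete. */
	for (size_t i = 0; i < n; i++) {
		const struct param *r;

		if (p[i].kind != BOUND_SIZEOF_PARAM)
			continue;

		r = find_param(p, n, p[i].ref);
		if (r == NULL || !r->complete)
			return -1;  /* *.ref has incomplete type */

		p[i].bound = r->bound;  /* sizeof(char[N]) is N */
		p[i].complete = 1;
	}
	return 0;
}
```

Feeding it the circular list (a depends on b, b depends on a) returns -1 at a, and feeding it the second list (a depends on b, b is [10]) completes a with a bound of 10.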
Does this 2-pass parsing make sense to you? Did I miss any details?
>
>>>>
>>>>> I am not sure what would be the best way to fix it. One
>>>>> could specify that parameters referred to by
>>>>> the .identifier syntax must be of some integer type and
>>>>> that the sub-expression .identifier is always
>>>>> converted to a 'size_t'.
>>>>
>>>> That makes sense, but then overnight some quite useful thing came to my mind
>>>> that would not be possible with this limitation:
>>>>
>>>>
>>>> <https://software.codidact.com/posts/285946>
>>>>
>>>> char *
>>>> stpecpy(char dst[.end - .dst], char *src, char end[1])
>>
>> Heh, I got an off-by-one error. It should be dst[.end - .dst + 1], of course,
>> and then the result of the whole expression would be 0, which is fine as size_t.
>>
>> So, never mind.
>
> .end and .dst would have pointer size though.
>
>>>> {
>>>>     for (/* void */; dst <= end; dst++) {
>>>>         *dst = *src++;
>>>>         if (*dst == '\0')
>>>>             return dst;
>>>>     }
>>>>     /* Truncation detected */
>>>>     *end = '\0';
>>>>
>>>> #if !defined(NDEBUG)
>>>>     /* Consume the rest of the input string. */
>>>>     while (*src++) {};
>>>> #endif
>>>>
>>>>     return end + 1;
>>>> }
>>> And I forgot to say it: Default promotions rank high (probably the highest) in
>>> my list of most hated features^Wbugs in C.
>
> If you replaced them with explicit conversions that you then have
> to add by hand all the time, I am pretty sure most people
> would hate this more. (and it could also hide bugs)
>
>>> I wouldn't convert it to size_t, but
>>> rather follow normal promotion rules.
>
> The point of making it size_t is that you then
> do not need to know the type of the parameter to make
> sense of the expression. If the type matters, then you get
> mutual dependencies as in the example above.
>
>>> Since you can use anything between INTMAX_MIN and UINTMAX_MAX for accessing an
>>> array (which took me some time to understand), I'd also allow the same here. So,
>>> the type of the expression between [] could perfectly be signed or unsigned.
>>>
>>> So, you could use size_t for very high indices, or e.g. ptrdiff_t if you want to
>>> allow negative numbers. In the function above, since dst can be a pointer to
>>> one-past-the-end (it represents a previous truncation; that's why the test
>>> dst<=end), forcing a size_t conversion would disallow that syntax.
>
> Yes, this then does not work.
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>