Hi Martin,

On 11/13/22 15:58, Martin Uecker wrote:
> On Sunday, 13.11.2022, 15:02 +0100, Alejandro Colomar wrote:
>>
>> On 11/13/22 14:33, Alejandro Colomar wrote:
>>> Hi Martin,
>>>
>>> On 11/13/22 14:19, Alejandro Colomar wrote:
>>>>> But there are not only syntactical problems, because
>>>>> also the type of the parameter might become relevant
>>>>> and then you can get circular dependencies:
>>>>>
>>>>> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>>>
>>>> This seems to be a difficult stone in the road.
> 
> But note that GNU forward declarations solve this nicely.

Okay, so GNU declarations basically work by duplicating (some of) the 
declarations.
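
For reference, here's a minimal sketch of what that looks like today.  It 
relies on the GCC extension (it isn't ISO C), and bar() is a made-up name 
just for illustration:

```c
#include <stddef.h>

/*
 * "int n;" before the semicolon is the GNU forward declaration; the real
 * declaration of n comes later in the same parameter list.
 */
static size_t bar(int n; char (*a)[n], int n)
{
    return sizeof(*a);  /* a points to a VLA, so this is computed at run time */
}
```

So the programmer writes 'int n' twice: once to announce it, and once to 
actually declare it in its real position.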

How about the compiler parsing the parameter list twice?  The first pass 
would collect the declarations and their types, but without resolving any 
sizeof(), _Lengthof(), or typeof() whose operand is .identifier (or an 
expression containing it); in those cases, it would leave the type 
incomplete, to be completed in the second pass.  It's as if the programmer 
had written the forward declarations, except that the compiler derives 
them automatically.

I guess asking the compiler to do two passes over the param list isn't as 
bad as asking it to do unbounded lookahead.  In this case it's bounded:  
look ahead to the end of the param list; get as much info as possible, and 
then go through it again to complete the types.  Anything still unresolved 
after two passes is invalid.
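
A toy model of that rule, in case it helps.  This is not compiler code; 
struct param and two_pass() are hypothetical names, and I'm assuming that 
the second pass completes types left to right and stops at the first 
operand that is still incomplete:

```c
#include <stddef.h>

/* Each parameter is modelled as a pointer to a char array. */
struct param {
    int    dep;        /* index of the parameter named in sizeof(*.x), or -1 */
    size_t const_len;  /* array length when dep == -1 */
};

/*
 * Returns -1 if every length is resolved, or else the index of the first
 * parameter whose sizeof() operand is still incomplete in the second pass.
 */
static int two_pass(const struct param *p, size_t *len, int n)
{
    /* Pass 1: record what is known; 0 stands for "incomplete type". */
    for (int i = 0; i < n; i++)
        len[i] = (p[i].dep < 0) ? p[i].const_len : 0;

    /* Pass 2: complete the remaining types, left to right. */
    for (int i = 0; i < n; i++) {
        if (p[i].dep < 0)
            continue;
        if (len[p[i].dep] == 0)
            return i;            /* error: *.x has incomplete type */
        len[i] = len[p[i].dep];  /* sizeof(char[N]) == N */
    }
    return -1;
}
```

With { {1, 0}, {0, 0} } (each parameter depends on the other), it reports 
an error at index 0; with { {1, 0}, {-1, 10} }, both lengths end up as 10.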

So, for

     void foo(char (*a)[sizeof(*.b)], char (*b)[sizeof(*.a)]);

in the first pass, the compiler would read:

     char (*a)[sizeof(*.b)];  // sizeof .identifier; incomplete type; continue parsing
     char (*b)[sizeof(*.a)];  // sizeof .identifier; incomplete type; continue parsing

At the end of the first pass, the compiler only knows:

     char (*a)[];
     char (*b)[];

In the second pass, when evaluating sizeof(), the types of the operands 
are still incomplete, so it can't be evaluated, and therefore there's an 
error at the first sizeof(*.b): *.b has incomplete type.

---

Let's look at a different case:

     void foo(char (*a)[sizeof(*.b)], char (*b)[10]);

After the first pass, the compiler would know:

     char (*a)[];
     char (*b)[10];

In the second pass, sizeof(*.b) would evaluate unambiguously to 
sizeof(char[10]), and the parameter list would then be fine.
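
For comparison, this case can already be written in standard C, but only 
by declaring b first so that it is in scope when a is declared.  A sketch 
(foo_reordered() is a made-up name):

```c
#include <stddef.h>

/* b comes first, so sizeof(*b) can be used in the declarator of a. */
static size_t foo_reordered(char (*b)[10], char (*a)[sizeof(*b)])
{
    (void) b;
    return sizeof(*a);  /* same as sizeof(char[10]) */
}
```

The two-pass idea would give the same type to a without forcing that 
order on the parameters.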

Does this 2-pass parsing make sense to you?  Did I miss any details?


> 
>>>>
>>>>> I am not sure what would be the best way to fix it. One
>>>>> could specify that parameters referred to by
>>>>> the .identifier syntax must be of some integer type and
>>>>> that the sub-expression .identifier is always
>>>>> converted to a 'size_t'.
>>>>
>>>> That makes sense, but then overnight some quite useful thing came to my mind
>>>> that would not be possible with this limitation:
>>>>
>>>>
>>>> <https://software.codidact.com/posts/285946>
>>>>
>>>> char *
>>>> stpecpy(char dst[.end - .dst], char *src, char end[1])
>>
>> Heh, I got an off-by-one error.  It should be dst[.end - .dst + 1], of course,
>> and then the result of the whole expression would be 0, which is fine as size_t.
>>
>> So, never mind.
> 
> .end and .dst would have pointer size though.
> 
>>>> {
>>>>       for (/* void */; dst <= end; dst++) {
>>>>           *dst = *src++;
>>>>           if (*dst == '\0')
>>>>               return dst;
>>>>       }
>>>>       /* Truncation detected */
>>>>       *end = '\0';
>>>>
>>>> #if !defined(NDEBUG)
>>>>       /* Consume the rest of the input string. */
>>>>       while (*src++) {};
>>>> #endif
>>>>
>>>>       return end + 1;
>>>> }
>>> And I forgot to say it:  Default promotions rank high (probably the highest) in
>>> my list of most hated features^Wbugs in C.
> 
> If you replaced them with explicit conversion you then have
> to add by hand all the time, I am pretty sure most people
> would hate this more. (and it could also hide bugs)
> 
>>> I wouldn't convert it to size_t, but
>>> rather follow normal promotion rules.
> 
> The point of making it size_t is that you then
> do not need to know the type of the parameter to make
> sense of the expression. If the type matters, then you get
> mutual dependencies as in the example above.
> 
>>> Since you can use anything between INTMAX_MIN and UINTMAX_MAX for accessing an
>>> array (which took me some time to understand), I'd also allow the same here. So,
>>> the type of the expression between [] could perfectly be signed or unsigned.
>>>
>>> So, you could use size_t for very high indices, or e.g. ptrdiff_t if you want to
>>> allow negative numbers.  In the function above, since dst can be a pointer to
>>> one-past-the-end (it represents a previous truncation; that's why the test
>>> dst<=end), forcing a size_t conversion would disallow that syntax.
> 
> Yes, this then does not work.

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>