Hi Martin and Michael,

On 11/29/22 17:58, Uecker, Martin wrote:
> 
> Hi,
> 
> Am Dienstag, dem 29.11.2022 um 15:44 +0000 schrieb Michael Matz:
>> Hey,
>>
>> On Tue, 29 Nov 2022, Uecker, Martin wrote:
>>
>>> It does not require any changes on how arrays are represented.
>>>
>>> As part of VM-types the size becomes part of the type and this
>>> can be used for static or dynamic analysis, e.g. you can
>>> - today - get a run-time bounds violation with the sanitizer:
>>>
>>> void foo(int n, char (*buf)[n])
>>> {
>>>    (*buf)[n] = 1;
>>> }
>>
>> This can already statically analyzed as being wrong, no need for
>> dynamic checking.
> 
> In this toy example, but in general in can be checked
> only at run-time by using the information about the
> dynamic bound.
> 
>> What I mean is the checking of the claimed contract.
>> Above you assure for the function body that buf has n elements.
> 
> Yes.
> 
>> This is also a pre-condition for calling this function and
>> _that_ can't be checked in all  cases because:
>>
>>    void foo (int n, char (*buf)[n]) { (*buf)[n-1] = 1; }
>>    void callfoo(char * buf) { foo(10, buf); }
>>
>> buf doesn't have a known size.
> 
> This does not type check.
> 
>>   And a pre-condition that can't be checked
>> is no pre-condition at all, as only then it can become a guarantee
>> for the body.
> 
> The example above should look like:
> 
> void foo(int n, char (*buf)[n]);
> 
> void callfoo(char (*buf)[12]) { foo(10, buf); }
> 
> This could be checked by an UB sanitizer as calling
> the function with an argument of incompatible type
> is UB (but we currently do not do this)
> 
> 
> If you think about
> 
> void foo(int n, char buf[n]);
> 
> void callfoo(char *buf) { foo(10, buf); }
> 
> 
> Then you are right that this can not be checked at this
> time. But this  does not mean it is useless because we
> still can detect inconsistencies in other cases:
> 
> void callfoo(int n, char buf[n - 1]) { foo(n, buf); }
> 
> We could also - in the future - have a warning about all
> situations where bound information is lost, making sure
> that preconditions are always checked for people who
> consistently use these annotations.
> 
> 
>> The compiler has no choice than to trust the user that the pre-
>> condition  for calling foo is fulfilled.  I can see how
>> being able to just check half  of the contract might be
>> useful, but if it doesn't give full checking then
>> any proposal for syntax should be even more obviously
>> orthogonal than the current one.
> 
> Your argument is not clear to me.
> 
> 
>>> For
>>>
>>> void foo(int n, char buf[n]);
>>>
>>> it semantically has no meaning according to the C standard,
>>> but a compiler could still warn.
>>
>> Hmm?  Warn about what in this decl?
> 
> I meant, we could warn about something like this
> because it is likely an error:
> 
> void foo(int n, char buf[n])
> {
>    buf[n] = 1;
> }
> 
> 
>>> It could also warn for
>>>
>>> void foo(int n, char buf[n]);
>>>
>>> int main()
>>> {
>>>      char buf[9];
>>>      foo(buf);
>>> }
>>
>> You mean if you write 'foo(10,buf)' (the above, as is, is simply a
>> syntax error for non-matching number of args).  Or was it a mispaste
>> and you mean  the one from the godbolt link, i.e.:
> 
> I meant:
> 
> char buf[9];
> foo(10, buf);
> 
> In fact, it turns out we warn already:
> 
> https://godbolt.org/z/qcvsv87Ev
> 
>> void foo(char buf[10]){ buf[9] = 1; }
>> int main()
>> {
>>      char buf[9];
>>      foo(buf);
>> }
>>
>> ?  If so, yeah, we warn already.  I don't think this is an argument
>> for (or against) introducing new syntax.
>> ...
> 
> It is argument for having this syntax, because we could
> extend such warning (those we already have and those we
> could still add) to more common cases such as
> 
> void foo(char buf[.n], size_t n);
> 
> In my opinion, this would a huge step forward for
> safety of C programs as we already have a lot of
> infrastructure for checking bounds.
> 
> Of course, the existing GNU extension would achieve
> the same thing:
> 
> void foo(size_t n; char buf[n], size_t n);
> 
> 
> 
>>> But in general: This feature is useful not only for documentation
>>> but also for analysis.
>>
>> Which feature we're talking about now?  The ones you used all work
>> today,
>> as you demonstrated.  I thought we would be talking about that
>> ".whatever"
>> syntax to refer to arbitrary parameters, even following ones?  I
>> think a
>> disrupting syntax change like that should have a higher bar than "in
>> some
>> cases, depending on circumstance, we might even be able to warn".
> 
> We can use our existing features and then apply them
> to cases where the bound is specified after the pointer,
> which is more common in practice.

Yep; basically adding some (not perfect, but some) static analysis to 
many libc function calls.

Also, considering the issues with sizeof and arrays, and the lack of a 
_Nitems() [proposed as _Lengthof()] operator, there's a lot of manual 
work in array (read pointer) parameters.

However, a hypothetical _Nitems() operator could make use of this 
syntactic sugar, and be more useful than just providing static analysis. 
  Using _Nitems() on a VMT (including pointer parameters) could be 
specified to return the number of elements, so I foresee code like:


void foo(int arr[nmemb], size_t nmemb)
{
         // _Nitems() evaluates to nmemb
         for (size_t i = 0; i < _Nitems(arr); i++)
                 arr[i] = i;
}


void bar(int arr[])
{
         // Constraint violation
         for (size_t i = 0; i < _Nitems(arr); i++)
                 arr[i] = i;
}


This is probably the most useful part of this feature (but admittedly 
it's not only about this feature, or even could be added without this 
feature).

> 
> 
> Martin
> 

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>