Hi Martin,

On 9/3/22 14:47, Martin Uecker wrote:
[...]

> GCC will warn if the bound is specified inconsistently between
> declarations and also emit warnings if it can see that a buffer
> which is passed is too small:
>
> https://godbolt.org/z/PsjPG1nv7

That's very good news!

BTW, it's nice to see that GCC doesn't need 'static' for array
parameters.  I never understood what the static keyword adds there.
There's no way one can specify an array size and mean anything other
than requiring that, for a non-null pointer, the array should have at
least that size.

>
> BTW: If you declare pointers to arrays (not first elements) you
> can get run-time bounds checking with UBSan:
>
> https://godbolt.org/z/TvMo89WfP

Couldn't that be caught at compile time?  n is certainly always out of
bounds for such an array, since the last element is n-1.

>
>>
>> Also, new code can be designed from the beginning so that sizes go
>> before their corresponding arrays, so that new code won't typically be
>> affected by the lack of this feature in the language.
>>
>> This leaves us with legacy code, especially libc, which just works, and
>> doesn't have any urgent needs to change their prototypes in this regard
>> (they could, to improve static analysis, but not what we'd call urgent).
>
> It would be a useful step to find out-of-bounds problems in
> applications using libc.

Yep, it would be very useful for that.  Not urgent, but yes, very useful.
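For reference, a minimal C99 sketch of the two ideas quoted above:
putting the size before its array so the bound is already in scope, and
passing a pointer to the whole array so the bound becomes part of the
type.  The function names (sum_ints, count_ints) are made up for
illustration; this is not the code behind the godbolt links.

```c
#include <stddef.h>

/* Hypothetical helper: the size parameter precedes the array, so `n` is
 * already in scope when `a[n]` is declared.  Plain C99, no extension. */
long sum_ints(size_t n, const int a[n])
{
    long total = 0;

    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hypothetical helper taking a pointer to the whole array rather than to
 * its first element.  The bound is part of the pointed-to type, so
 * sizeof(*a) is computed at run time as n * sizeof(int). */
size_t count_ints(size_t n, int (*a)[n])
{
    return sizeof(*a) / sizeof((*a)[0]);
}
```

For an `int v[3]`, these would be called as sum_ints(3, v) and
count_ints(3, &v).  In the second form the callee sees the full array
type, which is presumably what makes the run-time checking mentioned
above possible.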
>> Let's take an example:
>>
>>     int getnameinfo(const struct sockaddr *restrict addr,
>>                     socklen_t addrlen,
>>                     char *restrict host, socklen_t hostlen,
>>                     char *restrict serv, socklen_t servlen,
>>                     int flags);
>>
>> and some transformations:
>>
>>     int getnameinfo(const struct sockaddr *restrict addr,
>>                     socklen_t addrlen,
>>                     char host[restrict hostlen], socklen_t hostlen,
>>                     char serv[restrict servlen], socklen_t servlen,
>>                     int flags);
>>
>>     int getnameinfo(socklen_t hostlen;
>>                     socklen_t servlen;
>>                     const struct sockaddr *restrict addr,
>>                     socklen_t addrlen,
>>                     char host[restrict hostlen], socklen_t hostlen,
>>                     char serv[restrict servlen], socklen_t servlen,
>>                     int flags);
>>
>> (I'm not sure if I used correct GNU syntax, since I never used that
>> extension myself.)
>>
>> The first transformation above is non-ambiguous, as concise as possible,
>> and its only issue is that it might complicate the implementation a bit
>> too much.  I don't think forward-using a parameter's size would be too
>> much of a parsing problem for human readers.
>
> I personally find the second form not terrible.  Being
> able to read code left-to-right, top-down is helpful in more
> complicated examples.
>
>> The second one is unnecessarily long and verbose, and semicolons are not
>> very distinguishable from commas, for human readers, which may be very
>> confusing.
>>
>>     int foo(int a; int b[a], int a);
>>     int foo(int a, int b[a], int o);
>>
>> Those two are very different to the compiler, and yet very similar to
>> the human eye.  I don't like it.  The fact that it allows for simpler
>> compilers isn't enough to overcome the readability issues.
>
> This is true, I would probably use it with a comma and/or
> syntax highlighting.
>
>> I think I'd prefer having the forward-using syntax as a non-standard
>> extension --or a standard but optional language feature-- to avoid
>> forcing small compilers to implement it, rather than having the GNU
>> extension standardized in all compilers.
>
> The problems with the second form are:
>
> - it is not 100% backwards compatible (which may be OK though) as
>   the semantics of the following code changes:
>
>     int n;
>     int foo(int a[n], int n); // refers to different n!
>
>   Code written for new compilers could then be misunderstood
>   by old compilers when a variable with 'n' is in scope.
>

Hmmm, this one is serious.  I can't seem to solve it with that syntax.

> - it would generally be fundamentally new to C to have
>   backwards references and parsers might need to be changed
>   to allow this
>
> - a compiler or tool then has to deal also with ugly
>   corner cases such as mutual references:
>
>     int foo(int (*a)[sizeof(*b)], int (*b)[sizeof(*a)]);
>
> We could consider new syntax such as
>
>     int foo(char buf[.n], int n);
>
> Personally, I would prefer the conceptual simplicity of forward
> declarations and the fact that these exist already in GCC
> over any alternative.  I would also not mind new syntax, but
> then one has to define the rules more precisely to avoid the
> aforementioned problems.

What about taking something from K&R functions for this?:

    int foo(q; w; int a[q], int q, int s[w], int w);

By not specifying the types, the syntax is again short.  This is
left-to-right, so no problems with global variables, and no need for
complex parsers.  Also, by not specifying types, now it's more obvious
to the naked eye that there's a difference:

    int foo(a; int b[a], int a);
    int foo(int a, int b[a], int o);

What do you think about this syntax?

Thanks,

Alex

--
Alejandro Colomar