Hi Jonathan,

On 11/14/22 14:14, Jonathan Wakely wrote:
> On Mon, 14 Nov 2022 at 11:38, Alejandro Colomar via Gcc wrote:
>> BTW, I had another idea to add a suffix to string literals to make them
>> unterminated:
>>
>> char foo[3] = "foo"u; // OK
>> char bar[4] = "bar"; // OK
>>
>> char baz[4] = "baz"u; // Warning: initializer is too short.
>> char etc[3] = "etc"; // Warning: unterminated string.
>>
>> Is that doable? Do you think it makes sense?
>
> IMHO no. This is not useful enough to add a language extension, it's
> an incredibly niche use case.

I agree it's way too niche.

> Your suggested syntax also looks very
> confusing with UTF-16 string literals,

Maybe.

> and is not sufficiently
> distinct from a normal string literal to be obvious when quickly
> reading the code. People expect string literals in C to be
> null-terminated, having a subtle suffix that changes that would be a
> bug farm.

But you have to combine the suffix with the corresponding size (one less than
for normal strings), so a programmer needs to be doing this consciously. For
readers of the code, there may be more of a readability issue, especially if
they don't know the extension. But anyone who stops to check what the suffix is
doing and then notices that the size is unusual should, if reasonable, at least
ask or check the documentation.

Regarding safety, that is also very much on my mind. In an attempt to get the
compiler on my side, I decided to use 'char *' for NUL-terminated strings and
'u_char *' for u_nterminated strings. That helps the compiler know when we're
using one in place of the other, which, as you say, would be a source of bugs.

Maybe making the type of these new strings u_char[] instead of char[] would
give more type safety. I didn't suggest this because it is not how strings in C
have always worked. However, considering that they are not really strings, it
could make sense.
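
To make the char / u_char split concrete, here is a minimal sketch in standard
C. The function names str_len() and ustr2str() are made up for illustration,
and u_char is spelled as plain 'unsigned char' so that the example does not
depend on any system typedef.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: accepts only NUL-terminated strings. */
size_t
str_len(const char *s)
{
	return strlen(s);
}

/*
 * Hypothetical helper: accepts unterminated bytes plus an explicit length,
 * and writes a NUL-terminated copy into dst, which must hold n + 1 bytes.
 */
char *
ustr2str(char *dst, const unsigned char *src, size_t n)
{
	memcpy(dst, src, n);
	dst[n] = '\0';
	return dst;
}
```

Given 'unsigned char foo[3] = "foo";' (valid C today: the terminating NUL is
dropped because the array has no room for it), passing foo straight to
str_len() is a pointer-signedness mismatch, which GCC and Clang can diagnose
with -Wpointer-sign. Going through ustr2str() forces the caller to supply the
length explicitly.
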
>
> You can do {'b', 'a', 'z'} if you want an explicitly unterminated array of char.

A bit unreadable :)

I think I'll keep using normal literals, and maybe some workaround to disable
the warnings for specific cases. Not my preference, but it can work.

Cheers,

Alex

--