* UB status of snprintf on invalid ptr+size combination? @ 2023-03-14 19:47 Simon Chopin 2023-03-14 21:39 ` Paul Eggert 0 siblings, 1 reply; 32+ messages in thread From: Simon Chopin @ 2023-03-14 19:47 UTC (permalink / raw) To: libc-alpha Hi, We've come across a couple of regressions[0] in 2.37 on armhf due to a peculiar use of snprintf that boils down to this: char [32]buf; ... snprintf(buf, INT_MAX, "%s", "Hello world"); This used to work fine, but now fails in a non-obvious way: the return value is the one you would expect from a successful call (here, 11), but the last character is overwritten by a NUL terminator. It's likely the new behavior has been introduced by commit e88b9f0e5cc50cab57a299dc7efe1a4eb385161d Author: Florian Weimer <fweimer@redhat.com> Date: Mon Dec 19 18:56:54 2022 +0100 stdio-common: Convert vfprintf and related functions to buffers We're hitting the issue because buf+INT_MAX overflows on armhf (being 32-bit and on the stack). When the issue was first discovered[1], I didn't raise the issue because I dismissed it as UB, but it reappeared in an unrelated context, and colleagues pointed out that the wording in the standard doesn't actually say that the `n` argument is the size of the array. Do you think it worth it to try and fix this regression? Florian suggested an approach similar to commit 0d50f477f47ba637b54fb03ac48d769ec4543e8d Author: Florian Weimer <fweimer@redhat.com> Date: Wed Jan 25 08:01:00 2023 +0100 stdio-common: Handle -1 buffer size in __sprintf_chk & co (bug 30039) Thanks in advance, Simon [0]: https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/2011326 [1]: https://bugs.launchpad.net/ubuntu/+source/notcurses/+bug/2008108 -- Simon Chopin Foundations Team Ubuntu MOTU/Core Dev simon.chopin@canonical.com schopin@ubuntu.com ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-14 19:47 UB status of snprintf on invalid ptr+size combination? Simon Chopin @ 2023-03-14 21:39 ` Paul Eggert 2023-03-15 9:22 ` Andreas Schwab 2023-03-15 12:39 ` Vincent Lefevre 0 siblings, 2 replies; 32+ messages in thread From: Paul Eggert @ 2023-03-14 21:39 UTC (permalink / raw) To: Simon Chopin; +Cc: libc-alpha On 3/14/23 12:47, Simon Chopin via Libc-alpha wrote: > When the issue was first discovered[1], I didn't raise the issue because I > dismissed it as UB, but it reappeared in an unrelated context, and > colleagues pointed out that the wording in the standard doesn't actually > say that the `n` argument is the size of the array. That sounds like a misreading of the C standard. Even though the standard often does not explicitly say that a size argument is the size of an array, it's obvious from context that this is intended. So Florian is correct here that the call with INT_MAX is not portable C code. For example, it's valid for snprintf to be implemented this way: int snprintf (char *buf, size_t size, char const *fmt, ...) { char *buf_limit = buf + size; ... } even though this would have undefined behavior if BUF points to a character array smaller than SIZE. glibc snprintf does the equivalent of the above internally, so it's a good thing that notcurses has been fixed to not use the INT_MAX trick as that trick already did not work in general with glibc and I assume with other C libraries (with rare and hard-to-diagnose failures), even aside from any fortification business that made the bug more visible. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-14 21:39 ` Paul Eggert @ 2023-03-15 9:22 ` Andreas Schwab 2023-03-15 15:54 ` Siddhesh Poyarekar ` (2 more replies) 2023-03-15 12:39 ` Vincent Lefevre 1 sibling, 3 replies; 32+ messages in thread From: Andreas Schwab @ 2023-03-15 9:22 UTC (permalink / raw) To: Paul Eggert; +Cc: Simon Chopin, libc-alpha On Mär 14 2023, Paul Eggert wrote: > For example, it's valid for snprintf to be implemented this way: > > int > snprintf (char *buf, size_t size, char const *fmt, ...) > { > char *buf_limit = buf + size; > ... > } > > even though this would have undefined behavior if BUF points to a > character array smaller than SIZE. Since it is part of the implementation it is irrelevant from the POV of the standard. The implementation does not have to abide to the C standard, as long as it properly implements the interface constraints. What matters is the wording of the standard. The POSIX standard is more explicit here: "with the addition of the n argument which states the size of the buffer referred to by s." Probably the C standard should be clarified. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-15 9:22 ` Andreas Schwab @ 2023-03-15 15:54 ` Siddhesh Poyarekar 2023-03-15 18:34 ` Michael Hudson-Doyle 2023-03-19 14:45 ` manfred 2 siblings, 0 replies; 32+ messages in thread From: Siddhesh Poyarekar @ 2023-03-15 15:54 UTC (permalink / raw) To: Andreas Schwab, Paul Eggert; +Cc: Simon Chopin, libc-alpha On 2023-03-15 05:22, Andreas Schwab via Libc-alpha wrote: > On Mär 14 2023, Paul Eggert wrote: > >> For example, it's valid for snprintf to be implemented this way: >> >> int >> snprintf (char *buf, size_t size, char const *fmt, ...) >> { >> char *buf_limit = buf + size; >> ... >> } >> >> even though this would have undefined behavior if BUF points to a >> character array smaller than SIZE. > > Since it is part of the implementation it is irrelevant from the POV of > the standard. The implementation does not have to abide to the C > standard, as long as it properly implements the interface constraints. > > What matters is the wording of the standard. The POSIX standard is more > explicit here: "with the addition of the n argument which states the > size of the buffer referred to by s." Probably the C standard should be > clarified. +1, the C standard wording ought to mirror POSIX here. FWIW, when built with fortification, this code will abort prematurely because it considers passed size being greater than the buffer size as being unsafe. Thanks, Sid ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-15 9:22 ` Andreas Schwab 2023-03-15 15:54 ` Siddhesh Poyarekar @ 2023-03-15 18:34 ` Michael Hudson-Doyle 2023-03-19 14:45 ` manfred 2 siblings, 0 replies; 32+ messages in thread From: Michael Hudson-Doyle @ 2023-03-15 18:34 UTC (permalink / raw) To: Andreas Schwab; +Cc: Paul Eggert, Simon Chopin, libc-alpha [-- Attachment #1: Type: text/plain, Size: 1210 bytes --] On Wed, 15 Mar 2023 at 22:22, Andreas Schwab via Libc-alpha < libc-alpha@sourceware.org> wrote: > On Mär 14 2023, Paul Eggert wrote: > > > For example, it's valid for snprintf to be implemented this way: > > > > int > > snprintf (char *buf, size_t size, char const *fmt, ...) > > { > > char *buf_limit = buf + size; > > ... > > } > > > > even though this would have undefined behavior if BUF points to a > > character array smaller than SIZE. > > Since it is part of the implementation it is irrelevant from the POV of > the standard. The implementation does not have to abide to the C > standard, as long as it properly implements the interface constraints. > > What matters is the wording of the standard. The POSIX standard is more > explicit here: "with the addition of the n argument which states the > size of the buffer referred to by s." Probably the C standard should be > clarified. > Ah that's interesting that POSIX is clearer here, thanks for pointing that out. I can feel more confident declaring the affected code broken now :-) Is anyone here close enough to the C standards process to push getting this clarified there? Cheers, mwh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-15 9:22 ` Andreas Schwab 2023-03-15 15:54 ` Siddhesh Poyarekar 2023-03-15 18:34 ` Michael Hudson-Doyle @ 2023-03-19 14:45 ` manfred 2023-03-19 23:07 ` Vincent Lefevre 2023-03-20 15:09 ` Vincent Lefevre 2 siblings, 2 replies; 32+ messages in thread From: manfred @ 2023-03-19 14:45 UTC (permalink / raw) To: libc-alpha After reading it a few times, I believe that the meaning of the wording of the ISO C standard is that 'n' is an upper limit to the number of characters written to s, not necessarily the size of the array. Also, the standard says that "The snprintf function is equivalent to fprintf, except that the output is written into an array (specified by argument s) rather than to a stream" Which implies that no access to the output array is performed after termination of encoding, regardless of the actual value of n. To me, it looks like in this case, i.e. when the encoded result is shorter than the array size and n, the distinction between n being the actual array size, or an upper limit to the length of the output string, is only relevant to the implementation: The question becomes: is the implementation allowed to access elements past the encoded result? Is it allowed to evaluate s+n? Out of curiosity, I compared the wording with the specifications in Annex K, which is explicitly aimed at checking array boundaries: as far as I can tell snprintf_s refers to snprintf, and does not add to this scenario. sprintf_s says about n: "n shall neither equal zero nor be greater than RSIZE_MAX. The number of characters (including the trailing null) required for the result to be written to the array pointed to by s shall not be greater than n." Again, this describes an upper limit to the length of the result, not the array size, but next in the description: "4 The sprintf_s function is equivalent to the sprintf function except for the parameter n and the explicit runtime-constraints listed above. 5 The sprintf_s function, unlike snprintf_s, treats a result too big for the array pointed to by s as a runtime-constraint violation." In this last sentence, the expression "a result too big for the array" hints at the array size (otherwise it should say "a result longer than n"), but propagating this to the rest of the text requires some imagination about the intention of the authors. All of that said, back to the OP case I would not pass INT_MAX to snprintf. If I have a situation wherein I know that the buffer is large enough, but I don't know its exact size, I'd use sprintf and be done with it. (I'm sure that the actual code is more elaborate than this, but still) My 2c On 3/15/2023 5:22 AM, Andreas Schwab via Libc-alpha wrote: > On Mär 14 2023, Paul Eggert wrote: > >> For example, it's valid for snprintf to be implemented this way: >> >> int >> snprintf (char *buf, size_t size, char const *fmt, ...) >> { >> char *buf_limit = buf + size; >> ... >> } >> >> even though this would have undefined behavior if BUF points to a >> character array smaller than SIZE. > > Since it is part of the implementation it is irrelevant from the POV of > the standard. The implementation does not have to abide to the C > standard, as long as it properly implements the interface constraints. > > What matters is the wording of the standard. The POSIX standard is more > explicit here: "with the addition of the n argument which states the > size of the buffer referred to by s." Probably the C standard should be > clarified. > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-19 14:45 ` manfred @ 2023-03-19 23:07 ` Vincent Lefevre 2023-03-20 12:05 ` Siddhesh Poyarekar 2023-03-20 15:09 ` Vincent Lefevre 1 sibling, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-19 23:07 UTC (permalink / raw) To: libc-alpha On 2023-03-19 10:45:59 -0400, manfred via Libc-alpha wrote: > All of that said, back to the OP case I would not pass INT_MAX to snprintf. > If I have a situation wherein I know that the buffer is large enough, but I > don't know its exact size, I'd use sprintf and be done with it. (I'm sure > that the actual code is more elaborate than this, but still) In simple code, probably. But in actual code, it may be more natural to use snprintf. Something like that: snprintf(buf, checked ? SIZE_MAX : n, "%s", s); The function may not know the buffer size if `checked` is true, so that it uses a known bound. Thanks to common code factorized, this is more readable than if (checked) sprintf (buf, "%s", s); else snprintf(buf, n, "%s", s); in particular in the cases where the format string is complex. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-19 23:07 ` Vincent Lefevre @ 2023-03-20 12:05 ` Siddhesh Poyarekar 2023-03-20 12:17 ` Alejandro Colomar 2023-03-20 13:50 ` Vincent Lefevre 0 siblings, 2 replies; 32+ messages in thread From: Siddhesh Poyarekar @ 2023-03-20 12:05 UTC (permalink / raw) To: Vincent Lefevre, libc-alpha On 2023-03-19 19:07, Vincent Lefevre wrote: > On 2023-03-19 10:45:59 -0400, manfred via Libc-alpha wrote: >> All of that said, back to the OP case I would not pass INT_MAX to snprintf. >> If I have a situation wherein I know that the buffer is large enough, but I >> don't know its exact size, I'd use sprintf and be done with it. (I'm sure >> that the actual code is more elaborate than this, but still) > > In simple code, probably. But in actual code, it may be more natural > to use snprintf. Something like that: > > snprintf(buf, checked ? SIZE_MAX : n, "%s", s); > > The function may not know the buffer size if `checked` is true, > so that it uses a known bound. Thanks to common code factorized, > this is more readable than > > if (checked) > sprintf (buf, "%s", s); > else > snprintf(buf, n, "%s", s); > > in particular in the cases where the format string is complex. If your application requires such patterns then it really needs an additional layer of abstraction or maybe a rethink on the pattern itself. This is not something the C runtime should try to solve. I think on the glibc front it makes sense from a security perspective to interpret this through POSIX than the C standard. Even if the C standard is clarified to be contrary to POSIX and explicitly state that n is not the size of the buffer (which would be a terrible mistake IMO), I'd lean towards violating the C standard and conforming to POSIX instead. Sid ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 12:05 ` Siddhesh Poyarekar @ 2023-03-20 12:17 ` Alejandro Colomar 2023-03-20 12:29 ` Siddhesh Poyarekar 2023-03-20 13:36 ` Vincent Lefevre 2023-03-20 13:50 ` Vincent Lefevre 1 sibling, 2 replies; 32+ messages in thread From: Alejandro Colomar @ 2023-03-20 12:17 UTC (permalink / raw) To: Siddhesh Poyarekar, Vincent Lefevre, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 2090 bytes --] Hi Vincent, Siddhesh, On 3/20/23 13:05, Siddhesh Poyarekar wrote: > On 2023-03-19 19:07, Vincent Lefevre wrote: >> On 2023-03-19 10:45:59 -0400, manfred via Libc-alpha wrote: >>> All of that said, back to the OP case I would not pass INT_MAX to snprintf. >>> If I have a situation wherein I know that the buffer is large enough, but I >>> don't know its exact size, I'd use sprintf and be done with it. (I'm sure >>> that the actual code is more elaborate than this, but still) >> >> In simple code, probably. But in actual code, it may be more natural >> to use snprintf. Something like that: >> >> snprintf(buf, checked ? SIZE_MAX : n, "%s", s); >> >> The function may not know the buffer size if `checked` is true, >> so that it uses a known bound. Thanks to common code factorized, >> this is more readable than >> >> if (checked) >> sprintf (buf, "%s", s); >> else >> snprintf(buf, n, "%s", s); >> >> in particular in the cases where the format string is complex. That pattern looks like _FORTIFY_SOURCE, doesn't it? If so, the correct action would be to call sprintf(3) and rely on the compiler to do the checks. snprintf(3) should be called when you can't guarantee at coding time if the array is possibly overrun. If you can guarantee that, then call sprintf(3), and the compiler will confirm. Cheers, Alex > > If your application requires such patterns then it really needs an > additional layer of abstraction or maybe a rethink on the pattern > itself. This is not something the C runtime should try to solve. > > I think on the glibc front it makes sense from a security perspective to > interpret this through POSIX than the C standard. Even if the C > standard is clarified to be contrary to POSIX and explicitly state that > n is not the size of the buffer (which would be a terrible mistake IMO), > I'd lean towards violating the C standard and conforming to POSIX instead. > > Sid -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 12:17 ` Alejandro Colomar @ 2023-03-20 12:29 ` Siddhesh Poyarekar 2023-03-20 13:36 ` Vincent Lefevre 1 sibling, 0 replies; 32+ messages in thread From: Siddhesh Poyarekar @ 2023-03-20 12:29 UTC (permalink / raw) To: Alejandro Colomar, Vincent Lefevre, libc-alpha On 2023-03-20 08:17, Alejandro Colomar wrote: >>> if (checked) >>> sprintf (buf, "%s", s); >>> else >>> snprintf(buf, n, "%s", s); >>> >>> in particular in the cases where the format string is complex. > > That pattern looks like _FORTIFY_SOURCE, doesn't it? If so, the correct > action would be to call sprintf(3) and rely on the compiler to do the > checks. Yes the pattern is similar to __sprintf_chk, except that if the result string is larger, it will result in program abort while in the above pattern the string will get truncated. Sid ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 12:17 ` Alejandro Colomar 2023-03-20 12:29 ` Siddhesh Poyarekar @ 2023-03-20 13:36 ` Vincent Lefevre 1 sibling, 0 replies; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 13:36 UTC (permalink / raw) To: libc-alpha On 2023-03-20 13:17:11 +0100, Alejandro Colomar wrote: > On 3/20/23 13:05, Siddhesh Poyarekar wrote: > > On 2023-03-19 19:07, Vincent Lefevre wrote: > >> The function may not know the buffer size if `checked` is true, > >> so that it uses a known bound. Thanks to common code factorized, > >> this is more readable than > >> > >> if (checked) > >> sprintf (buf, "%s", s); > >> else > >> snprintf(buf, n, "%s", s); > >> > >> in particular in the cases where the format string is complex. > > That pattern looks like _FORTIFY_SOURCE, doesn't it? If so, the correct > action would be to call sprintf(3) and rely on the compiler to do the > checks. Not necessarily. Here, this is the programmer who would do the check, though he may add code (e.g. assertions) on the caller side to give enough information to the compiler so that a smart compiler could prove that the call is valid. But this may need an analysis across different compilation units. If the compiler cannot check, the validity relies on the proof of the code; but not that in any case, the code needs to be proved: for instance, snprintf() may yield data loss and potential security issues if string truncation is not expected (or if the string is truncation too early). -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 12:05 ` Siddhesh Poyarekar 2023-03-20 12:17 ` Alejandro Colomar @ 2023-03-20 13:50 ` Vincent Lefevre 2023-03-20 16:56 ` Siddhesh Poyarekar 1 sibling, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 13:50 UTC (permalink / raw) To: libc-alpha On 2023-03-20 08:05:32 -0400, Siddhesh Poyarekar wrote: > I think on the glibc front it makes sense from a security > perspective to interpret this through POSIX than the C standard. > Even if the C standard is clarified to be contrary to POSIX and > explicitly state that n is not the size of the buffer (which would > be a terrible mistake IMO), I'd lean towards violating the C > standard and conforming to POSIX instead. I disagree about the POSIX behavior (assuming it is intentional). With it, if the compiler detects that n is larger than the actual buffer size, then due to undefined behavior, the compiler could assume that this is dead code and introduce erratic behavior in code written with the C standard in mind (or when it was introduced in BSD). Note that the printf(3) man page from FreeBSD 1.0 (1991), just says: Snprintf() and vsnprintf() will write at most size-1 of the characters printed into the output string (the size'th character then gets the terminating `\0'); if the return value is greater than or equal to the size argument, the string was too short and some of the printed characters were discarded. Reference: https://man.freebsd.org/cgi/man.cgi?query=snprintf&apropos=0&sektion=0&manpath=FreeBSD+1.0-RELEASE&arch=default&format=html -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 13:50 ` Vincent Lefevre @ 2023-03-20 16:56 ` Siddhesh Poyarekar 2023-03-20 17:36 ` Vincent Lefevre 0 siblings, 1 reply; 32+ messages in thread From: Siddhesh Poyarekar @ 2023-03-20 16:56 UTC (permalink / raw) To: Vincent Lefevre, libc-alpha On 2023-03-20 09:50, Vincent Lefevre wrote: > On 2023-03-20 08:05:32 -0400, Siddhesh Poyarekar wrote: >> I think on the glibc front it makes sense from a security >> perspective to interpret this through POSIX than the C standard. >> Even if the C standard is clarified to be contrary to POSIX and >> explicitly state that n is not the size of the buffer (which would >> be a terrible mistake IMO), I'd lean towards violating the C >> standard and conforming to POSIX instead. > > I disagree about the POSIX behavior (assuming it is intentional). Why do you think it may be unintentional? The POSIX wording seems pretty deliberate and clear to me. The C standard wording on the other hand leaves things to the imagination, which is why we're having this discussion. > With it, if the compiler detects that n is larger than the actual > buffer size, then due to undefined behavior, the compiler could > assume that this is dead code and introduce erratic behavior in > code written with the C standard in mind (or when it was introduced > in BSD). In fact, with _FORTIFY_SOURCE, if the runtime detects that n is larger than the actual buffer size, the code will abort, see pr28989[1]. But that's a runtime feature, nothing to do with gcc. gcc at the moment doesn't have any such check AFAICT but if it does, I reckon that's a discussion to be had on the gcc mailing list. Sid [1] https://sourceware.org/bugzilla/show_bug.cgi?id=28989 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 16:56 ` Siddhesh Poyarekar @ 2023-03-20 17:36 ` Vincent Lefevre 0 siblings, 0 replies; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 17:36 UTC (permalink / raw) To: libc-alpha On 2023-03-20 12:56:34 -0400, Siddhesh Poyarekar wrote: > On 2023-03-20 09:50, Vincent Lefevre wrote: > > On 2023-03-20 08:05:32 -0400, Siddhesh Poyarekar wrote: > > > I think on the glibc front it makes sense from a security > > > perspective to interpret this through POSIX than the C standard. > > > Even if the C standard is clarified to be contrary to POSIX and > > > explicitly state that n is not the size of the buffer (which would > > > be a terrible mistake IMO), I'd lean towards violating the C > > > standard and conforming to POSIX instead. > > > > I disagree about the POSIX behavior (assuming it is intentional). > > Why do you think it may be unintentional? Yes, because it does not warn about a difference with the C standard (and see below...). > The POSIX wording seems pretty deliberate and clear to me. Standards may have clear text, while some corner cases have been overlooked. This occurs frequently. As an example, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399 (LDBL_MAX is incorrect with IBM long double format) and a DR eventually changed the definition, even though it was clear: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399#c4 Note: This change of definition was not much an issue in practice because it made the standard match the actual implementations (probably all of them). So it did not break anything. Back to POSIX: The text was probably written like that because in almost all the cases, n is a valid size of the array and the change of formulation made the text a bit simpler. But this broke some legitimate uses, and at that time, the working group may not have been aware of that. It would be useful to know whether this had been brought in the discussions at that time. > The C standard wording on the other hand leaves things to the > imagination, which is why we're having this discussion. What things to the imagination? IMHO, the C standard wording is very clear. > > With it, if the compiler detects that n is larger than the actual > > buffer size, then due to undefined behavior, the compiler could > > assume that this is dead code and introduce erratic behavior in > > code written with the C standard in mind (or when it was introduced > > in BSD). > > In fact, with _FORTIFY_SOURCE, if the runtime detects that n is larger than > the actual buffer size, the code will abort, see pr28989[1]. But that's a > runtime feature, nothing to do with gcc. > > gcc at the moment doesn't have any such check AFAICT but if it does, I > reckon that's a discussion to be had on the gcc mailing list. If the ISO C standard is meant to be changed, all compilers could potentially consider the code as dead code, not just GCC. IMHO, it would be very bad to consider the code as UB. It would be better to standardize _FORTIFY_SOURCE (that would add restrictions, but with a controlled behavior). -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-19 14:45 ` manfred 2023-03-19 23:07 ` Vincent Lefevre @ 2023-03-20 15:09 ` Vincent Lefevre 2023-03-20 16:15 ` Alejandro Colomar 1 sibling, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 15:09 UTC (permalink / raw) To: libc-alpha On 2023-03-19 10:45:59 -0400, manfred via Libc-alpha wrote: > All of that said, back to the OP case I would not pass INT_MAX to snprintf. > If I have a situation wherein I know that the buffer is large enough, but I > don't know its exact size, I'd use sprintf and be done with it. (I'm sure > that the actual code is more elaborate than this, but still) Here's another example where the support of snprintf with a large n argument (larger than the buffer size) may be used: In GNU MPFR, for our function mpfr_snprintf, we have assumed a behavior analogue to the ISO C behavior. In particular, we use that in our tests in order to check that large values of n are correctly handled. This allowed us to trigger/detect a bug in the choice of the integer types in our implementation: The test: https://gitlab.inria.fr/mpfr/mpfr/-/commit/67a75bfe41d3a7f95367ee9e62bd7dfc73e5b395 The bug fix (replacing an int by a size_t in a variable declaration): https://gitlab.inria.fr/mpfr/mpfr/-/commit/6b8cf3e2bdc285027627281cac230ed932c1b73f Note that the MPFR code currently does not use snprintf or any similar function, so that there should be no issues with this test if snprintf does not support n > buffer size, it but may use such a function in the future. (Indeed, MPFR currently uses gmp_vasprintf and gmp_asprintf, thus generating a full output before truncating it as needed, so that this is potentially very inefficient and could unnecessarily exhaust the memory and/or crash.) -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 15:09 ` Vincent Lefevre @ 2023-03-20 16:15 ` Alejandro Colomar 2023-03-20 16:33 ` Vincent Lefevre 2023-03-20 17:00 ` Vincent Lefevre 0 siblings, 2 replies; 32+ messages in thread From: Alejandro Colomar @ 2023-03-20 16:15 UTC (permalink / raw) To: Vincent Lefevre, libc-alpha Cc: Stephan Bergmann, Andreas Schwab, Paul Eggert, Simon Chopin [-- Attachment #1.1: Type: text/plain, Size: 6801 bytes --] Hi Vincent, On 3/20/23 14:36, Vincent Lefevre wrote: > On 2023-03-20 13:17:11 +0100, Alejandro Colomar wrote: >> On 3/20/23 13:05, Siddhesh Poyarekar wrote: >>> On 2023-03-19 19:07, Vincent Lefevre wrote: >>>> The function may not know the buffer size if `checked` is true, >>>> so that it uses a known bound. Thanks to common code factorized, >>>> this is more readable than >>>> >>>> if (checked) >>>> sprintf (buf, "%s", s); >>>> else >>>> snprintf(buf, n, "%s", s); >>>> >>>> in particular in the cases where the format string is complex. >> >> That pattern looks like _FORTIFY_SOURCE, doesn't it? If so, the correct >> action would be to call sprintf(3) and rely on the compiler to do the >> checks. > > Not necessarily. Here, this is the programmer who would do the check, > though he may add code (e.g. assertions) on the caller side to give > enough information to the compiler so that a smart compiler could > prove that the call is valid. But this may need an analysis across > different compilation units. If the compiler cannot check, the > validity relies on the proof of the code; but not that in any case, > the code needs to be proved: for instance, snprintf() may yield > data loss and potential security issues if string truncation is > not expected (or if the string is truncation too early). > $ cat str.c #include <stdio.h> void f(char *dst, char *src) { sprintf(dst, "%s plus some extra stuff", src); } $ cat main.c void f(char *dst, char *src); int main(void) { char *str = "some long string"; char s[20]; f(s, str); } $ cc -Wall -Wextra *.c -flto -O3 -fanalyzer -D_FORTIFY_SOURCE=1 $ ./a.out *** buffer overflow detected ***: terminated Aborted $ cc -Wall -Wextra *.c -flto -O3 -fanalyzer -D_FORTIFY_SOURCE=3 alx@asus5775:~/tmp/fort$ ./a.out *** buffer overflow detected ***: terminated Aborted Is this what you're looking for? I agree that it would be nicer if the analyzer could catch this at build time, but it seems it's not yet so powerful. On 3/20/23 14:50, Vincent Lefevre wrote: > On 2023-03-20 08:05:32 -0400, Siddhesh Poyarekar wrote: >> I think on the glibc front it makes sense from a security >> perspective to interpret this through POSIX than the C standard. >> Even if the C standard is clarified to be contrary to POSIX and >> explicitly state that n is not the size of the buffer (which would >> be a terrible mistake IMO), I'd lean towards violating the C >> standard and conforming to POSIX instead. > I disagree about the POSIX behavior (assuming it is intentional). > With it, if the compiler detects that n is larger than the actual > buffer size, then due to undefined behavior, the compiler could > assume that this is dead code and introduce erratic behavior in > code written with the C standard in mind (or when it was introduced > in BSD). > > Note that the printf(3) man page from FreeBSD 1.0 (1991), just says: > > Snprintf() and vsnprintf() will write at most size-1 of the > characters printed into the output string (the size'th character > then gets the terminating `\0'); if the return value is greater > than or equal to the size argument, the string was too short and > some of the printed characters were discarded. The Linux man-pages show the following SYNOPSIS, which shows the meaning of 'size' as being an array size: SYNOPSIS #include <stdio.h> int printf(const char *restrict format, ...); int fprintf(FILE *restrict stream, const char *restrict format, ...); int dprintf(int fd, const char *restrict format, ...); int sprintf(char *restrict str, const char *restrict format, ...); int snprintf(char str[restrict .size], size_t size, const char *restrict format, ...); int vprintf(const char *restrict format, va_list ap); int vfprintf(FILE *restrict stream, const char *restrict format, va_list ap); int vdprintf(int fd, const char *restrict format, va_list ap); int vsprintf(char *restrict str, const char *restrict format, va_list ap); int vsnprintf(char str[restrict .size], size_t size, const char *restrict format, va_list ap); I'll need to update the text, though, to be more explicit in that it's an array size, and not just a limit for the copying. > > Reference: > https://man.freebsd.org/cgi/man.cgi?query=snprintf&apropos=0&sektion=0&manpath=FreeBSD+1.0-RELEASE&arch=default&format=html > > -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) On 3/20/23 16:09, Vincent Lefevre wrote: > On 2023-03-19 10:45:59 -0400, manfred via Libc-alpha wrote: >> All of that said, back to the OP case I would not pass INT_MAX to snprintf. >> If I have a situation wherein I know that the buffer is large enough, but I >> don't know its exact size, I'd use sprintf and be done with it. (I'm sure >> that the actual code is more elaborate than this, but still) > > Here's another example where the support of snprintf with a large n > argument (larger than the buffer size) may be used: > > In GNU MPFR, for our function mpfr_snprintf, we have assumed a > behavior analogue to the ISO C behavior. In particular, we use > that in our tests in order to check that large values of n are > correctly handled. This allowed us to trigger/detect a bug in > the choice of the integer types in our implementation: > > The test: > > https://gitlab.inria.fr/mpfr/mpfr/-/commit/67a75bfe41d3a7f95367ee9e62bd7dfc73e5b395 > > The bug fix (replacing an int by a size_t in a variable declaration): > > https://gitlab.inria.fr/mpfr/mpfr/-/commit/6b8cf3e2bdc285027627281cac230ed932c1b73f I don't understand how snprintf(3) helped catch the bug. Could you show some small reproducer? Would _FORTIFY_SOURCE not do a similar thing? Cheers, Alex > > Note that the MPFR code currently does not use snprintf or any > similar function, so that there should be no issues with this > test if snprintf does not support n > buffer size, it but may > use such a function in the future. (Indeed, MPFR currently uses > gmp_vasprintf and gmp_asprintf, thus generating a full output > before truncating it as needed, so that this is potentially > very inefficient and could unnecessarily exhaust the memory > and/or crash.) > -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 16:15 ` Alejandro Colomar @ 2023-03-20 16:33 ` Vincent Lefevre 2023-03-20 17:00 ` Vincent Lefevre 1 sibling, 0 replies; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 16:33 UTC (permalink / raw) To: libc-alpha On 2023-03-20 17:15:49 +0100, Alejandro Colomar wrote: > $ cat str.c > #include <stdio.h> > > void f(char *dst, char *src) > { > sprintf(dst, "%s plus some extra stuff", src); > } > > $ cat main.c > void f(char *dst, char *src); > > int main(void) > { > char *str = "some long string"; > char s[20]; > > f(s, str); > } > > $ cc -Wall -Wextra *.c -flto -O3 -fanalyzer -D_FORTIFY_SOURCE=1 > $ ./a.out > *** buffer overflow detected ***: terminated > Aborted > $ cc -Wall -Wextra *.c -flto -O3 -fanalyzer -D_FORTIFY_SOURCE=3 > alx@asus5775:~/tmp/fort$ ./a.out > *** buffer overflow detected ***: terminated > Aborted > > > Is this what you're looking for? I agree that it would be nicer > if the analyzer could catch this at build time, but it seems it's > not yet so powerful. I meant at build time. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 16:15 ` Alejandro Colomar 2023-03-20 16:33 ` Vincent Lefevre @ 2023-03-20 17:00 ` Vincent Lefevre 2023-03-20 17:31 ` Siddhesh Poyarekar 1 sibling, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 17:00 UTC (permalink / raw) To: libc-alpha On 2023-03-20 17:15:49 +0100, Alejandro Colomar wrote: > On 3/20/23 16:09, Vincent Lefevre wrote: > > Here's another example where the support of snprintf with a large n > > argument (larger than the buffer size) may be used: > > > > In GNU MPFR, for our function mpfr_snprintf, we have assumed a > > behavior analogue to the ISO C behavior. In particular, we use > > that in our tests in order to check that large values of n are > > correctly handled. This allowed us to trigger/detect a bug in > > the choice of the integer types in our implementation: > > > > The test: > > > > https://gitlab.inria.fr/mpfr/mpfr/-/commit/67a75bfe41d3a7f95367ee9e62bd7dfc73e5b395 > > > > The bug fix (replacing an int by a size_t in a variable declaration): > > > > https://gitlab.inria.fr/mpfr/mpfr/-/commit/6b8cf3e2bdc285027627281cac230ed932c1b73f > > I don't understand how snprintf(3) helped catch the bug. It doesn't. *Currently*, MPFR does not use snprintf. With the buggy version, on a typical 64-bit machine (where int = 32 bits), the size given to mpfr_snprintf became the value modulo 2^32, so if n is 2^32 (possible as size_t has 64 bits), the implementation of mpfr_snprintf assumed that the size was 0 (instead of 2^32). Hence the incorrect behavior. The test helped to catch this bug because it checks mpfr_snprintf on this value n = (size_t) UINT_MAX + 1, which is 2^32 here. But in order not to use much memory, the test is done with a small buffer. Using a buffer of size n would need 4 GB, and this amount of memory is not available everywhere. As MPFR does not currently use snprintf, there would not be any issue with any snprintf behavior. But in the future, we should use snprintf (actually gmp_snprintf or similar, but this may call snprintf at the end) to implement mpfr_snprintf. In such a case, the above test will no longer work if snprintf aborts just because n is larger than the buffer size. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 17:00 ` Vincent Lefevre @ 2023-03-20 17:31 ` Siddhesh Poyarekar 2023-03-20 17:45 ` Vincent Lefevre 0 siblings, 1 reply; 32+ messages in thread From: Siddhesh Poyarekar @ 2023-03-20 17:31 UTC (permalink / raw) To: Vincent Lefevre, libc-alpha On 2023-03-20 13:00, Vincent Lefevre wrote: > It doesn't. *Currently*, MPFR does not use snprintf. With the buggy > version, on a typical 64-bit machine (where int = 32 bits), the size > given to mpfr_snprintf became the value modulo 2^32, so if n is 2^32 > (possible as size_t has 64 bits), the implementation of mpfr_snprintf > assumed that the size was 0 (instead of 2^32). Hence the incorrect > behavior. > > The test helped to catch this bug because it checks mpfr_snprintf > on this value n = (size_t) UINT_MAX + 1, which is 2^32 here. But in > order not to use much memory, the test is done with a small buffer. > Using a buffer of size n would need 4 GB, and this amount of memory > is not available everywhere. In glibc we tend to skip a test as UNSUPPORTED when resources to run a test are considered uncommon (from the perspective of a test system) and cannot be expected to be met in all environments. It looks like mpfr_snprintf and glibc snprintf are incompatible in this context and the latter should not be used to implement the former if the n > __bos(dest) use case is important to it. Then again, I wish mpfr_snprintf also similarly tightened its requirements, but that's a discussion for another day. Thanks, Sid ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 17:31 ` Siddhesh Poyarekar @ 2023-03-20 17:45 ` Vincent Lefevre 0 siblings, 0 replies; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 17:45 UTC (permalink / raw) To: libc-alpha On 2023-03-20 13:31:33 -0400, Siddhesh Poyarekar wrote: > In glibc we tend to skip a test as UNSUPPORTED when resources to run a test > are considered uncommon (from the perspective of a test system) and cannot > be expected to be met in all environments. In MPFR, we have some tests that need much memory, which are enabled only if the MPFR_CHECK_LARGEMEM environment variable is set. But this means that some issues would not be detected on particular platforms by end users (MPFR_CHECK_LARGEMEM is mainly for developers). > It looks like mpfr_snprintf and glibc snprintf are incompatible in this > context and the latter should not be used to implement the former if the n > > __bos(dest) use case is important to it. > > Then again, I wish mpfr_snprintf also similarly tightened its requirements, > but that's a discussion for another day. Well, if the C standard is changed, I suppose that we will align to it. We can still recommend not to use n > buffer size if this is not supported by snprintf (even though the MPFR code is currently guaranteed to work in this case). -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-14 21:39 ` Paul Eggert 2023-03-15 9:22 ` Andreas Schwab @ 2023-03-15 12:39 ` Vincent Lefevre 2023-03-16 10:29 ` Stephan Bergmann 1 sibling, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-15 12:39 UTC (permalink / raw) To: Paul Eggert; +Cc: Simon Chopin, libc-alpha About snprintf(buf, INT_MAX, "%s", "Hello world"); On 2023-03-14 14:39:42 -0700, Paul Eggert wrote: > On 3/14/23 12:47, Simon Chopin via Libc-alpha wrote: > > When the issue was first discovered[1], I didn't raise the issue because I > > dismissed it as UB, but it reappeared in an unrelated context, and > > colleagues pointed out that the wording in the standard doesn't actually > > say that the `n` argument is the size of the array. > > That sounds like a misreading of the C standard. Even though the standard > often does not explicitly say that a size argument is the size of an array, > it's obvious from context that this is intended. So Florian is correct here > that the call with INT_MAX is not portable C code. No, it is not obvious. If the C standard does not say that this is the size of the array, then it does not have to be the size of the array. The C standard just says: Otherwise, output characters beyond the n-1st are discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array. In practice, it is possible that the check of the buffer size has already been done somewhere else, so that using a larger value for n is fine. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-15 12:39 ` Vincent Lefevre @ 2023-03-16 10:29 ` Stephan Bergmann 2023-03-18 2:07 ` Vincent Lefevre 0 siblings, 1 reply; 32+ messages in thread From: Stephan Bergmann @ 2023-03-16 10:29 UTC (permalink / raw) To: libc-alpha; +Cc: Vincent Lefevre, Paul Eggert, Simon Chopin On 15/03/2023 13:39, Vincent Lefevre wrote: > No, it is not obvious. If the C standard does not say that this is > the size of the array, then it does not have to be the size of the > array. The C standard just says: > > Otherwise, output characters beyond the n-1st are discarded rather > than being written to the array, and a null character is written at > the end of the characters actually written into the array. But in 7.1.4 "Use of library functions" the standard also says > If a function argument is described as being an array, the pointer passed to the function shall > have a value such that all address computations and accesses to objects (that would be valid if > the pointer did point to the first element of such an array) are valid. which could be construed as meaning that the n-1st array element must always be accessible, even if a given invocation is known to always generate less then n output characters. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-16 10:29 ` Stephan Bergmann @ 2023-03-18 2:07 ` Vincent Lefevre 2023-03-18 2:30 ` Alejandro Colomar 0 siblings, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-18 2:07 UTC (permalink / raw) To: Stephan Bergmann; +Cc: libc-alpha, Paul Eggert, Simon Chopin On 2023-03-16 11:29:31 +0100, Stephan Bergmann wrote: > On 15/03/2023 13:39, Vincent Lefevre wrote: > > No, it is not obvious. If the C standard does not say that this is > > the size of the array, then it does not have to be the size of the > > array. The C standard just says: > > > > Otherwise, output characters beyond the n-1st are discarded rather > > than being written to the array, and a null character is written at > > the end of the characters actually written into the array. > > But in 7.1.4 "Use of library functions" the standard also says > > > If a function argument is described as being an array, the pointer > > passed to the function shall have a value such that all address > > computations and accesses to objects (that would be valid if the > > pointer did point to the first element of such an array) are > > valid. > > which could be construed as meaning that the n-1st array element must always > be accessible, even if a given invocation is known to always generate less > then n output characters. But the standard does not say that n is the size of the array. The size of the array could be the maximum of n and the size corresponding to the untruncated output string. Similarly, for strncpy, I would not see n as the size of the arrays, i.e. it is not allowed for the implementation to read characters past a null character (possibly unless this does not have unwanted effects), even though such characters would be among the first n characters. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-18 2:07 ` Vincent Lefevre @ 2023-03-18 2:30 ` Alejandro Colomar 2023-03-18 10:58 ` Vincent Lefevre 0 siblings, 1 reply; 32+ messages in thread From: Alejandro Colomar @ 2023-03-18 2:30 UTC (permalink / raw) To: Vincent Lefevre, libc-alpha Cc: Stephan Bergmann, Paul Eggert, Simon Chopin, Andreas Schwab [-- Attachment #1.1: Type: text/plain, Size: 2608 bytes --] Hello Vincent, On 3/18/23 03:07, Vincent Lefevre wrote: > On 2023-03-16 11:29:31 +0100, Stephan Bergmann wrote: >> On 15/03/2023 13:39, Vincent Lefevre wrote: >>> No, it is not obvious. If the C standard does not say that this is >>> the size of the array, then it does not have to be the size of the >>> array. The C standard just says: >>> >>> Otherwise, output characters beyond the n-1st are discarded rather >>> than being written to the array, and a null character is written at >>> the end of the characters actually written into the array. >> >> But in 7.1.4 "Use of library functions" the standard also says >> >>> If a function argument is described as being an array, the pointer >>> passed to the function shall have a value such that all address >>> computations and accesses to objects (that would be valid if the >>> pointer did point to the first element of such an array) are >>> valid. >> >> which could be construed as meaning that the n-1st array element must always >> be accessible, even if a given invocation is known to always generate less >> then n output characters. > > But the standard does not say that n is the size of the array. > The size of the array could be the maximum of n and the size > corresponding to the untruncated output string. I guess you mean the minimum? If it were the maximum, then it would never truncate. [assuming you meant minimum]: As Andreas mentioned, that's valid for ISO C, but POSIX is more restrictive. Here's a quote from fprintf(3posix): The snprintf() function shall be equivalent to sprintf(), with the addition of the n argument which states the size of the buffer referred to by s. It clearly specifies that 'n' is the size of the buffer, so implementations are free to assume that `s+n` is a valid pointer. > > Similarly, for strncpy, I would not see n as the size of the arrays, > i.e. it is not allowed for the implementation to read characters > past a null character (possibly unless this does not have unwanted > effects), even though such characters would be among the first n > characters. The size argument to strncpy(3) is the size of the destination buffer, not the size of the input buffer. The input buffer must be either a string, or a character sequence at least as large as the destination buffer. Thus, in strncpy(3), reads are limited by `strnlen(src, size)`, but writes are limited by `size`. Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-18 2:30 ` Alejandro Colomar @ 2023-03-18 10:58 ` Vincent Lefevre 2023-03-18 15:01 ` Andreas Schwab 0 siblings, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-18 10:58 UTC (permalink / raw) To: Alejandro Colomar Cc: libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin, Andreas Schwab On 2023-03-18 03:30:27 +0100, Alejandro Colomar wrote: > Hello Vincent, > > On 3/18/23 03:07, Vincent Lefevre wrote: > > On 2023-03-16 11:29:31 +0100, Stephan Bergmann wrote: > >> On 15/03/2023 13:39, Vincent Lefevre wrote: > >>> No, it is not obvious. If the C standard does not say that this is > >>> the size of the array, then it does not have to be the size of the > >>> array. The C standard just says: > >>> > >>> Otherwise, output characters beyond the n-1st are discarded rather > >>> than being written to the array, and a null character is written at > >>> the end of the characters actually written into the array. > >> > >> But in 7.1.4 "Use of library functions" the standard also says > >> > >>> If a function argument is described as being an array, the pointer > >>> passed to the function shall have a value such that all address > >>> computations and accesses to objects (that would be valid if the > >>> pointer did point to the first element of such an array) are > >>> valid. > >> > >> which could be construed as meaning that the n-1st array element must always > >> be accessible, even if a given invocation is known to always generate less > >> then n output characters. > > > > But the standard does not say that n is the size of the array. > > The size of the array could be the maximum of n and the size > > corresponding to the untruncated output string. > > I guess you mean the minimum? If it were the maximum, then it would > never truncate. Yes, obviously, the minimum. > [assuming you meant minimum]: > > As Andreas mentioned, that's valid for ISO C, but POSIX is more > restrictive. Here's a quote from fprintf(3posix): > > The snprintf() function shall be equivalent to sprintf(), with > the addition of the n argument which states the size of the > buffer referred to by s. > > It clearly specifies that 'n' is the size of the buffer, so > implementations are free to assume that `s+n` is a valid pointer. However, I'm wondering whether such a change is intentional. BTW, this description is even wrong: this is certainly not equivalent! If the untruncated output is larger than n, then the call is UB with sprintf(), while the output is truncated with snprintf(). What really occurs is described later, but still, this first sentence is incorrect. This could just mean that they tried to oversimplify when writing this text, without thinking of the details. > > Similarly, for strncpy, I would not see n as the size of the arrays, > > i.e. it is not allowed for the implementation to read characters > > past a null character (possibly unless this does not have unwanted > > effects), even though such characters would be among the first n > > characters. > > The size argument to strncpy(3) is the size of the destination buffer, > not the size of the input buffer. The input buffer must be either > a string, or a character sequence at least as large as the destination > buffer. Thus, in strncpy(3), reads are limited by > `strnlen(src, size)`, but writes are limited by `size`. I agree, but I was answering to Stephan Bergmann, who thought that a size_t argument could be seen as the size of the array arguments. IMHO, the (minimum) sizes of the arrays are just implied from the described behavior of the function. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-18 10:58 ` Vincent Lefevre @ 2023-03-18 15:01 ` Andreas Schwab 2023-03-19 22:48 ` Vincent Lefevre 0 siblings, 1 reply; 32+ messages in thread From: Andreas Schwab @ 2023-03-18 15:01 UTC (permalink / raw) To: Vincent Lefevre Cc: Alejandro Colomar, libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin On Mär 18 2023, Vincent Lefevre wrote: > However, I'm wondering whether such a change is intentional. BTW, > this description is even wrong: this is certainly not equivalent! > If the untruncated output is larger than n, then the call is UB > with sprintf(), while the output is truncated with snprintf(). Which makes them equivalent in all situations where sprintf is defined, so there is no discrepancy here. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-18 15:01 ` Andreas Schwab @ 2023-03-19 22:48 ` Vincent Lefevre 2023-03-19 23:24 ` Andreas Schwab 0 siblings, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-19 22:48 UTC (permalink / raw) To: Andreas Schwab Cc: Alejandro Colomar, libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin On 2023-03-18 16:01:07 +0100, Andreas Schwab wrote: > On Mär 18 2023, Vincent Lefevre wrote: > > > However, I'm wondering whether such a change is intentional. BTW, > > this description is even wrong: this is certainly not equivalent! > > If the untruncated output is larger than n, then the call is UB > > with sprintf(), while the output is truncated with snprintf(). > > Which makes them equivalent in all situations where sprintf is defined, > so there is no discrepancy here. The conditions under which the function is defined or not are part of the equivalence. For instance, I would not say that memcpy and memmove are equivalent, even though they are equivalent when memcpy is defined. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-19 22:48 ` Vincent Lefevre @ 2023-03-19 23:24 ` Andreas Schwab 2023-03-20 4:10 ` Vincent Lefevre 0 siblings, 1 reply; 32+ messages in thread From: Andreas Schwab @ 2023-03-19 23:24 UTC (permalink / raw) To: Vincent Lefevre Cc: Alejandro Colomar, libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin On Mär 19 2023, Vincent Lefevre wrote: > On 2023-03-18 16:01:07 +0100, Andreas Schwab wrote: >> On Mär 18 2023, Vincent Lefevre wrote: >> >> > However, I'm wondering whether such a change is intentional. BTW, >> > this description is even wrong: this is certainly not equivalent! >> > If the untruncated output is larger than n, then the call is UB >> > with sprintf(), while the output is truncated with snprintf(). >> >> Which makes them equivalent in all situations where sprintf is defined, >> so there is no discrepancy here. > > The conditions under which the function is defined or not are part of > the equivalence. No, it isn't. Equivalent is not the same as identical. > For instance, I would not say that memcpy and memmove are equivalent, > even though they are equivalent when memcpy is defined. But they are. You can use either memcpy or memove in all situations where memcpy is defined. Outside of the domain of memcpy there is no relation at all. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-19 23:24 ` Andreas Schwab @ 2023-03-20 4:10 ` Vincent Lefevre 2023-03-20 9:19 ` Andreas Schwab 0 siblings, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 4:10 UTC (permalink / raw) To: Andreas Schwab Cc: Alejandro Colomar, libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin On 2023-03-20 00:24:51 +0100, Andreas Schwab wrote: > On Mär 19 2023, Vincent Lefevre wrote: > > > On 2023-03-18 16:01:07 +0100, Andreas Schwab wrote: > >> On Mär 18 2023, Vincent Lefevre wrote: > >> > >> > However, I'm wondering whether such a change is intentional. BTW, > >> > this description is even wrong: this is certainly not equivalent! > >> > If the untruncated output is larger than n, then the call is UB > >> > with sprintf(), while the output is truncated with snprintf(). > >> > >> Which makes them equivalent in all situations where sprintf is defined, > >> so there is no discrepancy here. > > > > The conditions under which the function is defined or not are part of > > the equivalence. > > No, it isn't. Equivalent is not the same as identical. > > > For instance, I would not say that memcpy and memmove are equivalent, > > even though they are equivalent when memcpy is defined. > > But they are. You can use either memcpy or memove in all situations > where memcpy is defined. Outside of the domain of memcpy there is no > relation at all. POSIX specifies _exit() by: The _Exit() and _exit() functions shall be functionally equivalent. So, you mean that _exit() may have undefined behavior while _Exit() has a well-defined behavior? (as this would match your definition of equivalence.) -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 4:10 ` Vincent Lefevre @ 2023-03-20 9:19 ` Andreas Schwab 2023-03-20 10:42 ` Vincent Lefevre 0 siblings, 1 reply; 32+ messages in thread From: Andreas Schwab @ 2023-03-20 9:19 UTC (permalink / raw) To: Vincent Lefevre Cc: Alejandro Colomar, libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin On Mär 20 2023, Vincent Lefevre wrote: > So, you mean that _exit() may have undefined behavior while _Exit() > has a well-defined behavior? (as this would match your definition > of equivalence.) That's a logical fallacy. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 9:19 ` Andreas Schwab @ 2023-03-20 10:42 ` Vincent Lefevre 2023-03-20 10:44 ` Andreas Schwab 0 siblings, 1 reply; 32+ messages in thread From: Vincent Lefevre @ 2023-03-20 10:42 UTC (permalink / raw) To: Andreas Schwab Cc: Alejandro Colomar, libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin On 2023-03-20 10:19:13 +0100, Andreas Schwab wrote: > On Mär 20 2023, Vincent Lefevre wrote: > > > So, you mean that _exit() may have undefined behavior while _Exit() > > has a well-defined behavior? (as this would match your definition > > of equivalence.) > > That's a logical fallacy. Not more than saying that memcpy and memmove are equivalent. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: UB status of snprintf on invalid ptr+size combination? 2023-03-20 10:42 ` Vincent Lefevre @ 2023-03-20 10:44 ` Andreas Schwab 0 siblings, 0 replies; 32+ messages in thread From: Andreas Schwab @ 2023-03-20 10:44 UTC (permalink / raw) To: Vincent Lefevre Cc: Alejandro Colomar, libc-alpha, Stephan Bergmann, Paul Eggert, Simon Chopin On Mär 20 2023, Vincent Lefevre wrote: > On 2023-03-20 10:19:13 +0100, Andreas Schwab wrote: >> On Mär 20 2023, Vincent Lefevre wrote: >> >> > So, you mean that _exit() may have undefined behavior while _Exit() >> > has a well-defined behavior? (as this would match your definition >> > of equivalence.) >> >> That's a logical fallacy. > > Not more than saying that memcpy and memmove are equivalent. No. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2023-03-20 17:45 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-03-14 19:47 UB status of snprintf on invalid ptr+size combination? Simon Chopin 2023-03-14 21:39 ` Paul Eggert 2023-03-15 9:22 ` Andreas Schwab 2023-03-15 15:54 ` Siddhesh Poyarekar 2023-03-15 18:34 ` Michael Hudson-Doyle 2023-03-19 14:45 ` manfred 2023-03-19 23:07 ` Vincent Lefevre 2023-03-20 12:05 ` Siddhesh Poyarekar 2023-03-20 12:17 ` Alejandro Colomar 2023-03-20 12:29 ` Siddhesh Poyarekar 2023-03-20 13:36 ` Vincent Lefevre 2023-03-20 13:50 ` Vincent Lefevre 2023-03-20 16:56 ` Siddhesh Poyarekar 2023-03-20 17:36 ` Vincent Lefevre 2023-03-20 15:09 ` Vincent Lefevre 2023-03-20 16:15 ` Alejandro Colomar 2023-03-20 16:33 ` Vincent Lefevre 2023-03-20 17:00 ` Vincent Lefevre 2023-03-20 17:31 ` Siddhesh Poyarekar 2023-03-20 17:45 ` Vincent Lefevre 2023-03-15 12:39 ` Vincent Lefevre 2023-03-16 10:29 ` Stephan Bergmann 2023-03-18 2:07 ` Vincent Lefevre 2023-03-18 2:30 ` Alejandro Colomar 2023-03-18 10:58 ` Vincent Lefevre 2023-03-18 15:01 ` Andreas Schwab 2023-03-19 22:48 ` Vincent Lefevre 2023-03-19 23:24 ` Andreas Schwab 2023-03-20 4:10 ` Vincent Lefevre 2023-03-20 9:19 ` Andreas Schwab 2023-03-20 10:42 ` Vincent Lefevre 2023-03-20 10:44 ` Andreas Schwab
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).