* Missed warning (-Wuse-after-free) @ 2023-02-16 14:35 Alejandro Colomar 2023-02-16 15:15 ` David Malcolm 2023-02-17 8:49 ` Yann Droneaud 0 siblings, 2 replies; 39+ messages in thread From: Alejandro Colomar @ 2023-02-16 14:35 UTC (permalink / raw) To: GCC; +Cc: Iker Pedrosa [-- Attachment #1.1: Type: text/plain, Size: 2193 bytes --] Hi! I was preparing an example program of a use-after-realloc bug, when I found that GCC doesn't warn in a case where it should. alx@debian:~/tmp$ cat realloc.c #include <stdint.h> #include <stdlib.h> #include <stdio.h> #include <string.h> #include <unistd.h> static inline char * xstrdup(const char *s) { char *p; p = strdup(s); if (p == NULL) exit(EXIT_FAILURE); return p; } static inline char * strnul(const char *s) { return (char *) s + strlen(s); } int main(void) { char *p, *q; p = xstrdup(""); q = strnul(p); if (p == q) puts("equal before"); else exit(EXIT_FAILURE); // It's an empty string; this won't happen printf("p = %p; q = %p\n", p, q); p = realloc(p, UINT16_MAX); if (p == NULL) exit(EXIT_FAILURE); puts("realloc()"); if (p == q) { // Use after realloc. I'd expect a warning here. puts("equal after"); } else { /* Can we get here? Let's see the options: - realloc(3) fails: We exit immediately. We don't arrive here. - realloc(3) doesn't move the memory: p == q, as before - realloc(3) moved the memory: p is guaranteed to be a unique pointer, and q is now an invalid pointer. It is Undefined Behavior to read `q`, so `p == q` is UB. As we see, there's no _defined_ path where this can happen */ printf("PID = %i\n", (int) getpid()); } printf("p = %p; q = %p\n", p, q); } alx@debian:~/tmp$ cc -Wall -Wextra realloc.c -O3 -fanalyzer realloc.c: In function ‘main’: realloc.c:67:9: warning: pointer ‘p’ may be used after ‘realloc’ [-Wuse-after-free] 67 | printf("p = %p; q = %p\n", p, q); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ realloc.c:39:13: note: call to ‘realloc’ here 39 | p = realloc(p, UINT16_MAX); | ^~~~~~~~~~~~~~~~~~~~~~ alx@debian:~/tmp$ ./a.out equal before p = 0x55bff80802a0; q = 0x55bff80802a0 realloc() PID = 25222 p = 0x55bff80806d0; q = 0x55bff80802a0 Did I miss anything? Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-16 14:35 Missed warning (-Wuse-after-free) Alejandro Colomar @ 2023-02-16 15:15 ` David Malcolm 2023-02-17 1:04 ` Alejandro Colomar 2023-02-17 3:48 ` Siddhesh Poyarekar 2023-02-17 8:49 ` Yann Droneaud 1 sibling, 2 replies; 39+ messages in thread From: David Malcolm @ 2023-02-16 15:15 UTC (permalink / raw) To: Alejandro Colomar, GCC; +Cc: Iker Pedrosa On Thu, 2023-02-16 at 15:35 +0100, Alejandro Colomar via Gcc wrote: > Hi! > > I was preparing an example program of a use-after-realloc bug, > when I found that GCC doesn't warn in a case where it should. > > > alx@debian:~/tmp$ cat realloc.c > #include <stdint.h> > #include <stdlib.h> > #include <stdio.h> > #include <string.h> > #include <unistd.h> > > static inline char * > xstrdup(const char *s) > { > char *p; > > p = strdup(s); > if (p == NULL) > exit(EXIT_FAILURE); > return p; > } > > static inline char * > strnul(const char *s) > { > return (char *) s + strlen(s); > } > > int > main(void) > { > char *p, *q; > > p = xstrdup(""); > q = strnul(p); > > if (p == q) > puts("equal before"); > else > exit(EXIT_FAILURE); // It's an empty string; this > won't happen > > printf("p = %p; q = %p\n", p, q); > > p = realloc(p, UINT16_MAX); > if (p == NULL) > exit(EXIT_FAILURE); > puts("realloc()"); > > if (p == q) { // Use after realloc. I'd expect a warning > here. > puts("equal after"); > } else { > /* Can we get here? > Let's see the options: > > - realloc(3) fails: > We exit immediately. We don't arrive > here. > > - realloc(3) doesn't move the memory: > p == q, as before > > - realloc(3) moved the memory: > p is guaranteed to be a unique > pointer, > and q is now an invalid pointer. It > is > Undefined Behavior to read `q`, so `p > == q` > is UB. > > As we see, there's no _defined_ path where this > can happen > */ > printf("PID = %i\n", (int) getpid()); > } > > printf("p = %p; q = %p\n", p, q); > } > alx@debian:~/tmp$ cc -Wall -Wextra realloc.c -O3 -fanalyzer > realloc.c: In function ‘main’: > realloc.c:67:9: warning: pointer ‘p’ may be used after ‘realloc’ [- > Wuse-after-free] > 67 | printf("p = %p; q = %p\n", p, q); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > realloc.c:39:13: note: call to ‘realloc’ here > 39 | p = realloc(p, UINT16_MAX); > | ^~~~~~~~~~~~~~~~~~~~~~ > alx@debian:~/tmp$ ./a.out > equal before > p = 0x55bff80802a0; q = 0x55bff80802a0 > realloc() > PID = 25222 > p = 0x55bff80806d0; q = 0x55bff80802a0 > > > Did I miss anything? GCC's -fanalyzer will warn if you dereference q, so e.g. adding: printf("*q = %i\n", *q); gives a warning: https://godbolt.org/z/6qx4afb3E <source>: In function 'main': <source>:65:29: warning: use after 'free' of 'q' [CWE-416] [-Wanalyzer-use-after-free] 65 | printf("*q = %i\n", *q); | ^~ 'main': events 1-2 | | 25 | main(void) | | ^~~~ | | | | | (1) entry to 'main' |...... | 29 | p = xstrdup(""); | | ~~~~~~~~~~~ | | | | | (2) calling 'xstrdup' from 'main' | +--> 'xstrdup': events 3-5 | | 8 | xstrdup(const char *s) | | ^~~~~~~ | | | | | (3) entry to 'xstrdup' |...... | 13 | if (p == NULL) | | ~ | | | | | (4) following 'false' branch (when 'p' is non-NULL)... | 14 | exit(EXIT_FAILURE); | 15 | return p; | | ~ | | | | | (5) ...to here | <------+ | 'main': events 6-15 | | 29 | p = xstrdup(""); | | ^~~~~~~~~~~ | | | | | (6) returning to 'main' from 'xstrdup' |...... | 32 | if (p == q) | | ~ | | | | | (7) following 'true' branch (when 'p == q')... | 33 | puts("equal before"); | | ~~~~~~~~~~~~~~~~~~~~ | | | | | (8) ...to here |...... | 39 | p = realloc(p, UINT16_MAX); | | ~~~~~~~~~~~~~~~~~~~~~~ | | | | | (9) freed here | | (10) when 'realloc' succeeds, moving buffer | 40 | if (p == NULL) | | ~ | | | | | (11) following 'false' branch (when 'p' is non-NULL)... | 41 | exit(EXIT_FAILURE); | 42 | puts("realloc()"); | | ~~~~~~~~~~~~~~~~~ | | | | | (12) ...to here | 43 | | 44 | if (p == q) { // Use after realloc. I'd expect a warning here. | | ~ | | | | | (13) following 'false' branch (when 'p != q')... |...... | 64 | printf("PID = %i\n", (int) getpid()); | | ~~~~~~~~ | | | | | (14) ...to here | 65 | printf("*q = %i\n", *q); | | ~~ | | | | | (15) use after 'free' of 'q'; freed at (9) | I'm not convinced that it's useful to the end-user to warn about the "use of q itself" case. Dave ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-16 15:15 ` David Malcolm @ 2023-02-17 1:04 ` Alejandro Colomar 2023-02-17 1:05 ` Alejandro Colomar 2023-02-17 8:12 ` Martin Uecker 2023-02-17 3:48 ` Siddhesh Poyarekar 1 sibling, 2 replies; 39+ messages in thread From: Alejandro Colomar @ 2023-02-17 1:04 UTC (permalink / raw) To: David Malcolm, GCC Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt [-- Attachment #1.1: Type: text/plain, Size: 5628 bytes --] [CC: Added those who contributed to the discussion in linux-man@, and also the authors of N2861 for C2x] Hi David, On 2/16/23 16:15, David Malcolm wrote: > On Thu, 2023-02-16 at 15:35 +0100, Alejandro Colomar via Gcc wrote: >> Hi! >> >> I was preparing an example program of a use-after-realloc bug, >> when I found that GCC doesn't warn in a case where it should. >> >> >> alx@debian:~/tmp$ cat realloc.c >> #include <stdint.h> >> #include <stdlib.h> >> #include <stdio.h> >> #include <string.h> >> #include <unistd.h> >> >> static inline char * >> xstrdup(const char *s) >> { >> char *p; >> >> p = strdup(s); >> if (p == NULL) >> exit(EXIT_FAILURE); >> return p; >> } >> >> static inline char * >> strnul(const char *s) >> { >> return (char *) s + strlen(s); >> } >> >> int >> main(void) >> { >> char *p, *q; >> >> p = xstrdup(""); >> q = strnul(p); >> >> if (p == q) >> puts("equal before"); >> else >> exit(EXIT_FAILURE); // It's an empty string; this >> won't happen >> >> printf("p = %p; q = %p\n", p, q); >> >> p = realloc(p, UINT16_MAX); >> if (p == NULL) >> exit(EXIT_FAILURE); >> puts("realloc()"); >> >> if (p == q) { // Use after realloc. I'd expect a warning >> here. >> puts("equal after"); >> } else { >> /* Can we get here? >> Let's see the options: >> >> - realloc(3) fails: >> We exit immediately. We don't arrive >> here. >> >> - realloc(3) doesn't move the memory: >> p == q, as before >> >> - realloc(3) moved the memory: >> p is guaranteed to be a unique >> pointer, >> and q is now an invalid pointer. It >> is >> Undefined Behavior to read `q`, so `p >> == q` >> is UB. >> >> As we see, there's no _defined_ path where this >> can happen >> */ >> printf("PID = %i\n", (int) getpid()); >> } >> >> printf("p = %p; q = %p\n", p, q); >> } >> alx@debian:~/tmp$ cc -Wall -Wextra realloc.c -O3 -fanalyzer >> realloc.c: In function ‘main’: >> realloc.c:67:9: warning: pointer ‘p’ may be used after ‘realloc’ [- >> Wuse-after-free] >> 67 | printf("p = %p; q = %p\n", p, q); >> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> realloc.c:39:13: note: call to ‘realloc’ here >> 39 | p = realloc(p, UINT16_MAX); >> | ^~~~~~~~~~~~~~~~~~~~~~ >> alx@debian:~/tmp$ ./a.out >> equal before >> p = 0x55bff80802a0; q = 0x55bff80802a0 >> realloc() >> PID = 25222 >> p = 0x55bff80806d0; q = 0x55bff80802a0 >> >> >> Did I miss anything? > > GCC's -fanalyzer will warn if you dereference q, so e.g. adding: > printf("*q = %i\n", *q); > gives a warning: > https://godbolt.org/z/6qx4afb3E > > <source>: In function 'main': > <source>:65:29: warning: use after 'free' of 'q' [CWE-416] [-Wanalyzer-use-after-free] > 65 | printf("*q = %i\n", *q); > | ^~ [...] > > I'm not convinced that it's useful to the end-user to warn about the > "use of q itself" case. I didn't quote the standard because I couldn't find it. I was searching in C11, and it seems that it was only implicitly Undefined Behavior, without explicit spelling (the value of the pointer was indeterminate, according to C11). Now C23 will better clarify that reading such a pointer value (not even dereferencing) is Undefined Behavior. There was a discussion in linux-man@ some years ago, which now I realize it didn't end up being applied (I thought we had applied a patch, but it seems we didn't). I'll check if we still need such a patch (and I guess we do, since we're having this conversation). Using the pointer is _wrong_. And by wrong, I mean that it's Undefined Behavior. I think that alone is enough to issue a warning. Especially, since the compiler already has that information; otherwise, it couldn't have warned about line 67 of my example program. I could understand if due to optimizations the compiler lost that information, so it couldn't warn, but in this case, there's no excuse. The benefit for users? They'll realize that the code they wrote is bad. Not even suspicious, as some warnings warn about suspicious code. This case is uncontroversially wrong. That code has no valid reason to be written that way, under ISO C. Cheers, Alex > > Dave -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 1:04 ` Alejandro Colomar @ 2023-02-17 1:05 ` Alejandro Colomar 2023-02-17 1:56 ` Sam James 2023-02-17 8:12 ` Martin Uecker 1 sibling, 1 reply; 39+ messages in thread From: Alejandro Colomar @ 2023-02-17 1:05 UTC (permalink / raw) To: David Malcolm, GCC Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt [-- Attachment #1.1: Type: text/plain, Size: 659 bytes --] On 2/17/23 02:04, Alejandro Colomar wrote: > [CC: Added those who contributed to the discussion in linux-man@, > and also the authors of N2861 for C2x] [...] > > There was a discussion in linux-man@ some years ago, which now I realize it > didn't end up being applied (I thought we had applied a patch, but it seems we > didn't). I'll check if we still need such a patch (and I guess we do, since > we're having this conversation). I forgot to link: <https://lore.kernel.org/linux-man/87lf4im6sf.fsf@oldenburg.str.redhat.com/T/> -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 1:05 ` Alejandro Colomar @ 2023-02-17 1:56 ` Sam James 0 siblings, 0 replies; 39+ messages in thread From: Sam James @ 2023-02-17 1:56 UTC (permalink / raw) To: Alejandro Colomar Cc: David Malcolm, GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt, Siddhesh Poyarekar [-- Attachment #1: Type: text/plain, Size: 753 bytes --] > On 17 Feb 2023, at 01:05, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote: > > On 2/17/23 02:04, Alejandro Colomar wrote: >> [CC: Added those who contributed to the discussion in linux-man@, >> and also the authors of N2861 for C2x] > > [...] > >> >> There was a discussion in linux-man@ some years ago, which now I realize it >> didn't end up being applied (I thought we had applied a patch, but it seems we >> didn't). I'll check if we still need such a patch (and I guess we do, since >> we're having this conversation). > > I forgot to link: > <https://lore.kernel.org/linux-man/87lf4im6sf.fsf@oldenburg.str.redhat.com/T/> > See also https://siddhesh.in/posts/that-is-not-a-number-that-is-a-freed-object.html. [-- Attachment #2: Message signed with OpenPGP --] [-- Type: application/pgp-signature, Size: 358 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 1:04 ` Alejandro Colomar 2023-02-17 1:05 ` Alejandro Colomar @ 2023-02-17 8:12 ` Martin Uecker 2023-02-17 11:35 ` Alejandro Colomar 1 sibling, 1 reply; 39+ messages in thread From: Martin Uecker @ 2023-02-17 8:12 UTC (permalink / raw) To: Alejandro Colomar, David Malcolm, GCC Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt Am Freitag, dem 17.02.2023 um 02:04 +0100 schrieb Alejandro Colomar > > [...] > > > > > I'm not convinced that it's useful to the end-user to warn about > > the > > "use of q itself" case. > > I didn't quote the standard because I couldn't find it. I was > searching in C11, > and it seems that it was only implicitly Undefined Behavior, without > explicit > spelling (the value of the pointer was indeterminate, according to > C11). The value becomes indeterminate according to 6.2.4p2 in C11. An indeterminate value may be a trap representation 3.19.2 If such a trap representation is read (the pointer, not just the pointed-to object), the behavior is undefined according to 6.2.6.1p5. So it is explitely UB and was already in C99 or even before. > Now C23 will better clarify that reading such a pointer value (not > ever dereferencing) is Undefined Behavior. We did not change this for C23. Only the terminology regarding trap representation (now: non-value representation) and indeterminate values (now: indeterminate representation) was revised. There are proposal to define bevahior for such pointers, but I think this would be a mistake. (although somehow I ended up being a co-author of this paper), The reason is that every use of such a pointer risks running into sublte issues related to pointer provenance. So in my opinion it would be very useful to warn about all uses of invalid pointers, because fixing this is much easier than understanding and fixing provenance issues which might arise from incorrect use of such pointers. Martin > There was a discussion in linux-man@ some years ago, which now I > realize it > didn't end up being applied (I thought we had applied a patch, but it > seems we > didn't). I'll check if we still need such a patch (and I guess we > do, since > we're having this conversation). > > Using the pointer is _wrong_. And by wrong, I mean that it's > Undefined Behavior. > I think that alone is enough to issue a warning. Especially, since > the compiler > already has that information; otherwise, it couldn't have warned > about line 67 > of my example program. I could understand if due to optimizations > the compiler > lost that information, so it couldn't warn, but in this case, there's > no excuse. > > The benefit for users? They'll realize that the code they wrote is > bad. Not even > suspicious, as some warnings warn about suspicious code. This case > is > uncontroversially wrong. That code has no valid reason to be written > that way, > under ISO C. > > Cheers, > > Alex > > > > > Dave > > -- > <http://www.alejandro-colomar.es/> > GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 8:12 ` Martin Uecker @ 2023-02-17 11:35 ` Alejandro Colomar 2023-02-17 13:34 ` Andreas Schwab 2023-02-17 13:48 ` Martin Uecker 0 siblings, 2 replies; 39+ messages in thread From: Alejandro Colomar @ 2023-02-17 11:35 UTC (permalink / raw) To: Martin Uecker, GCC, Jonathan Wakely Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Yann Droneaud [-- Attachment #1.1: Type: text/plain, Size: 2735 bytes --] Hi Martin, On 2/17/23 09:12, Martin Uecker wrote: > Am Freitag, dem 17.02.2023 um 02:04 +0100 schrieb Alejandro Colomar > > >> >> [...] >> >>> >>> I'm not convinced that it's useful to the end-user to warn about >>> the >>> "use of q itself" case. >> >> I didn't quote the standard because I couldn't find it. I was >> searching in C11, >> and it seems that it was only implicitly Undefined Behavior, without >> explicit >> spelling (the value of the pointer was indeterminate, according to >> C11). > > The value becomes indeterminate according to 6.2.4p2 in C11. > An indeterminate value may be a trap representation 3.19.2 > If such a trap representation is read (the pointer, not > just the pointed-to object), the behavior is undefined > according to 6.2.6.1p5. So it is explitely UB and was > already in C99 or even before. What if the comparison is performed as uintptr_t? You wouldn't have trap representations, would you? Or we could even go to memcmp(3) to compare as char, if paranoic enough :) > > >> Now C23 will better clarify that reading such a pointer value (not >> ever dereferencing) is Undefined Behavior. > > We did not change this for C23. C11: The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime. <https://port70.net/%7Ensz/c/c11/n1570.html#6.2.4p2> C2x (N3054 is the latest I know): If a pointer value is used in an evaluation after the object the pointer points to (or just past) reaches the end of its lifetime, the behavior is undefined. <https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3054.pdf#subsection.6.2.4> This new wording doesn't even allow one to use memcmp(3); just reading the pointer value, however you do it, is UB. > > Only the terminology regarding trap representation > (now: non-value representation) and indeterminate > values (now: indeterminate representation) was revised. > > > There are proposal to define bevahior for such > pointers, but I think this would be a mistake. > (although somehow I ended up being a co-author > of this paper), > > The reason is that every use of such a pointer > risks running into sublte issues related to pointer > provenance. > > So in my opinion it would be very useful to warn about > all uses of invalid pointers, because fixing this is > much easier than understanding and fixing provenance > issues which might arise from incorrect use of such > pointers. Agree; making this defined behavior doesn't seem a good idea. Cheers, Alex > > > Martin -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 11:35 ` Alejandro Colomar @ 2023-02-17 13:34 ` Andreas Schwab 2023-02-17 13:48 ` Martin Uecker 1 sibling, 0 replies; 39+ messages in thread From: Andreas Schwab @ 2023-02-17 13:34 UTC (permalink / raw) To: Alejandro Colomar via Gcc Cc: Martin Uecker, Jonathan Wakely, Alejandro Colomar, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Yann Droneaud On Feb 17 2023, Alejandro Colomar via Gcc wrote: > C2x (N3054 is the latest I know): > > If a pointer value is used in an evaluation after > the object the pointer points to (or just past) > reaches the end of its lifetime, > the behavior is undefined. > <https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3054.pdf#subsection.6.2.4> > > > This new wording doesn't even allow one to use memcmp(3); > just reading the pointer value, however you do it, is UB. But memcmp does not read the value, it reads the object representation, and reading the representation is never UB. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 11:35 ` Alejandro Colomar 2023-02-17 13:34 ` Andreas Schwab @ 2023-02-17 13:48 ` Martin Uecker 2023-02-23 19:23 ` Alex Colomar 1 sibling, 1 reply; 39+ messages in thread From: Martin Uecker @ 2023-02-17 13:48 UTC (permalink / raw) To: Alejandro Colomar, GCC, Jonathan Wakely Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James Am Freitag, dem 17.02.2023 um 12:35 +0100 schrieb Alejandro Colomar: > Hi Martin, > > On 2/17/23 09:12, Martin Uecker wrote: > > Am Freitag, dem 17.02.2023 um 02:04 +0100 schrieb Alejandro Colomar > > > > > > > > > > [...] > > > > > > > > > > > I'm not convinced that it's useful to the end-user to warn about > > > > the > > > > "use of q itself" case. > > > > > > I didn't quote the standard because I couldn't find it. I was > > > searching in C11, > > > and it seems that it was only implicitly Undefined Behavior, without > > > explicit > > > spelling (the value of the pointer was indeterminate, according to > > > C11). > > > > The value becomes indeterminate according to 6.2.4p2 in C11. > > An indeterminate value may be a trap representation 3.19.2 > > If such a trap representation is read (the pointer, not > > just the pointed-to object), the behavior is undefined > > according to 6.2.6.1p5. So it is explitely UB and was > > already in C99 or even before. > > What if the comparison is performed as uintptr_t? According to the standard, it is allowed to convert a valid pointer to uintptr_t and then compare it after the object is freed. Unfortunately, many compilers will miscompile such comparisons. This does not only affect C/C++ but also happens in "safe" Rust: https://godbolt.org/z/e53b5a53q But conversion to uintptr_t after the object is freed is undefined behavior. > You wouldn't have trap representations, would you? > Or we could even go to memcmp(3) to compare as char, > if paranoic enough :) We renamed "trap representations" to "non-value representation" in C2X because people tend to misunderstand this term. It does not necessarily imply a trap. It simply means that the object representation does not correspond to a valid value of the type. A pointer that is not NULL but also does not point to an object would have a trap representation (now: "non-value representation"). A boolean that stores a representation that does not correspond to true or false is also a non-value representation. In such cases already a load (lvalue conversion) of such a value causes undefined behavior. Existing (correct) compiler optimizations can break code which does comparisons of pointers to dead objects. So it is not really safe to use such pointers even if the architecture does not trap on load of such pointers. For pointers, there is also the subtle issue that the representation might be the same in memory as a new valid pointer to a new object, but it is still considered a trap representation due to provenance. > > > > > > > Now C23 will better clarify that reading such a pointer value (not > > > ever dereferencing) is Undefined Behavior. > > > > We did not change this for C23. > > C11: > > The value of a pointer becomes indeterminate when > the object it points to (or just past) > reaches the end of its lifetime. > <https://port70.net/%7Ensz/c/c11/n1570.html#6.2.4p2> > > C2x (N3054 is the latest I know): > > If a pointer value is used in an evaluation after > the object the pointer points to (or just past) > reaches the end of its lifetime, > the behavior is undefined. > <https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3054.pdf#subsection.6.2.4> > > > This new wording doesn't even allow one to use memcmp(3); > just reading the pointer value, however you do it, is UB. memcmp would not use the pointer value but work on the representation bytes and is still allowed. Martin > > > > Only the terminology regarding trap representation > > (now: non-value representation) and indeterminate > > values (now: indeterminate representation) was revised. > > > > > > There are proposal to define bevahior for such > > pointers, but I think this would be a mistake. > > (although somehow I ended up being a co-author > > of this paper), > > > > The reason is that every use of such a pointer > > risks running into sublte issues related to pointer > > provenance. > > > > So in my opinion it would be very useful to warn about > > all uses of invalid pointers, because fixing this is > > much easier than understanding and fixing provenance > > issues which might arise from incorrect use of such > > pointers. > > Agree; making this defined behavior doesn't seem a good idea. > > Cheers, > > Alex > > > > > > > Martin > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 13:48 ` Martin Uecker @ 2023-02-23 19:23 ` Alex Colomar 2023-02-23 19:57 ` Martin Uecker 0 siblings, 1 reply; 39+ messages in thread From: Alex Colomar @ 2023-02-23 19:23 UTC (permalink / raw) To: Martin Uecker, GCC, Jonathan Wakely Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Serge Hallyn [-- Attachment #1.1: Type: text/plain, Size: 728 bytes --] Hi Martin, On 2/17/23 14:48, Martin Uecker wrote: >> This new wording doesn't even allow one to use memcmp(3); >> just reading the pointer value, however you do it, is UB. > > memcmp would not use the pointer value but work > on the representation bytes and is still allowed. Hmm, interesting. It's rather unspecified behavior. Still unpredictable: (memcmp(&p, &p, sizeof(p) == 0) might evaluate to true or false randomly; the compiler may compile out the call to memcmp(3), since it knows it won't produce any observable behavior. <https://software.codidact.com/posts/287905> Cheers! Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-23 19:23 ` Alex Colomar @ 2023-02-23 19:57 ` Martin Uecker 2023-02-24 0:02 ` Alex Colomar 0 siblings, 1 reply; 39+ messages in thread From: Martin Uecker @ 2023-02-23 19:57 UTC (permalink / raw) To: Alex Colomar, GCC, Jonathan Wakely Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Serge Hallyn Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar: > Hi Martin, > > On 2/17/23 14:48, Martin Uecker wrote: > > > This new wording doesn't even allow one to use memcmp(3); > > > just reading the pointer value, however you do it, is UB. > > > > memcmp would not use the pointer value but work > > on the representation bytes and is still allowed. > > Hmm, interesting. It's rather unspecified behavior. Still > unpredictable: (memcmp(&p, &p, sizeof(p) == 0) might evaluate to true or > false randomly; the compiler may compile out the call to memcmp(3), > since it knows it won't produce any observable behavior. > > <https://software.codidact.com/posts/287905> No, I think several things get mixed up here. The representation of a pointer that becomes invalid does not change. So (0 === memcmp(&p, &p, sizeof(p)) always evaluates to true. Also in general, an unspecified value is simply unspecified but does not change anymore. Reading an uninitialized value of automatic storage whose address was not taken is undefined behavior, so everything is possible afterwards. An uninitialized variable whose address was taken has a representation which can represent an unspecified value or a no-value (trap) representation. Reading the representation itself is always ok and gives consistent results. Reading the variable can be undefined behavior iff it is a trap representation, otherwise you get the unspecified value which is stored there. At least this is my reading of the C standard. Compilers are not full conformant. Martin ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-23 19:57 ` Martin Uecker @ 2023-02-24 0:02 ` Alex Colomar 2023-02-24 1:21 ` Serge E. Hallyn 0 siblings, 1 reply; 39+ messages in thread From: Alex Colomar @ 2023-02-24 0:02 UTC (permalink / raw) To: Martin Uecker, GCC Cc: Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Serge Hallyn, Jonathan Wakely [-- Attachment #1.1: Type: text/plain, Size: 2380 bytes --] Hi Martin, On 2/23/23 20:57, Martin Uecker wrote: > Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar: >> Hi Martin, >> >> On 2/17/23 14:48, Martin Uecker wrote: >>>> This new wording doesn't even allow one to use memcmp(3); >>>> just reading the pointer value, however you do it, is UB. >>> >>> memcmp would not use the pointer value but work >>> on the representation bytes and is still allowed. >> >> Hmm, interesting. It's rather unspecified behavior. Still >> unpredictable: (memcmp(&p, &p, sizeof(p) == 0) might evaluate to true or >> false randomly; the compiler may compile out the call to memcmp(3), >> since it knows it won't produce any observable behavior. >> >> <https://software.codidact.com/posts/287905> > > No, I think several things get mixed up here. > > The representation of a pointer that becomes invalid > does not change. > > So (0 === memcmp(&p, &p, sizeof(p)) always > evaluates to true. > > Also in general, an unspecified value is simply unspecified > but does not change anymore. > > Reading an uninitialized value of automatic storage whose > address was not taken is undefined behavior, so everything > is possible afterwards. > > An uninitialized variable whose address was taken has a > representation which can represent an unspecified value > or a no-value (trap) representation. Reading the > representation itself is always ok and gives consistent > results. Reading the variable can be undefined behavior > iff it is a trap representation, otherwise you get > the unspecified value which is stored there. > > At least this is my reading of the C standard. Compilers > are not full conformant. Does all this imply that the following is well defined behavior (and shall print what one would expect)? free(p); (void) &p; // take the address // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? printf("%p\n", p); // we took previously its address, // so now it has to hold consistently // the previous value This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely convinced, but might be. Cheers, Alex > > > Martin > > > > > > > -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 0:02 ` Alex Colomar @ 2023-02-24 1:21 ` Serge E. Hallyn 2023-02-24 1:42 ` Alex Colomar 2023-02-24 8:36 ` Martin Uecker 0 siblings, 2 replies; 39+ messages in thread From: Serge E. Hallyn @ 2023-02-24 1:21 UTC (permalink / raw) To: Alex Colomar Cc: Martin Uecker, GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Serge Hallyn, Jonathan Wakely On Fri, Feb 24, 2023 at 01:02:54AM +0100, Alex Colomar wrote: > Hi Martin, > > On 2/23/23 20:57, Martin Uecker wrote: > > Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar: > > > Hi Martin, > > > > > > On 2/17/23 14:48, Martin Uecker wrote: > > > > > This new wording doesn't even allow one to use memcmp(3); > > > > > just reading the pointer value, however you do it, is UB. > > > > > > > > memcmp would not use the pointer value but work > > > > on the representation bytes and is still allowed. > > > > > > Hmm, interesting. It's rather unspecified behavior. Still > > > unpredictable: (memcmp(&p, &p, sizeof(p) == 0) might evaluate to true or > > > false randomly; the compiler may compile out the call to memcmp(3), > > > since it knows it won't produce any observable behavior. > > > > > > <https://software.codidact.com/posts/287905> > > > > No, I think several things get mixed up here. > > > > The representation of a pointer that becomes invalid > > does not change. > > > > So (0 === memcmp(&p, &p, sizeof(p)) always > > evaluates to true. > > > > Also in general, an unspecified value is simply unspecified > > but does not change anymore. Right. p is its own thing - n bytes on the stack containing some value. Once it comes into scope, it doesn't change on its own. And if I do free(p) or o = realloc(p), then the value of p itself - the n bytes on the stack - does not change. I realize C11 appears to have changed that. I fear that in doing so it actually risks increasing the confusion about pointers. IMO it's much easier to reason about o = realloc(p, X); (and more baroque constructions) when keeping in mind that o, p, and the object pointed to by either one are all different things. > > Reading an uninitialized value of automatic storage whose > > address was not taken is undefined behavior, so everything > > is possible afterwards. > > > > An uninitialized variable whose address was taken has a > > representation which can represent an unspecified value > > or a no-value (trap) representation. Reading the > > representation itself is always ok and gives consistent > > results. Reading the variable can be undefined behavior > > iff it is a trap representation, otherwise you get > > the unspecified value which is stored there. > > > > At least this is my reading of the C standard. Compilers > > are not full conformant. > > Does all this imply that the following is well defined behavior (and shall > print what one would expect)? > > free(p); > > (void) &p; // take the address > // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? > > printf("%p\n", p); // we took previously its address, > // so now it has to hold consistently > // the previous value > > > This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely > convinced, but might be. Again, p is just an n byte variable which happens to have (one hopes) pointed at a previously malloc'd address. And I'd argue that pre-C11, this was not confusing, and would not have felt weird to you. But I am most grateful to you for having brought this to my attention. I may not agree with it and not like it, but it's right there in the spec, so time for me to adjust :) -serge ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 1:21 ` Serge E. Hallyn @ 2023-02-24 1:42 ` Alex Colomar 2023-02-24 3:01 ` Peter Lafreniere ` (2 more replies) 2023-02-24 8:36 ` Martin Uecker 1 sibling, 3 replies; 39+ messages in thread From: Alex Colomar @ 2023-02-24 1:42 UTC (permalink / raw) To: Serge E. Hallyn, Martin Uecker Cc: GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely [-- Attachment #1.1: Type: text/plain, Size: 2800 bytes --] Hi Serge, Martin, On 2/24/23 02:21, Serge E. Hallyn wrote: >> Does all this imply that the following is well defined behavior (and shall >> print what one would expect)? >> >> free(p); >> >> (void) &p; // take the address >> // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? >> >> printf("%p\n", p); // we took previously its address, >> // so now it has to hold consistently >> // the previous value >> >> >> This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely >> convinced, but might be. > > Again, p is just an n byte variable which happens to have (one hopes) > pointed at a previously malloc'd address. > > And I'd argue that pre-C11, this was not confusing, and would not have > felt weird to you. > > But I am most grateful to you for having brought this to my attention. > I may not agree with it and not like it, but it's right there in the > spec, so time for me to adjust :) I'll try to show why this feels weird to me (even in C89): alx@dell7760:~/tmp$ cat pointers.c #include <stdio.h> #include <stdlib.h> int main(void) { char *p, *q; p = malloc(42); if (p == NULL) exit(1); q = realloc(p, 42); if (q == NULL) exit(1); (void) &p; // If we remove this, we get -Wuse-after-free printf("(%p == %p) = %i\n", p, q, (p == q)); } alx@dell7760:~/tmp$ cc -Wall -Wextra pointers.c -Wuse-after-free=3 alx@dell7760:~/tmp$ ./a.out (0x5642cd9022a0 == 0x5642cd9022a0) = 1 This pointers point to different objects (actually, one of them doesn't even point to an object anymore), so they can't compare equal, according to both: <http://port70.net/%7Ensz/c/c11/n1570.html#6.5.9p6> <http://port70.net/~nsz/c/c89/c89-draft.html#3.3.9> (I believe C89 already had the concept of lifetime well defined as it is now, so the object had finished it's lifetime after realloc(3)). How can we justify that true, if the pointer don't point to the same object? And how can we justify a hypothetical false (which compilers don't implement), if compilers will really just read the value? To implement this as well defined behavior, it could result in no other than false, and it would require heavy overhead for the compilers to detect that the seemingly-equal values are indeed different, don't you think? The easiest solution is for the standard to just declare this outlaw, IMO. Maybe it could do an exception for printing, that is, reading a pointer is not a problem in itself, a long as you don't compare it, but I'm not such an expert about this. Cheers, Alex > > -serge -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 1:42 ` Alex Colomar @ 2023-02-24 3:01 ` Peter Lafreniere 2023-02-24 8:52 ` Martin Uecker 2023-02-24 8:43 ` Martin Uecker 2023-02-24 16:10 ` Serge E. Hallyn 2 siblings, 1 reply; 39+ messages in thread From: Peter Lafreniere @ 2023-02-24 3:01 UTC (permalink / raw) To: Alex Colomar Cc: Serge E. Hallyn, Martin Uecker, GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely If I may add my thoughts here, On Thursday, February 23rd, 2023 at 20:42, Alex Colomar wrote: > I'll try to show why this feels weird to me (even in C89): > > > alx@dell7760:~/tmp$ cat pointers.c > #include <stdio.h> > > #include <stdlib.h> > > > > int > main(void) > { > char *p, *q; > > p = malloc(42); > if (p == NULL) > exit(1); > > q = realloc(p, 42); > if (q == NULL) > exit(1); > > (void) &p; // If we remove this, we get -Wuse-after-free > > printf("(%p == %p) = %i\n", p, q, (p == q)); > } > alx@dell7760:~/tmp$ cc -Wall -Wextra pointers.c -Wuse-after-free=3 > alx@dell7760:~/tmp$ ./a.out > (0x5642cd9022a0 == 0x5642cd9022a0) = 1 > > > This pointers point to different objects (actually, one of them doesn't > even point to an object anymore), so they can't compare equal, according > to both: In this case p and q both _do_ point to the same object, which is valid. As per [https://port70.net/~nsz/c/c11/n1570.html#3.15], if the pointers both refer to the same storage region *[1] then they, by definition, refer to the same object and so should compare as true. This equality is entirely up to the implementation of your libc's realloc and cannot be depended upon, so the warning is absolutely warranted *[2]. I don't believe that it's correct for gcc to do more based off of external implementation-defined behavior. *[1]: Which may or may not need to be sized. I'm not an expert at reading specs, and besides, in this case the pointer is of the same type and value. This point seems more related to aliasing rules than lifetimes. *[2]: There are, however, genuine cases where the reuse could be valid. Namely that if q is null, then p is guaranteed to still point to a valid allocation. Fortunately, I can't get gcc to trigger a false warning in this case. > > http://port70.net/~nsz/c/c11/n1570.html#6.5.9p6 > > > http://port70.net/~nsz/c/c89/c89-draft.html#3.3.9 Just to be transparent, I'm working exclusively off of the first link. Also, this is just a passing interpretation of the c11 spec. > > > (I believe C89 already had the concept of lifetime well defined as it is > now, so the object had finished it's lifetime after realloc(3)). > > How can we justify that true, if the pointer don't point to the same > object? And how can we justify a hypothetical false (which compilers > don't implement), if compilers will really just read the value? To > implement this as well defined behavior, it could result in no other > than false, and it would require heavy overhead for the compilers to > detect that the seemingly-equal values are indeed different, don't you > think? The easiest solution is for the standard to just declare this > outlaw, IMO. Again, this is an issue with defining an object. > Maybe it could do an exception for printing, that is, reading a pointer > is not a problem in itself, a long as you don't compare it, but I'm not > such an expert about this. One last thought: with the above strict interpretation of the c standard, it would become nigh on impossible to implement the malloc(3) family of functions in c themselves. I vote for the "shared storage" interpretation of the c11 standard that is actually implemented rather than this abstract liveness oriented interpretation. -- Cheers, Peter Lafreniere <n8pjl.ca> "Who malloc()'s for malloc()?" - Myself, 2023-02-24 ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 3:01 ` Peter Lafreniere @ 2023-02-24 8:52 ` Martin Uecker 0 siblings, 0 replies; 39+ messages in thread From: Martin Uecker @ 2023-02-24 8:52 UTC (permalink / raw) To: Peter Lafreniere, Alex Colomar Cc: Serge E. Hallyn, GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely Am Freitag, dem 24.02.2023 um 03:01 +0000 schrieb Peter Lafreniere: ... > > > Maybe it could do an exception for printing, that is, reading a pointer > > is not a problem in itself, a long as you don't compare it, but I'm not > > such an expert about this. > > One last thought: with the above strict interpretation of the c standard, > it would become nigh on impossible to implement the malloc(3) family of > functions in c themselves. I vote for the "shared storage" interpretation > of the c11 standard that is actually implemented rather than this abstract > liveness oriented interpretation. This is a bit of a misunderstanding about what "undefined behavior" means in ISO C. It simply means that ISO C does not specify the behavior. This does not mean it is illegal to do something which has undefined behavior. Instead, it means you can not rely on the ISO C standard for portable behavior. So if you implement "malloc" in C itself you will probably rely on "undefined behavior", but this is perfectly fine. The C standard specifies behavior of "malloc", but does not care how it is implemented. Martin ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 1:42 ` Alex Colomar 2023-02-24 3:01 ` Peter Lafreniere @ 2023-02-24 8:43 ` Martin Uecker 2023-02-24 16:10 ` Serge E. Hallyn 2 siblings, 0 replies; 39+ messages in thread From: Martin Uecker @ 2023-02-24 8:43 UTC (permalink / raw) To: Alex Colomar, Serge E. Hallyn Cc: GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely Am Freitag, dem 24.02.2023 um 02:42 +0100 schrieb Alex Colomar: > Hi Serge, Martin, > > On 2/24/23 02:21, Serge E. Hallyn wrote: > > > Does all this imply that the following is well defined behavior (and shall > > > print what one would expect)? > > > > > > free(p); > > > > > > (void) &p; // take the address > > > // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? > > > > > > printf("%p\n", p); // we took previously its address, > > > // so now it has to hold consistently > > > // the previous value > > > > > > > > > This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely > > > convinced, but might be. > > > > Again, p is just an n byte variable which happens to have (one hopes) > > pointed at a previously malloc'd address. > > > > And I'd argue that pre-C11, this was not confusing, and would not have > > felt weird to you. > > > > But I am most grateful to you for having brought this to my attention. > > I may not agree with it and not like it, but it's right there in the > > spec, so time for me to adjust :) > > I'll try to show why this feels weird to me (even in C89): > > > alx@dell7760:~/tmp$ cat pointers.c > #include <stdio.h> > #include <stdlib.h> > > > int > main(void) > { > char *p, *q; > > p = malloc(42); > if (p == NULL) > exit(1); > > q = realloc(p, 42); > if (q == NULL) > exit(1); > > (void) &p; // If we remove this, we get -Wuse-after-free > > printf("(%p == %p) = %i\n", p, q, (p == q)); > } > alx@dell7760:~/tmp$ cc -Wall -Wextra pointers.c -Wuse-after-free=3 > alx@dell7760:~/tmp$ ./a.out > (0x5642cd9022a0 == 0x5642cd9022a0) = 1 > No, you can't do the comparison or use the value of 'p' because 'p' is not a valid pointer. (The address taken makes no difference here, but it may confuse the compiler so that it does not warn.) > > This pointers point to different objects (actually, one of them doesn't > even point to an object anymore), so they can't compare equal, according > to both: > > <http://port70.net/%7Ensz/c/c11/n1570.html#6.5.9p6> > > <http://port70.net/~nsz/c/c89/c89-draft.html#3.3.9> > > (I believe C89 already had the concept of lifetime well defined as it is > now, so the object had finished it's lifetime after realloc(3)). > > How can we justify that true, if the pointer don't point to the same > object? And how can we justify a hypothetical false (which compilers > don't implement), if compilers will really just read the value? To > implement this as well defined behavior, it could result in no other > than false, and it would require heavy overhead for the compilers to > detect that the seemingly-equal values are indeed different, don't you > think? The easiest solution is for the standard to just declare this > outlaw, IMO. This is undefined behavior, so the comparison can return false or true or crash or whatever. Martin > > Maybe it could do an exception for printing, that is, reading a pointer > is not a problem in itself, a long as you don't compare it, but I'm not > such an expert about this. > > Cheers, > > Alex > > > > > -serge > > -- > <http://www.alejandro-colomar.es/> > GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 1:42 ` Alex Colomar 2023-02-24 3:01 ` Peter Lafreniere 2023-02-24 8:43 ` Martin Uecker @ 2023-02-24 16:10 ` Serge E. Hallyn 2 siblings, 0 replies; 39+ messages in thread From: Serge E. Hallyn @ 2023-02-24 16:10 UTC (permalink / raw) To: Alex Colomar Cc: Serge E. Hallyn, Martin Uecker, GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely On Fri, Feb 24, 2023 at 02:42:32AM +0100, Alex Colomar wrote: > Hi Serge, Martin, > > On 2/24/23 02:21, Serge E. Hallyn wrote: > > > Does all this imply that the following is well defined behavior (and shall > > > print what one would expect)? > > > > > > free(p); > > > > > > (void) &p; // take the address > > > // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? > > > > > > printf("%p\n", p); // we took previously its address, > > > // so now it has to hold consistently > > > // the previous value > > > > > > > > > This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely > > > convinced, but might be. > > > > Again, p is just an n byte variable which happens to have (one hopes) > > pointed at a previously malloc'd address. > > > > And I'd argue that pre-C11, this was not confusing, and would not have > > felt weird to you. > > > > But I am most grateful to you for having brought this to my attention. > > I may not agree with it and not like it, but it's right there in the > > spec, so time for me to adjust :) > > I'll try to show why this feels weird to me (even in C89): > > > alx@dell7760:~/tmp$ cat pointers.c > #include <stdio.h> > #include <stdlib.h> > > > int > main(void) > { > char *p, *q; > > p = malloc(42); > if (p == NULL) > exit(1); > > q = realloc(p, 42); > if (q == NULL) > exit(1); > > (void) &p; // If we remove this, we get -Wuse-after-free (which I would argue is a bug in the compiler) > printf("(%p == %p) = %i\n", p, q, (p == q)); > } > alx@dell7760:~/tmp$ cc -Wall -Wextra pointers.c -Wuse-after-free=3 > alx@dell7760:~/tmp$ ./a.out > (0x5642cd9022a0 == 0x5642cd9022a0) = 1 > > > This pointers point to different objects (actually, one of them doesn't even > point to an object anymore), so they can't compare equal, according to both: > > <http://port70.net/%7Ensz/c/c11/n1570.html#6.5.9p6> > > <http://port70.net/~nsz/c/c89/c89-draft.html#3.3.9> > > (I believe C89 already had the concept of lifetime well defined as it is > now, so the object had finished it's lifetime after realloc(3)). > > How can we justify that true, if the pointer don't point to the same object? Because what's pointed to does not matter. You are comparing the memory address p, not the contents of the memory address. By way of analogy, if I do mkdir -p /tmp/1/a ln -s /tmp/1 /tmp/2 rm -rf /tmp/1 then /tmp/2 is still a symlink. 'stat /tmp/2' still works and is well defined. And if I create a new /tmp/1, then /tmp/2 starts pointing to that. Yes, re-useing p like that is a very bad idea, in many cases :) > And how can we justify a hypothetical false (which compilers don't > implement), if compilers will really just read the value? To implement this > as well defined behavior, it could result in no other than false, and it > would require heavy overhead for the compilers to detect that the > seemingly-equal values are indeed different, don't you think? The easiest > solution is for the standard to just declare this outlaw, IMO. > > Maybe it could do an exception for printing, that is, reading a pointer is > not a problem in itself, a long as you don't compare it, but I'm not such an > expert about this. > > Cheers, > > Alex > > > > > -serge > > -- > <http://www.alejandro-colomar.es/> > GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 1:21 ` Serge E. Hallyn 2023-02-24 1:42 ` Alex Colomar @ 2023-02-24 8:36 ` Martin Uecker 2023-02-24 16:01 ` Serge E. Hallyn 1 sibling, 1 reply; 39+ messages in thread From: Martin Uecker @ 2023-02-24 8:36 UTC (permalink / raw) To: Serge E. Hallyn, Alex Colomar Cc: GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely Am Donnerstag, dem 23.02.2023 um 19:21 -0600 schrieb Serge E. Hallyn: > On Fri, Feb 24, 2023 at 01:02:54AM +0100, Alex Colomar wrote: > > Hi Martin, > > > > On 2/23/23 20:57, Martin Uecker wrote: > > > Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar: > > > > Hi Martin, > > > > > > > > On 2/17/23 14:48, Martin Uecker wrote: > > > > > > This new wording doesn't even allow one to use memcmp(3); > > > > > > just reading the pointer value, however you do it, is UB. > > > > > > > > > > memcmp would not use the pointer value but work > > > > > on the representation bytes and is still allowed. > > > > > > > > Hmm, interesting. It's rather unspecified behavior. Still > > > > unpredictable: (memcmp(&p, &p, sizeof(p) == 0) might evaluate to true or > > > > false randomly; the compiler may compile out the call to memcmp(3), > > > > since it knows it won't produce any observable behavior. > > > > > > > > <https://software.codidact.com/posts/287905> > > > > > > No, I think several things get mixed up here. > > > > > > The representation of a pointer that becomes invalid > > > does not change. > > > > > > So (0 === memcmp(&p, &p, sizeof(p)) always > > > evaluates to true. > > > > > > Also in general, an unspecified value is simply unspecified > > > but does not change anymore. > > Right. p is its own thing - n bytes on the stack containing some value. > Once it comes into scope, it doesn't change on its own. And if I do > free(p) or o = realloc(p), then the value of p itself - the n bytes on > the stack - does not change. Yes, but one comment about terminology:. The C standard differentiates between the representation, i.e. the bytes on the stack, and the value. The representation is converted to a value during lvalue conversion. For an invalid pointer the representation is indeterminate because it now does not point to a valid object anymore. So it is not possible to convert the representation to a value during lvalue conversion. In other words, it does not make sense to speak of the value of the pointer anymore. > I realize C11 appears to have changed that. I fear that in doing so it > actually risks increasing the confusion about pointers. IMO it's much > easier to reason about > > o = realloc(p, X); > > (and more baroque constructions) when keeping in mind that o, p, and the > object pointed to by either one are all different things. > What did change in C11? As far as I know, the pointer model did not change in C11. > > > Reading an uninitialized value of automatic storage whose > > > address was not taken is undefined behavior, so everything > > > is possible afterwards. > > > > > > An uninitialized variable whose address was taken has a > > > representation which can represent an unspecified value > > > or a no-value (trap) representation. Reading the > > > representation itself is always ok and gives consistent > > > results. Reading the variable can be undefined behavior > > > iff it is a trap representation, otherwise you get > > > the unspecified value which is stored there. > > > > > > At least this is my reading of the C standard. Compilers > > > are not full conformant. > > > > Does all this imply that the following is well defined behavior (and shall > > print what one would expect)? > > > > free(p); > > > > (void) &p; // take the address > > // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? > > > > printf("%p\n", p); // we took previously its address, > > // so now it has to hold consistently > > // the previous value > > > > No, the printf is not well defined, because the lvalue conversion of the pointer with indeterminate representation may lead to undefined behavior. Martin > > This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely > > convinced, but might be. > > Again, p is just an n byte variable which happens to have (one hopes) > pointed at a previously malloc'd address. > > And I'd argue that pre-C11, this was not confusing, and would not have > felt weird to you. > > But I am most grateful to you for having brought this to my attention. > I may not agree with it and not like it, but it's right there in the > spec, so time for me to adjust :) > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 8:36 ` Martin Uecker @ 2023-02-24 16:01 ` Serge E. Hallyn 2023-02-24 16:37 ` Martin Uecker 0 siblings, 1 reply; 39+ messages in thread From: Serge E. Hallyn @ 2023-02-24 16:01 UTC (permalink / raw) To: Martin Uecker Cc: Serge E. Hallyn, Alex Colomar, GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely On Fri, Feb 24, 2023 at 09:36:45AM +0100, Martin Uecker wrote: > Am Donnerstag, dem 23.02.2023 um 19:21 -0600 schrieb Serge E. Hallyn: > > On Fri, Feb 24, 2023 at 01:02:54AM +0100, Alex Colomar wrote: > > > Hi Martin, > > > > > > On 2/23/23 20:57, Martin Uecker wrote: > > > > Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar: > > > > > Hi Martin, > > > > > > > > > > On 2/17/23 14:48, Martin Uecker wrote: > > > > > > > This new wording doesn't even allow one to use memcmp(3); > > > > > > > just reading the pointer value, however you do it, is UB. > > > > > > > > > > > > memcmp would not use the pointer value but work > > > > > > on the representation bytes and is still allowed. > > > > > > > > > > Hmm, interesting. It's rather unspecified behavior. Still > > > > > unpredictable: (memcmp(&p, &p, sizeof(p) == 0) might evaluate to true or > > > > > false randomly; the compiler may compile out the call to memcmp(3), > > > > > since it knows it won't produce any observable behavior. > > > > > > > > > > <https://software.codidact.com/posts/287905> > > > > > > > > No, I think several things get mixed up here. > > > > > > > > The representation of a pointer that becomes invalid > > > > does not change. > > > > > > > > So (0 === memcmp(&p, &p, sizeof(p)) always > > > > evaluates to true. > > > > > > > > Also in general, an unspecified value is simply unspecified > > > > but does not change anymore. > > > > Right. p is its own thing - n bytes on the stack containing some value. > > Once it comes into scope, it doesn't change on its own. And if I do > > free(p) or o = realloc(p), then the value of p itself - the n bytes on > > the stack - does not change. > > Yes, but one comment about terminology:. The C standard > differentiates between the representation, i.e. the bytes on > the stack, and the value. The representation is converted to > a value during lvalue conversion. For an invalid pointer > the representation is indeterminate because it now does not > point to a valid object anymore. So it is not possible to > convert the representation to a value during lvalue conversion. > In other words, it does not make sense to speak of the value > of the pointer anymore. I'm sure there are, especially from an implementer's point of view, great reasons for this. However, as just a user, the "value" of 'void *p' should absolutely not be tied to whatever is at that address. I'm given a simple linear memory space, under which sits an entirely different view obfuscated by page tables, but that doesn't concern me. if I say void *p = -1, then if I print p, then I expect to see that value. Since I'm complaining about standards I'm picking and choosing here, but I'll still point at the printf(3) manpage :) : p The void * pointer argument is printed in hexadecimal (as if by %#x or %#lx). > > I realize C11 appears to have changed that. I fear that in doing so it > > actually risks increasing the confusion about pointers. IMO it's much > > easier to reason about > > > > o = realloc(p, X); > > > > (and more baroque constructions) when keeping in mind that o, p, and the > > object pointed to by either one are all different things. > > > > What did change in C11? As far as I know, the pointer model > did not change in C11. I haven't looked in more detail, and don't really plan to, but my understanding is that the text of: The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime. (especially the last sentence) was new. Maybe the words "value of a pointer" don't mean what I think they mean. But that's the phrase to which I object. The n bytes on the stack, p, are not changed just because something happened with the accounting for the memory at the address represented by that value. If they do, then that's not 'C' any more. > > > > Reading an uninitialized value of automatic storage whose > > > > address was not taken is undefined behavior, so everything > > > > is possible afterwards. > > > > > > > > An uninitialized variable whose address was taken has a > > > > representation which can represent an unspecified value > > > > or a no-value (trap) representation. Reading the > > > > representation itself is always ok and gives consistent > > > > results. Reading the variable can be undefined behavior > > > > iff it is a trap representation, otherwise you get > > > > the unspecified value which is stored there. > > > > > > > > At least this is my reading of the C standard. Compilers > > > > are not full conformant. > > > > > > Does all this imply that the following is well defined behavior (and shall > > > print what one would expect)? > > > > > > free(p); > > > > > > (void) &p; // take the address > > > // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? > > > > > > printf("%p\n", p); // we took previously its address, > > > // so now it has to hold consistently > > > // the previous value > > > > > > > > No, the printf is not well defined, because the lvalue conversion > of the pointer with indeterminate representation may lead to Sorry, my eyes glaze over at 'lvalue conversion of the pointer' - is that referring to what's at the target address? Because to print the hex value of p, it doesn't matter what's at *p - even if that memory is no longer mapped, it does not affect the %x of p. BTW I appreciate all the insight. I'm arguing my case, but I'm quite certain I'll walk away finally understanding why I'm wrong. > undefined behavior. > > > Martin > > > > > This feels weird. And a bit of a Schroedinger's pointer. I'm not entirely > > > convinced, but might be. > > > > Again, p is just an n byte variable which happens to have (one hopes) > > pointed at a previously malloc'd address. > > > > And I'd argue that pre-C11, this was not confusing, and would not have > > felt weird to you. > > > > But I am most grateful to you for having brought this to my attention. > > I may not agree with it and not like it, but it's right there in the > > spec, so time for me to adjust :) > > > > > > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-24 16:01 ` Serge E. Hallyn @ 2023-02-24 16:37 ` Martin Uecker 0 siblings, 0 replies; 39+ messages in thread From: Martin Uecker @ 2023-02-24 16:37 UTC (permalink / raw) To: Serge E. Hallyn Cc: Alex Colomar, GCC, Iker Pedrosa, Florian Weimer, Paul Eggert, Michael Kerrisk, Jₑₙₛ Gustedt, David Malcolm, Sam James, Jonathan Wakely Am Freitag, dem 24.02.2023 um 10:01 -0600 schrieb Serge E. Hallyn: > On Fri, Feb 24, 2023 at 09:36:45AM +0100, Martin Uecker wrote: > > Am Donnerstag, dem 23.02.2023 um 19:21 -0600 schrieb Serge E. Hallyn: ... > > > > Yes, but one comment about terminology:. The C standard > > differentiates between the representation, i.e. the bytes on > > the stack, and the value. The representation is converted to > > a value during lvalue conversion. For an invalid pointer > > the representation is indeterminate because it now does not > > point to a valid object anymore. So it is not possible to > > convert the representation to a value during lvalue conversion. > > In other words, it does not make sense to speak of the value > > of the pointer anymore. > > I'm sure there are, especially from an implementer's point of view, > great reasons for this. > > However, as just a user, the "value" of 'void *p' should absolutely > not be tied to whatever is at that address. Think about it in this way: The set of possible values for a pointer is the set of objects that exist at a point in time. If one object disappears, a pointer can not point to it anymore. So it is not that the pointer changes, but the set of valid values. > I'm given a simple > linear memory space, under which sits an entirely different view > obfuscated by page tables, but that doesn't concern me. if I say > void *p = -1, then if I print p, then I expect to see that value. If you store an integer into a pointer (you need a cast), then this is implementation-defined and may also produce an invalid pointer. > > Since I'm complaining about standards I'm picking and choosing here, > but I'll still point at the printf(3) manpage :) : > > p The void * pointer argument is printed in hexadecimal (as if by %#x > or %#lx). This is valid if the pointer is valid, but if the pointer is invalid, this is undefined behavior. In C one not think about pointers as addresses. They are abstract handles that point to objects, and compilers do exploit this for optimization. If you need an address, you can cast it to uintptr_t (but see below). > > > > I realize C11 appears to have changed that. I fear that in doing so it > > > actually risks increasing the confusion about pointers. IMO it's much > > > easier to reason about > > > > > > o = realloc(p, X); > > > > > > (and more baroque constructions) when keeping in mind that o, p, and the > > > object pointed to by either one are all different things. > > > > > > > What did change in C11? As far as I know, the pointer model > > did not change in C11. > > I haven't looked in more detail, and don't really plan to, but my > understanding is that the text of: > > The lifetime of an object is the portion of program execution during which storage is > guaranteed to be reserved for it. An object exists, has a constant address, and retains > its last-stored value throughout its lifetime. If an object is referred to outside of its > lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when > the object it points to (or just past) reaches the end of its lifetime. > > (especially the last sentence) was new. This is not new. C99 "The value of a pointer becomes indeterminate when the object it points to reaches the end of its lifetime." C90: "The value of a pointer that referred to an object with automatic storage duration that is no longer guaranteed to be reserved is indeterminate." and "The value of a pointer that refers to freed space is indeterminate." > Maybe the words "value of a pointer" don't mean what I think they > mean. But that's the phrase to which I object. The n bytes on > the stack, p, are not changed just because something happened with > the accounting for the memory at the address represented by that > value. If they do, then that's not 'C' any more. It is not about the bytes of the pointer changing. But if the object is freed they do not represent a valid pointer anymore. There were CPUs that trapped when an invalid address is loaded, e.g. because the data segment for the object was removed from the segment tables. So this is a rule in portable 'C' for more than 30 years. Nowadays compilers exploit the knowledge that the object is freed. So you can not reliably use such a pointer. If you do this, your code will be broken on most modern compilers. > > > > > > Reading an uninitialized value of automatic storage whose > > > > > address was not taken is undefined behavior, so everything > > > > > is possible afterwards. > > > > > > > > > > An uninitialized variable whose address was taken has a > > > > > representation which can represent an unspecified value > > > > > or a no-value (trap) representation. Reading the > > > > > representation itself is always ok and gives consistent > > > > > results. Reading the variable can be undefined behavior > > > > > iff it is a trap representation, otherwise you get > > > > > the unspecified value which is stored there. > > > > > > > > > > At least this is my reading of the C standard. Compilers > > > > > are not full conformant. > > > > > > > > Does all this imply that the following is well defined behavior (and shall > > > > print what one would expect)? > > > > > > > > free(p); > > > > > > > > (void) &p; // take the address > > > > // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ? > > > > > > > > printf("%p\n", p); // we took previously its address, > > > > // so now it has to hold consistently > > > > // the previous value > > > > > > > > > > > > No, the printf is not well defined, because the lvalue conversion > > of the pointer with indeterminate representation may lead to > > Sorry, my eyes glaze over at 'lvalue conversion of the pointer' - is > that referring to what's at the target address? I meant the load of the pointer itself, not the load of the object the pointer points to. > Because to print the > hex value of p, it doesn't matter what's at *p - even if that memory > is no longer mapped, it does not affect the %x of p. > > BTW I appreciate all the insight. I'm arguing my case, but I'm quite > certain I'll walk away finally understanding why I'm wrong. I hope my answers help. We thought about these questions a lot when working on the proposed memory model for C (TS 6010) and there is not much room for changing the rules. We could make it well defined to pass around such invalid pointers (and allow printing them). But because one can not reliably use them for anything, this would do more harm than good. In the proposed TS 6010 clarified that casting to uintptr_t gives you an address that behaves as you would expect. But modern compilers do not yet conform to this. But I assume this will get fixed at some point. Martin ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-16 15:15 ` David Malcolm 2023-02-17 1:04 ` Alejandro Colomar @ 2023-02-17 3:48 ` Siddhesh Poyarekar 2023-02-17 11:22 ` Alejandro Colomar ` (2 more replies) 1 sibling, 3 replies; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 3:48 UTC (permalink / raw) To: David Malcolm, Alejandro Colomar, GCC; +Cc: Iker Pedrosa On 2023-02-16 10:15, David Malcolm via Gcc wrote: > I'm not convinced that it's useful to the end-user to warn about the > "use of q itself" case. FWIW, -Wuse-after-free=3 already should do this: At level 3, the warning also diagnoses uses of indeterminate pointers in equality expressions. All uses of indeterminate pointers are undefined but equality tests sometimes appear after calls to "realloc" as an attempt to determine whether the call resulted in relocating the object to a different address. They are diagnosed at a separate level to aid legacy code gradually transition to safe alternatives. For example, the equality test in the function below is diagnosed at this level: Jakub and I had discussed this in the context of _FORTIFY_SOURCE=3 (which is anal about this and can break things) and we got pr#105217, but that is also a best-effort thing, not really a guarantee. IMO the analyzer should go that extra mile and warn for the use of q itself and maybe deprecate -Wuse-after-free=3 in its favour. Sid ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 3:48 ` Siddhesh Poyarekar @ 2023-02-17 11:22 ` Alejandro Colomar 2023-02-17 13:38 ` Siddhesh Poyarekar 2023-02-17 11:24 ` Missed warning (-Wuse-after-free) Jonathan Wakely 2023-02-17 13:44 ` David Malcolm 2 siblings, 1 reply; 39+ messages in thread From: Alejandro Colomar @ 2023-02-17 11:22 UTC (permalink / raw) To: Siddhesh Poyarekar, GCC Cc: Iker Pedrosa, David Malcolm, Florian Weimer, Sam James, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt, Yann Droneaud [-- Attachment #1.1: Type: text/plain, Size: 1486 bytes --] Hi Siddhesh, On 2/17/23 04:48, Siddhesh Poyarekar wrote: > On 2023-02-16 10:15, David Malcolm via Gcc wrote: >> I'm not convinced that it's useful to the end-user to warn about the >> "use of q itself" case. > > FWIW, -Wuse-after-free=3 already should do this: Thanks! It works. I would have expected such a warning to be included in -Wextra. Does it have any false positives (or maybe too many false negatives?) that make it unsuitable for -Wextra? > > At level 3, the warning also diagnoses uses of indeterminate pointers in > equality expressions. All uses of indeterminate pointers are undefined > but equality tests sometimes appear after > calls to "realloc" as an attempt to determine whether the call resulted > in relocating the object to a different address. They are diagnosed at > a separate level to aid legacy code gradually > transition to safe alternatives. For example, the equality test in the > function below is diagnosed at this level: > > Jakub and I had discussed this in the context of _FORTIFY_SOURCE=3 > (which is anal about this and can break things) and we got pr#105217, > but that is also a best-effort thing, not really a guarantee. > > IMO the analyzer should go that extra mile and warn for the use of q > itself and maybe deprecate -Wuse-after-free=3 in its favour. > > Sid Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 11:22 ` Alejandro Colomar @ 2023-02-17 13:38 ` Siddhesh Poyarekar 2023-02-17 14:01 ` Mark Wielaard 2023-02-17 21:20 ` [PATCH] Make -Wuse-after-free=3 the default one in -Wall Alejandro Colomar 0 siblings, 2 replies; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 13:38 UTC (permalink / raw) To: Alejandro Colomar, GCC Cc: Iker Pedrosa, David Malcolm, Florian Weimer, Sam James, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt, Yann Droneaud On 2023-02-17 06:22, Alejandro Colomar wrote: > Hi Siddhesh, > > On 2/17/23 04:48, Siddhesh Poyarekar wrote: >> On 2023-02-16 10:15, David Malcolm via Gcc wrote: >>> I'm not convinced that it's useful to the end-user to warn about the >>> "use of q itself" case. >> >> FWIW, -Wuse-after-free=3 already should do this: > > Thanks! It works. I would have expected such a warning to be included > in -Wextra. Does it have any false positives (or maybe too many false > negatives?) that make it unsuitable for -Wextra? I don't know why it isn't enabled in -Wextra, it seems like the right thing to do. Sid ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 13:38 ` Siddhesh Poyarekar @ 2023-02-17 14:01 ` Mark Wielaard 2023-02-17 14:06 ` Siddhesh Poyarekar 2023-02-17 21:20 ` [PATCH] Make -Wuse-after-free=3 the default one in -Wall Alejandro Colomar 1 sibling, 1 reply; 39+ messages in thread From: Mark Wielaard @ 2023-02-17 14:01 UTC (permalink / raw) To: Siddhesh Poyarekar, Alejandro Colomar, GCC Cc: Iker Pedrosa, David Malcolm, Florian Weimer, Sam James, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt, Yann Droneaud On Fri, 2023-02-17 at 08:38 -0500, Siddhesh Poyarekar wrote: > On 2023-02-17 06:22, Alejandro Colomar wrote: > > Hi Siddhesh, > > > > On 2/17/23 04:48, Siddhesh Poyarekar wrote: > > > On 2023-02-16 10:15, David Malcolm via Gcc wrote: > > > > I'm not convinced that it's useful to the end-user to warn about the > > > > "use of q itself" case. > > > > > > FWIW, -Wuse-after-free=3 already should do this: > > > > Thanks! It works. I would have expected such a warning to be included > > in -Wextra. Does it have any false positives (or maybe too many false > > negatives?) that make it unsuitable for -Wextra? > > I don't know why it isn't enabled in -Wextra, it seems like the right > thing to do. Interesting warning flag I didn't know about. It seems to have found an issue in some code trying to free a circular single linked list: https://inbox.sourceware.org/elfutils-devel/20230217140027.125332-1-mark@klomp.org/ The reason people might not know about it, is that the documentation is somewhat unclear. -Wall says it already includes -Wuse-after-free=3: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wall But the -Wuse-after-free option itself says -Wall only enables -Wuse- after-free=2: https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wuse-after-free Also note that it is listed under "Options Controlling C++ Dialect" so it wasn't immediately clear to me that it also works for C. Cheers, Mark ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 14:01 ` Mark Wielaard @ 2023-02-17 14:06 ` Siddhesh Poyarekar 0 siblings, 0 replies; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 14:06 UTC (permalink / raw) To: Mark Wielaard, Alejandro Colomar, GCC Cc: Iker Pedrosa, David Malcolm, Florian Weimer, Sam James, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt, Yann Droneaud On 2023-02-17 09:01, Mark Wielaard wrote: > The reason people might not know about it, is that the documentation is > somewhat unclear. -Wall says it already includes -Wuse-after-free=3: > https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wall Yeah I posted a patch to fix it only a few minutes ago: https://sourceware.org/pipermail/gcc-patches/2023-February/612229.html Sid ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH] Make -Wuse-after-free=3 the default one in -Wall 2023-02-17 13:38 ` Siddhesh Poyarekar 2023-02-17 14:01 ` Mark Wielaard @ 2023-02-17 21:20 ` Alejandro Colomar 2023-02-17 21:39 ` Siddhesh Poyarekar 1 sibling, 1 reply; 39+ messages in thread From: Alejandro Colomar @ 2023-02-17 21:20 UTC (permalink / raw) To: gcc Cc: Alejandro Colomar, Andreas Schwab, David Malcolm, Florian Weimer, Iker Pedrosa, Jens Gustedt, Jonathan Wakely, Mark Wielaard, Martin Uecker, Michael Kerrisk, Paul Eggert, Sam James, Siddhesh Poyarekar, Yann Droneaud Link: <https://inbox.sourceware.org/gcc/3098fd18-9dbf-b4e9-bae5-62ec6fea74cd@opteya.com/T/> Link: <https://github.com/shadow-maint/shadow/pull/649#discussion_r1108350066> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: David Malcolm <dmalcolm@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Iker Pedrosa <ipedrosa@redhat.com> Cc: Jens Gustedt <jens.gustedt@inria.fr> Cc: Jonathan Wakely <jwakely.gcc@gmail.com> Cc: Mark Wielaard <mark@klomp.org> Cc: Martin Uecker <uecker@tugraz.at> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Paul Eggert <eggert@cs.ucla.edu> Cc: Sam James <sam@gentoo.org> Cc: Siddhesh Poyarekar <siddhesh@gotplt.org> Cc: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Alejandro Colomar <alx@kernel.org> --- Hi Siddhesh, Here's a patch for it. It is untested yet. Please have a look at it. I'm not used to GCC customs, so corrections are welcome :) Cheers, Alex gcc/c-family/c.opt | 4 ++-- gcc/doc/invoke.texi | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index c0fea56a8f5..1a3fc2c5d74 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -1411,11 +1411,11 @@ C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_unused_const_variable) Warn when a const variable is unused. ; Defining this option here in addition to common.opt is necessary -; in order for the default -Wall setting of -Wuse-after-free=2 to take +; in order for the default -Wall setting of -Wuse-after-free=3 to take ; effect. Wuse-after-free= -LangEnabledBy(C ObjC C++ LTO ObjC++, Wall,2,0) +LangEnabledBy(C ObjC C++ LTO ObjC++, Wall,3,0) ; in common.opt Wvariadic-macros diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 7b308cd3c31..d910052ce0c 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -4720,7 +4720,7 @@ instead of pointers. This approach obviates needing to adjust the stored pointers after reallocation. @end table -@option{-Wuse-after-free=2} is included in @option{-Wall}. +@option{-Wuse-after-free=3} is included in @option{-Wall}. @item -Wuseless-cast @r{(C++ and Objective-C++ only)} @opindex Wuseless-cast -- 2.39.1 ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] Make -Wuse-after-free=3 the default one in -Wall 2023-02-17 21:20 ` [PATCH] Make -Wuse-after-free=3 the default one in -Wall Alejandro Colomar @ 2023-02-17 21:39 ` Siddhesh Poyarekar 2023-02-17 21:41 ` Siddhesh Poyarekar 2023-02-17 22:58 ` Alejandro Colomar 0 siblings, 2 replies; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 21:39 UTC (permalink / raw) To: Alejandro Colomar, gcc Cc: Alejandro Colomar, Andreas Schwab, David Malcolm, Florian Weimer, Iker Pedrosa, Jens Gustedt, Jonathan Wakely, Mark Wielaard, Martin Uecker, Michael Kerrisk, Paul Eggert, Sam James, Yann Droneaud On 2023-02-17 16:20, Alejandro Colomar wrote: > Link: <https://inbox.sourceware.org/gcc/3098fd18-9dbf-b4e9-bae5-62ec6fea74cd@opteya.com/T/> > Link: <https://github.com/shadow-maint/shadow/pull/649#discussion_r1108350066> > Cc: Andreas Schwab <schwab@linux-m68k.org> > Cc: David Malcolm <dmalcolm@redhat.com> > Cc: Florian Weimer <fweimer@redhat.com> > Cc: Iker Pedrosa <ipedrosa@redhat.com> > Cc: Jens Gustedt <jens.gustedt@inria.fr> > Cc: Jonathan Wakely <jwakely.gcc@gmail.com> > Cc: Mark Wielaard <mark@klomp.org> > Cc: Martin Uecker <uecker@tugraz.at> > Cc: Michael Kerrisk <mtk.manpages@gmail.com> > Cc: Paul Eggert <eggert@cs.ucla.edu> > Cc: Sam James <sam@gentoo.org> > Cc: Siddhesh Poyarekar <siddhesh@gotplt.org> > Cc: Yann Droneaud <ydroneaud@opteya.com> > Signed-off-by: Alejandro Colomar <alx@kernel.org> > --- > > Hi Siddhesh, > > Here's a patch for it. It is untested yet. Please have a look at it. > I'm not used to GCC customs, so corrections are welcome :) > > Cheers, > > Alex You've got the customs right as far as submission is concerned; gcc accepts patches under DCO. I'm not a maintainer though, so I can't approve the change, I can only comment on it in the hope of influencing maintainers' opinions. In any case it's probably suitable as a proposal for gcc 14, given that 13 is in stage 4, regression fixes only. I'm split about where -Wuse-after-free=3 should be enabled. On the one hand, I'd like it to go into -Wall and alongside _FORTIFY_SOURCE=3, given that the latter already breaks the incorrect provenance assumptions in such code patterns. However on the other hand, it may lead to annoyed developers, even though the usage is, strictly speaking, UB. I don't know about the false positive rate of -Wuse-after-free=3 either (specifically in the context of UB-ness of the code it warns about), maybe someone else may be able to chime in on that. Maybe a good compromise here is -Wextra, but if there's consensus developing towards adding it to -Wall, I'll happily jump to that side. Thanks, Sid > > > gcc/c-family/c.opt | 4 ++-- > gcc/doc/invoke.texi | 2 +- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index c0fea56a8f5..1a3fc2c5d74 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -1411,11 +1411,11 @@ C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_unused_const_variable) > Warn when a const variable is unused. > > ; Defining this option here in addition to common.opt is necessary > -; in order for the default -Wall setting of -Wuse-after-free=2 to take > +; in order for the default -Wall setting of -Wuse-after-free=3 to take > ; effect. > > Wuse-after-free= > -LangEnabledBy(C ObjC C++ LTO ObjC++, Wall,2,0) > +LangEnabledBy(C ObjC C++ LTO ObjC++, Wall,3,0) > ; in common.opt > > Wvariadic-macros > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 7b308cd3c31..d910052ce0c 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -4720,7 +4720,7 @@ instead of pointers. This approach obviates needing to adjust the stored > pointers after reallocation. > @end table > > -@option{-Wuse-after-free=2} is included in @option{-Wall}. > +@option{-Wuse-after-free=3} is included in @option{-Wall}. > > @item -Wuseless-cast @r{(C++ and Objective-C++ only)} > @opindex Wuseless-cast ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] Make -Wuse-after-free=3 the default one in -Wall 2023-02-17 21:39 ` Siddhesh Poyarekar @ 2023-02-17 21:41 ` Siddhesh Poyarekar 2023-02-17 22:58 ` Alejandro Colomar 1 sibling, 0 replies; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 21:41 UTC (permalink / raw) To: Alejandro Colomar, gcc Cc: Alejandro Colomar, Andreas Schwab, David Malcolm, Florian Weimer, Iker Pedrosa, Jens Gustedt, Jonathan Wakely, Mark Wielaard, Martin Uecker, Michael Kerrisk, Paul Eggert, Sam James, Yann Droneaud On 2023-02-17 16:39, Siddhesh Poyarekar wrote: > You've got the customs right as far as submission is concerned; gcc Oh, one correction: patches typically go to gcc-patches at gcc dot gnu dot org. Thanks, Sid ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] Make -Wuse-after-free=3 the default one in -Wall 2023-02-17 21:39 ` Siddhesh Poyarekar 2023-02-17 21:41 ` Siddhesh Poyarekar @ 2023-02-17 22:58 ` Alejandro Colomar 2023-02-17 23:03 ` Siddhesh Poyarekar 1 sibling, 1 reply; 39+ messages in thread From: Alejandro Colomar @ 2023-02-17 22:58 UTC (permalink / raw) To: Siddhesh Poyarekar, gcc Cc: Alejandro Colomar, Andreas Schwab, David Malcolm, Florian Weimer, Iker Pedrosa, Jens Gustedt, Jonathan Wakely, Mark Wielaard, Martin Uecker, Michael Kerrisk, Paul Eggert, Sam James, Yann Droneaud [-- Attachment #1.1: Type: text/plain, Size: 3046 bytes --] Hi Siddhesh, On 2/17/23 22:39, Siddhesh Poyarekar wrote: > On 2023-02-17 16:20, Alejandro Colomar wrote: >> Link: <https://inbox.sourceware.org/gcc/3098fd18-9dbf-b4e9-bae5-62ec6fea74cd@opteya.com/T/> >> Link: <https://github.com/shadow-maint/shadow/pull/649#discussion_r1108350066> >> Cc: Andreas Schwab <schwab@linux-m68k.org> >> Cc: David Malcolm <dmalcolm@redhat.com> >> Cc: Florian Weimer <fweimer@redhat.com> >> Cc: Iker Pedrosa <ipedrosa@redhat.com> >> Cc: Jens Gustedt <jens.gustedt@inria.fr> >> Cc: Jonathan Wakely <jwakely.gcc@gmail.com> >> Cc: Mark Wielaard <mark@klomp.org> >> Cc: Martin Uecker <uecker@tugraz.at> >> Cc: Michael Kerrisk <mtk.manpages@gmail.com> >> Cc: Paul Eggert <eggert@cs.ucla.edu> >> Cc: Sam James <sam@gentoo.org> >> Cc: Siddhesh Poyarekar <siddhesh@gotplt.org> >> Cc: Yann Droneaud <ydroneaud@opteya.com> >> Signed-off-by: Alejandro Colomar <alx@kernel.org> >> --- >> >> Hi Siddhesh, >> >> Here's a patch for it. It is untested yet. Please have a look at it. >> I'm not used to GCC customs, so corrections are welcome :) >> >> Cheers, >> >> Alex > > You've got the customs right as far as submission is concerned; gcc > accepts patches under DCO. I'm not a maintainer though, so I can't > approve the change, I can only comment on it in the hope of influencing > maintainers' opinions. :) > In any case it's probably suitable as a proposal > for gcc 14, given that 13 is in stage 4, regression fixes only. Sure, 14 is good. > > I'm split about where -Wuse-after-free=3 should be enabled. On the one > hand, I'd like it to go into -Wall and alongside _FORTIFY_SOURCE=3, > given that the latter already breaks the incorrect provenance > assumptions in such code patterns. However on the other hand, it may > lead to annoyed developers, even though the usage is, strictly speaking, > UB. I don't know about the false positive rate of -Wuse-after-free=3 > either (specifically in the context of UB-ness of the code it warns > about), maybe someone else may be able to chime in on that. > > Maybe a good compromise here is -Wextra, but if there's consensus > developing towards adding it to -Wall, I'll happily jump to that side. Since -Wall already had -Wuaf=2, and to not overcomplicate it, I put it in -Wall too. Anyway, I don't expect it to have many false positives, but maybe someone else can chime in. Both are fine, IMO. I use -Wall -Wextra always together, so I wouldn't even notice :) > > Thanks, > Sid On 2/17/23 22:41, Siddhesh Poyarekar wrote: > On 2023-02-17 16:39, Siddhesh Poyarekar wrote: >> You've got the customs right as far as submission is concerned; gcc > > Oh, one correction: patches typically go to gcc-patches at gcc dot gnu > dot org. Okay, I'll take that note for next time. For this one, should I resend, or is it okay as is? Both are fine for me. Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] Make -Wuse-after-free=3 the default one in -Wall 2023-02-17 22:58 ` Alejandro Colomar @ 2023-02-17 23:03 ` Siddhesh Poyarekar 0 siblings, 0 replies; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 23:03 UTC (permalink / raw) To: Alejandro Colomar, gcc Cc: Alejandro Colomar, Andreas Schwab, David Malcolm, Florian Weimer, Iker Pedrosa, Jens Gustedt, Jonathan Wakely, Mark Wielaard, Martin Uecker, Michael Kerrisk, Paul Eggert, Sam James, Yann Droneaud On 2023-02-17 17:58, Alejandro Colomar wrote: > On 2/17/23 22:41, Siddhesh Poyarekar wrote: >> On 2023-02-17 16:39, Siddhesh Poyarekar wrote: >>> You've got the customs right as far as submission is concerned; gcc >> >> Oh, one correction: patches typically go to gcc-patches at gcc dot gnu >> dot org. > > Okay, I'll take that note for next time. For this one, should I resend, > or is it okay as is? Both are fine for me. Probably makes sense to resend to gcc-patches. Thanks, Sid ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 3:48 ` Siddhesh Poyarekar 2023-02-17 11:22 ` Alejandro Colomar @ 2023-02-17 11:24 ` Jonathan Wakely 2023-02-17 11:43 ` Alejandro Colomar 2023-02-17 12:53 ` Siddhesh Poyarekar 2023-02-17 13:44 ` David Malcolm 2 siblings, 2 replies; 39+ messages in thread From: Jonathan Wakely @ 2023-02-17 11:24 UTC (permalink / raw) To: Siddhesh Poyarekar; +Cc: David Malcolm, Alejandro Colomar, GCC, Iker Pedrosa [-- Attachment #1: Type: text/plain, Size: 1906 bytes --] On Fri, 17 Feb 2023, 03:49 Siddhesh Poyarekar, <siddhesh@gotplt.org> wrote: > On 2023-02-16 10:15, David Malcolm via Gcc wrote: > > I'm not convinced that it's useful to the end-user to warn about the > > "use of q itself" case. > > FWIW, -Wuse-after-free=3 already should do this: > > At level 3, the warning also diagnoses uses of indeterminate pointers in > equality expressions. All uses of indeterminate pointers are undefined > but equality tests sometimes appear after > calls to "realloc" as an attempt to determine whether the call resulted > in relocating the object to a different address. They are diagnosed at > a separate level to aid legacy code gradually > transition to safe alternatives. For example, the equality test in the > function below is diagnosed at this level: > > Jakub and I had discussed this in the context of _FORTIFY_SOURCE=3 > (which is anal about this and can break things) and we got pr#105217, > but that is also a best-effort thing, not really a guarantee. > > IMO the analyzer should go that extra mile and warn for the use of q > itself and maybe deprecate -Wuse-after-free=3 in its favour. > Please be aware that in C++ it's implementation-defined, not undefined. That means that an implementation without trap representations for pointers can choose to make it behave just like using (uintptr_t)p. https://cplusplus.github.io/CWG/issues/1438.html https://cplusplus.github.io/CWG/issues/623.html https://cplusplus.github.io/CWG/issues/616.html https://cplusplus.github.io/CWG/issues/312.html We could still warn in C++ (because the code isn't portable) but I would strongly suggest we don't influence C++ codegen based on deallocated pointers being undefined. I don't think gcc supports any targets with trapping pointers, and there are quite enough sources of UB already. We don't need to create traps for users where there are no traps for pointers :-) ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 11:24 ` Missed warning (-Wuse-after-free) Jonathan Wakely @ 2023-02-17 11:43 ` Alejandro Colomar 2023-02-17 12:04 ` Jonathan Wakely 2023-02-17 12:53 ` Siddhesh Poyarekar 1 sibling, 1 reply; 39+ messages in thread From: Alejandro Colomar @ 2023-02-17 11:43 UTC (permalink / raw) To: Jonathan Wakely Cc: David Malcolm, GCC, Iker Pedrosa, Siddhesh Poyarekar, Florian Weimer, Sam James, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt, Yann Droneaud [-- Attachment #1.1: Type: text/plain, Size: 1420 bytes --] Hi Jonathan, On 2/17/23 12:24, Jonathan Wakely wrote: > Please be aware that in C++ it's implementation-defined, not undefined. > > That means that an implementation without trap representations for pointers > can choose to make it behave just like using (uintptr_t)p. (uintptr_t)p is defined (I believe) for C <= C17. However, as noted in my last email, even that is UB for C2x. Of course, UB means that the compiler might make it defined, but as Martin suggested, that's might have its own issues. > > https://cplusplus.github.io/CWG/issues/1438.html > https://cplusplus.github.io/CWG/issues/623.html > https://cplusplus.github.io/CWG/issues/616.html > https://cplusplus.github.io/CWG/issues/312.html > > We could still warn in C++ (because the code isn't portable) but I would > strongly suggest we don't influence C++ codegen based on deallocated > pointers being undefined. I don't think gcc supports any targets with > trapping pointers, and there are quite enough sources of UB already. We > don't need to create traps for users where there are no traps for pointers > :-) I would warn in C++ too, as some of that code might be in interfaces that should be compatible with C. Maybe not include it in -Wextra in C++ (but include it in C's -Wextra)... Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 11:43 ` Alejandro Colomar @ 2023-02-17 12:04 ` Jonathan Wakely 0 siblings, 0 replies; 39+ messages in thread From: Jonathan Wakely @ 2023-02-17 12:04 UTC (permalink / raw) To: Alejandro Colomar Cc: David Malcolm, GCC, Iker Pedrosa, Siddhesh Poyarekar, Florian Weimer, Sam James, Paul Eggert, Michael Kerrisk, Martin Uecker, Jₑₙₛ Gustedt, Yann Droneaud [-- Attachment #1: Type: text/plain, Size: 1625 bytes --] On Fri, 17 Feb 2023, 11:43 Alejandro Colomar, <alx.manpages@gmail.com> wrote: > Hi Jonathan, > > On 2/17/23 12:24, Jonathan Wakely wrote: > > Please be aware that in C++ it's implementation-defined, not undefined. > > > > That means that an implementation without trap representations for > pointers > > can choose to make it behave just like using (uintptr_t)p. > > (uintptr_t)p is defined (I believe) for C <= C17. However, as noted in my > last email, even that is UB for C2x. Of course, UB means that the compiler > might make it defined, but as Martin suggested, that's might have its own > issues. > Yes, I just meant in C++, and C++ has no such rules. > > > > https://cplusplus.github.io/CWG/issues/1438.html > > https://cplusplus.github.io/CWG/issues/623.html > > https://cplusplus.github.io/CWG/issues/616.html > > https://cplusplus.github.io/CWG/issues/312.html > > > > We could still warn in C++ (because the code isn't portable) but I would > > strongly suggest we don't influence C++ codegen based on deallocated > > pointers being undefined. I don't think gcc supports any targets with > > trapping pointers, and there are quite enough sources of UB already. We > > don't need to create traps for users where there are no traps for > pointers > > :-) > > I would warn in C++ too, as some of that code might be in interfaces that > should be compatible with C. Maybe not include it in -Wextra in C++ (but > include it in C's -Wextra)... > I agree with warning in C++, just not optimizing based on the assumption that any use of an invalid pointer implies unreachable code (as in your original example). ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 11:24 ` Missed warning (-Wuse-after-free) Jonathan Wakely 2023-02-17 11:43 ` Alejandro Colomar @ 2023-02-17 12:53 ` Siddhesh Poyarekar 2023-02-17 14:10 ` Jonathan Wakely 1 sibling, 1 reply; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 12:53 UTC (permalink / raw) To: Jonathan Wakely; +Cc: David Malcolm, Alejandro Colomar, GCC, Iker Pedrosa On 2023-02-17 06:24, Jonathan Wakely wrote: > Please be aware that in C++ it's implementation-defined, not undefined. > > That means that an implementation without trap representations for > pointers can choose to make it behave just like using (uintptr_t)p. > > https://cplusplus.github.io/CWG/issues/1438.html > <https://cplusplus.github.io/CWG/issues/1438.html> > https://cplusplus.github.io/CWG/issues/623.html > <https://cplusplus.github.io/CWG/issues/623.html> > https://cplusplus.github.io/CWG/issues/616.html > <https://cplusplus.github.io/CWG/issues/616.html> > https://cplusplus.github.io/CWG/issues/312.html > <https://cplusplus.github.io/CWG/issues/312.html> > > We could still warn in C++ (because the code isn't portable) but I would > strongly suggest we don't influence C++ codegen based on deallocated > pointers being undefined. I don't think gcc supports any targets with > trapping pointers, and there are quite enough sources of UB already. We > don't need to create traps for users where there are no traps for > pointers :-) The codegen problem is a pointer provenance issue and AFAICT, -Wuse-after-free=3 is also framed in that context and not as a problem with simply taking the numeric value of the pointer to, e.g. log it somewhere. More concretely, something like this is what causes problems: Foo *old = malloc (sz); ... Foo *new = realloc (old, newsz); if (new != old) { old = new; /* Adjust references. */ } /* Otherwise continue using old unchanged */ ... The problem is the assumption that the old pointer continues to be valid because it has the same numeric value as the new one. This is not an uncommon code pattern in C, what about C++? On a fat pointer-like scheme such as the Arm Morello cpu, this won't work at all because even though old and new have the same numeric values, old will have been invalidated. Sid ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 12:53 ` Siddhesh Poyarekar @ 2023-02-17 14:10 ` Jonathan Wakely 0 siblings, 0 replies; 39+ messages in thread From: Jonathan Wakely @ 2023-02-17 14:10 UTC (permalink / raw) To: Siddhesh Poyarekar; +Cc: David Malcolm, Alejandro Colomar, GCC, Iker Pedrosa [-- Attachment #1: Type: text/plain, Size: 2255 bytes --] On Fri, 17 Feb 2023, 12:53 Siddhesh Poyarekar, <siddhesh@gotplt.org> wrote: > On 2023-02-17 06:24, Jonathan Wakely wrote: > > Please be aware that in C++ it's implementation-defined, not undefined. > > > > That means that an implementation without trap representations for > > pointers can choose to make it behave just like using (uintptr_t)p. > > > > https://cplusplus.github.io/CWG/issues/1438.html > > <https://cplusplus.github.io/CWG/issues/1438.html> > > https://cplusplus.github.io/CWG/issues/623.html > > <https://cplusplus.github.io/CWG/issues/623.html> > > https://cplusplus.github.io/CWG/issues/616.html > > <https://cplusplus.github.io/CWG/issues/616.html> > > https://cplusplus.github.io/CWG/issues/312.html > > <https://cplusplus.github.io/CWG/issues/312.html> > > > > We could still warn in C++ (because the code isn't portable) but I would > > strongly suggest we don't influence C++ codegen based on deallocated > > pointers being undefined. I don't think gcc supports any targets with > > trapping pointers, and there are quite enough sources of UB already. We > > don't need to create traps for users where there are no traps for > > pointers :-) > > The codegen problem is a pointer provenance issue and AFAICT, > -Wuse-after-free=3 is also framed in that context and not as a problem > with simply taking the numeric value of the pointer to, e.g. log it > somewhere. > > More concretely, something like this is what causes problems: > > Foo *old = malloc (sz); > ... > Foo *new = realloc (old, newsz); > > if (new != old) > { > old = new; > /* Adjust references. */ > } > > /* Otherwise continue using old unchanged */ > ... > > The problem is the assumption that the old pointer continues to be valid > because it has the same numeric value as the new one. This is not an > uncommon code pattern in C, what about C++? > Nobody uses realloc in C++. It's only safe for trivial types, because anything with user-defined construction/assignment/destruction can't be copied to a new location if the memory is reallocated. > On a fat pointer-like scheme such as the Arm Morello cpu, this won't > work at all because even though old and new have the same numeric > values, old will have been invalidated. > > Sid > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 3:48 ` Siddhesh Poyarekar 2023-02-17 11:22 ` Alejandro Colomar 2023-02-17 11:24 ` Missed warning (-Wuse-after-free) Jonathan Wakely @ 2023-02-17 13:44 ` David Malcolm 2023-02-17 14:01 ` Siddhesh Poyarekar 2 siblings, 1 reply; 39+ messages in thread From: David Malcolm @ 2023-02-17 13:44 UTC (permalink / raw) To: Siddhesh Poyarekar, Alejandro Colomar, GCC; +Cc: Iker Pedrosa On Thu, 2023-02-16 at 22:48 -0500, Siddhesh Poyarekar wrote: > On 2023-02-16 10:15, David Malcolm via Gcc wrote: > > I'm not convinced that it's useful to the end-user to warn about > > the > > "use of q itself" case. > > FWIW, -Wuse-after-free=3 already should do this: > > At level 3, the warning also diagnoses uses of indeterminate pointers > in > equality expressions. All uses of indeterminate pointers are > undefined > but equality tests sometimes appear after > calls to "realloc" as an attempt to determine whether the call > resulted > in relocating the object to a different address. They are diagnosed > at > a separate level to aid legacy code gradually > transition to safe alternatives. This is possibly a silly question, but what *are* these safe alternatives? [1] How does one test to see if an object has been reallocated? Dave [1] Would suggesting "rust" here be too snarky? :-P > For example, the equality test in the > function below is diagnosed at this level: > > Jakub and I had discussed this in the context of _FORTIFY_SOURCE=3 > (which is anal about this and can break things) and we got pr#105217, > but that is also a best-effort thing, not really a guarantee. > > IMO the analyzer should go that extra mile and warn for the use of q > itself and maybe deprecate -Wuse-after-free=3 in its favour. > > Sid > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-17 13:44 ` David Malcolm @ 2023-02-17 14:01 ` Siddhesh Poyarekar 0 siblings, 0 replies; 39+ messages in thread From: Siddhesh Poyarekar @ 2023-02-17 14:01 UTC (permalink / raw) To: David Malcolm, Alejandro Colomar, GCC; +Cc: Iker Pedrosa On 2023-02-17 08:44, David Malcolm wrote: > This is possibly a silly question, but what *are* these safe > alternatives? [1] How does one test to see if an object has been > reallocated? Oops sorry, I snipped off that part when pasting from the man page. Typically such conditionals are used to update pointers within structs (and in some overly adventurous cases, continue using the old pointer!) and the alternative is to save offsets in structs and not pointers. The text I missed quoting here: """ void adjust_pointers (int**, int); void grow (int **p, int n) { int **q = (int**)realloc (p, n *= 2); if (q == p) return; adjust_pointers ((int**)q, n); } To avoid the warning at this level, store offsets into allocated memory instead of pointers. This approach obviates needing to adjust the stored pointers after reallocation. """ Sid > Dave > [1] Would suggesting "rust" here be too snarky? :-P I think the equivalent American expression is "lets take this outside" ;) ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Missed warning (-Wuse-after-free) 2023-02-16 14:35 Missed warning (-Wuse-after-free) Alejandro Colomar 2023-02-16 15:15 ` David Malcolm @ 2023-02-17 8:49 ` Yann Droneaud 1 sibling, 0 replies; 39+ messages in thread From: Yann Droneaud @ 2023-02-17 8:49 UTC (permalink / raw) To: Alejandro Colomar, GCC; +Cc: Iker Pedrosa Hi, Le 16/02/2023 à 15:35, Alejandro Colomar via Gcc a écrit : > Hi! > > I was preparing an example program of a use-after-realloc bug, > when I found that GCC doesn't warn in a case where it should. > > > alx@debian:~/tmp$ cat realloc.c > #include <stdint.h> > #include <stdlib.h> > #include <stdio.h> > #include <string.h> > #include <unistd.h> > > static inline char * > xstrdup(const char *s) > { > char *p; > > p = strdup(s); > if (p == NULL) > exit(EXIT_FAILURE); > return p; > } > > static inline char * > strnul(const char *s) > { > return (char *) s + strlen(s); > } > > int > main(void) > { > char *p, *q; > > p = xstrdup(""); > q = strnul(p); > > if (p == q) > puts("equal before"); > else > exit(EXIT_FAILURE); // It's an empty string; this won't happen > > printf("p = %p; q = %p\n", p, q); > > p = realloc(p, UINT16_MAX); > if (p == NULL) > exit(EXIT_FAILURE); > puts("realloc()"); > > if (p == q) { // Use after realloc. I'd expect a warning here. > puts("equal after"); > } else { > /* Can we get here? > Let's see the options: > > - realloc(3) fails: > We exit immediately. We don't arrive here. > > - realloc(3) doesn't move the memory: > p == q, as before > > - realloc(3) moved the memory: > p is guaranteed to be a unique pointer, > and q is now an invalid pointer. It is > Undefined Behavior to read `q`, so `p == q` > is UB. > > As we see, there's no _defined_ path where this can happen > */ > printf("PID = %i\n", (int) getpid()); > } > > printf("p = %p; q = %p\n", p, q); > } > alx@debian:~/tmp$ cc -Wall -Wextra realloc.c -O3 -fanalyzer > realloc.c: In function ‘main’: > realloc.c:67:9: warning: pointer ‘p’ may be used after ‘realloc’ [-Wuse-after-free] > 67 | printf("p = %p; q = %p\n", p, q); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > realloc.c:39:13: note: call to ‘realloc’ here > 39 | p = realloc(p, UINT16_MAX); > | ^~~~~~~~~~~~~~~~~~~~~~ > alx@debian:~/tmp$ ./a.out > equal before > p = 0x55bff80802a0; q = 0x55bff80802a0 > realloc() > PID = 25222 > p = 0x55bff80806d0; q = 0x55bff80802a0 > > > Did I miss anything? -Wuse-after-free=3 Regards. -- Yann Droneaud OPTEYA ^ permalink raw reply [flat|nested] 39+ messages in thread
end of thread, other threads:[~2023-02-24 16:37 UTC | newest] Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-02-16 14:35 Missed warning (-Wuse-after-free) Alejandro Colomar 2023-02-16 15:15 ` David Malcolm 2023-02-17 1:04 ` Alejandro Colomar 2023-02-17 1:05 ` Alejandro Colomar 2023-02-17 1:56 ` Sam James 2023-02-17 8:12 ` Martin Uecker 2023-02-17 11:35 ` Alejandro Colomar 2023-02-17 13:34 ` Andreas Schwab 2023-02-17 13:48 ` Martin Uecker 2023-02-23 19:23 ` Alex Colomar 2023-02-23 19:57 ` Martin Uecker 2023-02-24 0:02 ` Alex Colomar 2023-02-24 1:21 ` Serge E. Hallyn 2023-02-24 1:42 ` Alex Colomar 2023-02-24 3:01 ` Peter Lafreniere 2023-02-24 8:52 ` Martin Uecker 2023-02-24 8:43 ` Martin Uecker 2023-02-24 16:10 ` Serge E. Hallyn 2023-02-24 8:36 ` Martin Uecker 2023-02-24 16:01 ` Serge E. Hallyn 2023-02-24 16:37 ` Martin Uecker 2023-02-17 3:48 ` Siddhesh Poyarekar 2023-02-17 11:22 ` Alejandro Colomar 2023-02-17 13:38 ` Siddhesh Poyarekar 2023-02-17 14:01 ` Mark Wielaard 2023-02-17 14:06 ` Siddhesh Poyarekar 2023-02-17 21:20 ` [PATCH] Make -Wuse-after-free=3 the default one in -Wall Alejandro Colomar 2023-02-17 21:39 ` Siddhesh Poyarekar 2023-02-17 21:41 ` Siddhesh Poyarekar 2023-02-17 22:58 ` Alejandro Colomar 2023-02-17 23:03 ` Siddhesh Poyarekar 2023-02-17 11:24 ` Missed warning (-Wuse-after-free) Jonathan Wakely 2023-02-17 11:43 ` Alejandro Colomar 2023-02-17 12:04 ` Jonathan Wakely 2023-02-17 12:53 ` Siddhesh Poyarekar 2023-02-17 14:10 ` Jonathan Wakely 2023-02-17 13:44 ` David Malcolm 2023-02-17 14:01 ` Siddhesh Poyarekar 2023-02-17 8:49 ` Yann Droneaud
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).