* GCC turns &~ into | due to undefined bit-shift without warning
From: Moritz Strübe @ 2019-03-11 8:49 UTC
To: gcc; +Cc: Nicolai Steinkamp

Hey,

I have the following code:

    #include <stdint.h>

    void LL_ADC_SetChannelSingleDiff(uint32_t * val, uint32_t Channel, uint32_t SingleDiff)
    {
        *val = (*val & (~(Channel & 0x7FFFFU))) | ((Channel & 0x7FFFFU ) & (0x7FFFFU << (SingleDiff & 0x20U)));
    }

    void test(uint32_t * testvar)
    {
        LL_ADC_SetChannelSingleDiff(testvar, 0x2 ,0x7FU );
    }

Starting with gcc 6 and -O2 this code produces an or-instruction instead
of an and-not-instruction:

https://godbolt.org/z/kGtBfW

x86-64 -O1:

    LL_ADC_SetChannelSingleDiff:
            and     esi, 524287
            or      DWORD PTR [rdi], esi
            ret
    test:
            and     DWORD PTR [rdi], -3
            ret

x86-64 -O2:

    LL_ADC_SetChannelSingleDiff:
            and     esi, 524287
            or      DWORD PTR [rdi], esi
            ret
    test:
            or      DWORD PTR [rdi], 2
            ret

Considering that C11 6.5.7#3 ("If the value of the right operand
is negative or is greater than or equal to the width of the promoted
left operand, the behavior is undefined.") is not very widely known, as
it "normally" just works, inverting the intent is quite unexpected.

Is there any option that would have helped me with this?
Should this be a bug? I know, from the C standard point of view this is
ok, but inverting the behavior without warning is really bad in terms of
user experience.

Clang does the same, but IMO that does not make things any better.

Cheers
Morty

-- 
Redheads Ltd.
Softwaredienstleistungen
Schillerstr. 14
90409 Nürnberg

Telefon: +49 (0)911 180778-50
E-Mail: moritz.struebe@redheads.de | Web: www.redheads.de

Geschäftsführer: Andreas Hanke
Sitz der Gesellschaft: Lauf
Amtsgericht Nürnberg HRB 22681
Ust-ID: DE 249436843
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Jakub Jelinek @ 2019-03-11 9:14 UTC
To: Moritz Strübe; +Cc: gcc, Nicolai Steinkamp

On Mon, Mar 11, 2019 at 08:49:30AM +0000, Moritz Strübe wrote:
> Considering that C11 6.5.7#3 ("If the value of the right operand
> is negative or is greater than or equal to the width of the promoted
> left operand, the behavior is undefined.") is not very widely known, as
> it "normally" just works, inverting the intent is quite unexpected.
>
> Is there any option that would have helped me with this?

You could build with -fsanitize=undefined, which would tell you at runtime
that you have undefined behavior in your code (if SingleDiff ever has bit
0x20 set).

The fact that negative or >= bit precision shifts are UB is widely known,
and even if it weren't, for the compiler all the UBs are just UBs. The
compiler optimizes on the assumption that UB does not happen, so when it
sees 32-bit int << (x & 32), it can assume x & 32 must be 0 at that point;
anything else is UB.

GCC has warnings for the simple cases, where one uses a negative or too
large constant shift count. Warning in cases like yours would be a false
positive for many people, since there is nothing wrong with that code if
x & 32 always results in 0.

Jakub
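The simplification described above can be checked by hand. The following is a minimal sketch, not code from the ST library or from GCC; the names `original`, `simplified` and `CH_MASK` are made up for illustration. Once the shift count is assumed to be 0 (the only defined case), the posted expression collapses to a plain OR, which matches the `or` instruction in the -O2 listing.

```c
#include <assert.h>
#include <stdint.h>

#define CH_MASK 0x7FFFFu  /* illustrative name for the 19-bit mask in the posted code */

/* The posted expression. It is only defined when (single_diff & 0x20u) == 0,
 * because shifting a 32-bit value by 32 is undefined (C11 6.5.7#3). */
static uint32_t original(uint32_t val, uint32_t channel, uint32_t single_diff)
{
    assert((single_diff & 0x20u) == 0u);  /* precondition for defined behavior */
    return (val & ~(channel & CH_MASK))
         | ((channel & CH_MASK) & (CH_MASK << (single_diff & 0x20u)));
}

/* What is left once the shift count is taken to be 0:
 * with c = channel & CH_MASK, (val & ~c) | (c & CH_MASK) equals val | c. */
static uint32_t simplified(uint32_t val, uint32_t channel)
{
    return val | (channel & CH_MASK);
}
```

For every input that satisfies the precondition the two functions agree, so an optimizer that assumes the precondition holds is free to emit the second form.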
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Moritz Strübe @ 2019-03-11 11:06 UTC
To: Jakub Jelinek; +Cc: gcc, Nicolai Steinkamp

On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> You could build with -fsanitize=undefined, which would tell you at runtime
> that you have undefined behavior in your code (if SingleDiff ever has bit
> 0x20 set).

Yes, that helps. Unfortunately I'm on an embedded system, thus the code
size increase is just too big.

> The fact that negative or >= bit precision shifts are UB is widely known,
> and even if it weren't, for the compiler all the UBs are just UBs. The
> compiler optimizes on the assumption that UB does not happen, so when it
> sees 32-bit int << (x & 32), it can assume x & 32 must be 0 at that point;
> anything else is UB.

Thanks for that explanation. Nonetheless, a compile-time warning would
be nice, especially as this was caused by a library provided by ST. :(
Seems like we really need to add more sophisticated static analysis to
our CI.

Morty
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Jakub Jelinek @ 2019-03-11 11:17 UTC
To: Moritz Strübe; +Cc: gcc, Nicolai Steinkamp

On Mon, Mar 11, 2019 at 11:06:37AM +0000, Moritz Strübe wrote:
> On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> > You could build with -fsanitize=undefined, which would tell you at runtime
> > that you have undefined behavior in your code (if SingleDiff ever has bit
> > 0x20 set).
>
> Yes, that helps. Unfortunately I'm on an embedded system, thus the code
> size increase is just too big.

You can use -fsanitize-undefined-trap-on-error, which doesn't increase size
too much. It is less user-friendly, but should still catch the UB.

> > The fact that negative or >= bit precision shifts are UB is widely known,
> > and even if it weren't, for the compiler all the UBs are just UBs. The
> > compiler optimizes on the assumption that UB does not happen, so when it
> > sees 32-bit int << (x & 32), it can assume x & 32 must be 0 at that point;
> > anything else is UB.
>
> Thanks for that explanation. Nonetheless, a compile-time warning would
> be nice, especially as this was caused by a library provided by ST. :(
> Seems like we really need to add more sophisticated static analysis to
> our CI.

What do you think the code would do for int << 32?
The probable reason why it is UB in the standard is that each CPU handles
that differently: on some, a shift left by a large count results in 0; on
others, the shift count is taken modulo the bitsize; on yet others, the
shift count is also masked, but e.g. by the wordsize even when the shifted
type is smaller (say int << 32 is 0, but int << 64 is like int << 0).

A warning is a bad idea generally: we'd need to warn for all cases where
the shift count is not a compile-time constant, since all of those could be
out of bounds in theory.

Jakub
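The three hardware behaviours listed in this message can be modeled in portable, fully defined C. This is only a sketch under stated assumptions: the function names are hypothetical, no specific CPU is claimed to match any one model, and the third model assumes a 64-bit word with a 32-bit operand, as in the "int << 64 is like int << 0" example.

```c
#include <stdint.h>

/* Model 1: a count of 32 or more shifts every bit out, so the result is 0. */
static uint32_t shl_zero_fill(uint32_t v, uint32_t n)
{
    return n < 32u ? v << n : 0u;
}

/* Model 2: the count is taken modulo the operand width (32 bits). */
static uint32_t shl_mod_width(uint32_t v, uint32_t n)
{
    return v << (n & 31u);
}

/* Model 3: the count is masked by a wider word size (64 assumed here);
 * counts 32..63 still shift every bit out of the 32-bit operand. */
static uint32_t shl_mod_word(uint32_t v, uint32_t n)
{
    n &= 63u;
    return n < 32u ? v << n : 0u;
}
```

All three agree for counts below 32 and disagree above that, which is the situation the standard sidesteps by leaving such shifts undefined.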
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Moritz Strübe @ 2019-03-20 14:08 UTC
To: Jakub Jelinek, gcc

Hey.

On 11.03.2019 at 12:17, Jakub Jelinek wrote:
> On Mon, Mar 11, 2019 at 11:06:37AM +0000, Moritz Strübe wrote:
> > On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> > > You could build with -fsanitize=undefined, which would tell you at
> > > runtime that you have undefined behavior in your code (if SingleDiff
> > > ever has bit 0x20 set).
> >
> > Yes, that helps. Unfortunately I'm on an embedded system, thus the code
> > size increase is just too big.
>
> You can use -fsanitize-undefined-trap-on-error, which doesn't increase size
> too much. It is less user-friendly, but should still catch the UB.

Ok, I played around a bit. Interestingly, if I set -fsanitize=undefined
and -fsanitize-undefined-trap-on-error, the compiler detects that it will
always trap, and optimizes the code accordingly (the code after the trap is
removed).* Which kind of brings me to David's argument: Shouldn't the
compiler warn if there is undefined behavior it certainly knows of?
I do assume though that -fsanitize just injects the test code everywhere
and relies on the compiler to remove it in unnecessary places. Would be
nice, though. :)

Cheers
Morty

*After fixing the code, it got too big to fit.
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Christophe Lyon @ 2019-03-20 14:26 UTC
To: gcc

On 20/03/2019 15:08, Moritz Strübe wrote:
> On 11.03.2019 at 12:17, Jakub Jelinek wrote:
> > You can use -fsanitize-undefined-trap-on-error, which doesn't increase
> > size too much. It is less user-friendly, but should still catch the UB.

Wouldn't this fail to link? I thought the sanitizers need some runtime
libraries which are only available under linux/macos/android. What do you
mean by embedded? Isn't it arm-eabi?

> Ok, I played around a bit. Interestingly, if I set -fsanitize=undefined
> and -fsanitize-undefined-trap-on-error, the compiler detects that it will
> always trap, and optimizes the code accordingly (the code after the trap
> is removed).* Which kind of brings me to David's argument: Shouldn't the
> compiler warn if there is undefined behavior it certainly knows of?
> I do assume though that -fsanitize just injects the test code everywhere
> and relies on the compiler to remove it in unnecessary places. Would be
> nice, though. :)

Could you confirm in which version of the ST libraries you noticed this
bug? I'm told it was fixed on 23-march-2018.

Thanks,

Christophe

> Cheers
> Morty
>
> *After fixing the code, it got too big to fit.
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Moritz Strübe @ 2019-03-20 15:39 UTC
To: gcc; +Cc: christophe.lyon

Hey.

On 20.03.2019 at 15:26, Christophe Lyon wrote:
> > You can -fsanitize-undefined-trap-on-error, which doesn't increase size
> > too much, it is less user-friendly, but still should catch the UB.
>
> Wouldn't this fail to link? I thought the sanitizers need some runtime
> libraries which are only available under linux/macos/android. What do you
> mean by embedded? Isn't it arm-eabi?

Nope. It inserts a trap, triggering a hard fault (as the manual says).
Works just fine.

Moritz
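For readers who cannot enable the sanitizer at all, the effect described here can be approximated by hand. The sketch below is a hand-written analogue, not the instrumentation GCC actually emits; `checked_shl32` is a hypothetical helper name. It relies on `__builtin_trap()`, a GCC and Clang builtin that ends the program abnormally without needing a runtime library.

```c
#include <stdint.h>

/* Shift left with an explicit range check. An out-of-range count traps
 * instead of invoking undefined behavior. */
static uint32_t checked_shl32(uint32_t value, uint32_t count)
{
    if (count >= 32u)
        __builtin_trap();   /* abnormal termination, no library call involved */
    return value << count;  /* count is known to be in 0..31 here */
}
```

On a bare-metal target the trap would typically end up in whatever fault handler the platform provides, which is consistent with the hard fault reported in this message.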
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-20 14:08 ` Moritz Strübe 2019-03-20 14:26 ` Christophe Lyon @ 2019-03-20 15:49 ` Jakub Jelinek 2019-03-20 17:36 ` Andrew Haley 2 siblings, 0 replies; 34+ messages in thread From: Jakub Jelinek @ 2019-03-20 15:49 UTC (permalink / raw) To: Moritz Strübe; +Cc: gcc On Wed, Mar 20, 2019 at 02:08:09PM +0000, Moritz Strübe wrote: > Ok, I played around a bit. Interestingly, if I set -fsanitize=udefined and -fsanitize-undefined-trap-on-error the compiler detects that it will always trap, and optimizes the code accordingly (the code after the trap is removed).* Which kind of brings me to David's argument: Shouldn't the compiler warn if there is undefined behavior it certainly knows of? What does it mean certainly knows of? The sanitization inserts (conditional) traps for all the constructs that it sanitizes, you certainly don't want warning for that. Even if the compiler can simplify or optimize out some of the guarding conditionals around the traps, that doesn't mean it isn't in dead code that will never be executed. The only safe warning might be if the compiler can prove that whenever main is called, there will be a trap executed later on, but that is not the case in most programs, as one can't prove for most functions they actually never loop and always return to the caller instead of say exiting, aborting, etc. (and even if main traps immediately, one could have work done in constructors and exit from there). Otherwise, would you like to warn if there is unconditional trap in some function? That function could not be ever called, or it could make some function calls before the trap that would never return (exit, abort, throw exception, infinite loop). Jakub ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-20 14:08 ` Moritz Strübe 2019-03-20 14:26 ` Christophe Lyon 2019-03-20 15:49 ` Jakub Jelinek @ 2019-03-20 17:36 ` Andrew Haley 2019-03-21 8:17 ` Richard Biener 2019-03-21 8:54 ` Moritz Strübe 2 siblings, 2 replies; 34+ messages in thread From: Andrew Haley @ 2019-03-20 17:36 UTC (permalink / raw) To: Moritz Strübe, Jakub Jelinek, gcc On 3/20/19 2:08 PM, Moritz Strübe wrote: > > Ok, I played around a bit. Interestingly, if I set > -fsanitize=udefined and -fsanitize-undefined-trap-on-error the > compiler detects that it will always trap, and optimizes the code > accordingly (the code after the trap is removed).* Which kind of > brings me to David's argument: Shouldn't the compiler warn if there > is undefined behavior it certainly knows of? Maybe an example would help. Consider this code: for (int i = start; i < limit; i++) { foo(i * 5); } Should GCC be entitled to turn it into int limit_tmp = i * 5; for (int i = start * 5; i < limit_tmp; i += 5) { foo(i); } If you answered "Yes, GCC should be allowed to do this", would you want a warning? And how many such warnings might there be in a typical program? -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-20 17:36 ` Andrew Haley @ 2019-03-21 8:17 ` Richard Biener 2019-03-21 8:25 ` Alexander Monakov 2019-03-21 8:54 ` Moritz Strübe 1 sibling, 1 reply; 34+ messages in thread From: Richard Biener @ 2019-03-21 8:17 UTC (permalink / raw) To: Andrew Haley; +Cc: Moritz Strübe, Jakub Jelinek, gcc On Wed, Mar 20, 2019 at 6:36 PM Andrew Haley <aph@redhat.com> wrote: > > On 3/20/19 2:08 PM, Moritz Strübe wrote: > > > > Ok, I played around a bit. Interestingly, if I set > > -fsanitize=udefined and -fsanitize-undefined-trap-on-error the > > compiler detects that it will always trap, and optimizes the code > > accordingly (the code after the trap is removed).* Which kind of > > brings me to David's argument: Shouldn't the compiler warn if there > > is undefined behavior it certainly knows of? > > Maybe an example would help. > > Consider this code: > > for (int i = start; i < limit; i++) { > foo(i * 5); > } > > Should GCC be entitled to turn it into > > int limit_tmp = i * 5; > for (int i = start * 5; i < limit_tmp; i += 5) { > foo(i); > } > > If you answered "Yes, GCC should be allowed to do this", would you > want a warning? And how many such warnings might there be in a typical > program? I assume i is signed int. Even then GCC may not do this unless it knows the loop is entered (start < limit). Richard. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. <https://www.redhat.com> > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-21 8:17 ` Richard Biener @ 2019-03-21 8:25 ` Alexander Monakov 2019-03-21 8:35 ` Richard Biener 0 siblings, 1 reply; 34+ messages in thread From: Alexander Monakov @ 2019-03-21 8:25 UTC (permalink / raw) To: Richard Biener; +Cc: Andrew Haley, Moritz Strübe, Jakub Jelinek, gcc On Thu, 21 Mar 2019, Richard Biener wrote: > > Maybe an example would help. > > > > Consider this code: > > > > for (int i = start; i < limit; i++) { > > foo(i * 5); > > } > > > > Should GCC be entitled to turn it into > > > > int limit_tmp = i * 5; > > for (int i = start * 5; i < limit_tmp; i += 5) { > > foo(i); > > } > > > > If you answered "Yes, GCC should be allowed to do this", would you > > want a warning? And how many such warnings might there be in a typical > > program? > > I assume i is signed int. Even then GCC may not do this unless it knows > the loop is entered (start < limit). Additionally, the compiler needs to prove that 'foo' always returns normally (i.e. cannot invoke exit/longjmp or such). Alexander ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-21 8:25 ` Alexander Monakov @ 2019-03-21 8:35 ` Richard Biener 0 siblings, 0 replies; 34+ messages in thread From: Richard Biener @ 2019-03-21 8:35 UTC (permalink / raw) To: Alexander Monakov; +Cc: Andrew Haley, Moritz Strübe, Jakub Jelinek, gcc On Thu, Mar 21, 2019 at 9:25 AM Alexander Monakov <amonakov@ispras.ru> wrote: > > On Thu, 21 Mar 2019, Richard Biener wrote: > > > Maybe an example would help. > > > > > > Consider this code: > > > > > > for (int i = start; i < limit; i++) { > > > foo(i * 5); > > > } > > > > > > Should GCC be entitled to turn it into > > > > > > int limit_tmp = i * 5; > > > for (int i = start * 5; i < limit_tmp; i += 5) { > > > foo(i); > > > } > > > > > > If you answered "Yes, GCC should be allowed to do this", would you > > > want a warning? And how many such warnings might there be in a typical > > > program? > > > > I assume i is signed int. Even then GCC may not do this unless it knows > > the loop is entered (start < limit). > > Additionally, the compiler needs to prove that 'foo' always returns normally > (i.e. cannot invoke exit/longjmp or such). Ah, yes. Andrews example was probably meaning limit_tmp = limit * 5, not i * 5. Computing start * 5 is fine if the loop is entered. Richard. > > Alexander ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Moritz Strübe @ 2019-03-21 8:54 UTC
To: Andrew Haley, Jakub Jelinek, gcc

Hey.

On 20.03.2019 at 18:36, Andrew Haley wrote:
> On 3/20/19 2:08 PM, Moritz Strübe wrote:
>> Ok, I played around a bit. Interestingly, if I set
>> -fsanitize=undefined and -fsanitize-undefined-trap-on-error the
>> compiler detects that it will always trap, and optimizes the code
>> accordingly (the code after the trap is removed).* Which kind of
>> brings me to David's argument: Shouldn't the compiler warn if there
>> is undefined behavior it certainly knows of?
> Maybe an example would help.
>
> Consider this code:
>
> for (int i = start; i < limit; i++) {
>     foo(i * 5);
> }
>
> Should GCC be entitled to turn it into
>
> int limit_tmp = i * 5;
> for (int i = start * 5; i < limit_tmp; i += 5) {
>     foo(i);
> }
>
> If you answered "Yes, GCC should be allowed to do this", would you
> want a warning? And how many such warnings might there be in a typical
> program?

Ok, let me see whether I get your point. I assume that should be "int
limit_tmp = limit * 5;". In the original version I have a potential
integer overflow while passing a parameter, while in the second version
I have a potential overflow in limit_tmp, and therefore the loop range and
the number of calls of foo are changed.

I think I am starting to get your point, but I nonetheless think it would
be really nice to have an option(!) to warn me about such things. Use cases
would be libraries, or at least their interfaces, and critical software, or
just support for finding potential bugs. Especially when using third-party
libraries, this can help find potential issues.
Would it be possible to annotate the inserted checks with a debug symbol
or similar? That way one could compile using LTO and then search for the
remaining symbols. That would allow static analysis tools to search for
these symbols and annotate the code.

Cheers
Moritz
* Re: GCC turns &~ into | due to undefined bit-shift without warning
From: Andrew Haley @ 2019-03-21 9:52 UTC
To: Moritz Strübe, Jakub Jelinek, gcc

On 3/21/19 8:53 AM, Moritz Strübe wrote:
> On 20.03.2019 at 18:36, Andrew Haley wrote:
>> Maybe an example would help.
>>
>> Consider this code:
>>
>> for (int i = start; i < limit; i++) {
>>     foo(i * 5);
>> }
>>
>> Should GCC be entitled to turn it into
>>
>> int limit_tmp = i * 5;
>> for (int i = start * 5; i < limit_tmp; i += 5) {
>>     foo(i);
>> }
>>
>> If you answered "Yes, GCC should be allowed to do this", would you
>> want a warning? And how many such warnings might there be in a typical
>> program?
>
> Ok, let me see whether I get your point. I assume that should be "int
> limit_tmp = limit * 5;".

Yes, sorry.

> In the original version I have a potential integer overflow while
> passing a parameter. While in the second version, I have a potential
> overflow in limit_tmp and therefore the loop range and number of calls
> of foo is changed.

That's right.

> I think I start getting your point, but I none the less think it would
> be really nice to have an option(!) to warn me about such things
> nonetheless.

There aren't necessarily points in the compiler where GCC says "look,
this would be UB, so delete the code." Sometimes GCC simply assumes
that things like overflows cannot happen, so it ignores the
possibility. The code I provided is an example of that.

I suppose we could utilize the -fsanitize=undefined framework and emit a
warning everywhere a runtime check was inserted. That would at least
allow you to check in every case that the overflow, null pointer
exception, etc., cannot happen. There would be a lot of warnings.

-- 
Andrew Haley
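The loop example, with the `limit * 5` correction acknowledged above, can be written out as two complete functions to see that they make the same calls when nothing overflows. This is a sketch under assumptions: `struct trace` and the extra parameter to `foo` are hypothetical additions that make the calls observable, and the hand-written `loop_reduced` is only equivalent when neither `start * 5` nor `limit * 5` overflows, which is the condition the compiler gets to assume.

```c
/* Hypothetical stand-in for foo(): records what it was called with. */
struct trace { long sum; int calls; };

static void foo(struct trace *t, int x)
{
    t->sum += x;
    t->calls++;
}

/* The loop as written in the source. */
static void loop_original(struct trace *t, int start, int limit)
{
    for (int i = start; i < limit; i++)
        foo(t, i * 5);
}

/* The strength-reduced form: the multiplication moves out of the loop.
 * Valid only under the assumption that signed overflow does not occur. */
static void loop_reduced(struct trace *t, int start, int limit)
{
    int limit_tmp = limit * 5;
    for (int i = start * 5; i < limit_tmp; i += 5)
        foo(t, i);
}
```

With `start = 1` and `limit = 4`, both versions call `foo` with 5, 10 and 15. No single point in the transformation says "this would be UB", which is why there is no obvious place to attach a warning.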
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-11 11:06 ` Moritz Strübe 2019-03-11 11:17 ` Jakub Jelinek @ 2019-03-11 11:24 ` Vincent Lefevre 2019-03-11 12:51 ` David Brown 1 sibling, 1 reply; 34+ messages in thread From: Vincent Lefevre @ 2019-03-11 11:24 UTC (permalink / raw) To: Moritz Strübe; +Cc: Jakub Jelinek, gcc, Nicolai Steinkamp On 2019-03-11 11:06:37 +0000, Moritz Strübe wrote: > On 11.03.2019 at 10:14 Jakub Jelinek wrote: > > The fact that negative or >= bit precision shifts are UB is widely known, [...] And even in the case where the compiler maps the shift directly to the asm shift (without optimizations), the behavior may depend on the processor. > Thanks for that explanation. None the less, a compile time warning > would be nice. It already does by default: -Wshift-count-negative Warn if shift count is negative. This warning is enabled by default. -Wshift-count-overflow Warn if shift count >= width of type. This warning is enabled by default. Of course, if the compiler cannot guess that there will be such an issue, it will not emit the warning. You certainly don't want a warning for each non-trivial shift just because the compiler cannot know whether the constraint on the shift count will be satisfied. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-11 11:24 ` Vincent Lefevre @ 2019-03-11 12:51 ` David Brown 2019-03-12 15:40 ` Vincent Lefevre 0 siblings, 1 reply; 34+ messages in thread From: David Brown @ 2019-03-11 12:51 UTC (permalink / raw) To: Moritz Strübe, Jakub Jelinek, gcc, Nicolai Steinkamp On 11/03/2019 12:24, Vincent Lefevre wrote: > On 2019-03-11 11:06:37 +0000, Moritz Strübe wrote: >> On 11.03.2019 at 10:14 Jakub Jelinek wrote: >>> The fact that negative or >= bit precision shifts are UB is widely known, > [...] > > And even in the case where the compiler maps the shift directly to > the asm shift (without optimizations), the behavior may depend on > the processor. > >> Thanks for that explanation. None the less, a compile time warning >> would be nice. > > It already does by default: > > -Wshift-count-negative > Warn if shift count is negative. This warning is enabled > by default. > > -Wshift-count-overflow > Warn if shift count >= width of type. This warning is > enabled by default. > > Of course, if the compiler cannot guess that there will be such > an issue, it will not emit the warning. You certainly don't want > a warning for each non-trivial shift just because the compiler > cannot know whether the constraint on the shift count will be > satisfied. > While the compiler clearly can't give a warning on calculated shifts without massive amounts of false positives, it is able to give warnings when there is a shift by a compile-time known constant value that is invalid. In the case of the OP's test function, inlining and constant propagation means that the shift value /is/ known to the compiler - it uses it for optimisation (in this case, it uses the undefined behaviour to "simplify" the calculations). Am I right in thinking that this is because the pass that checks the shift sizes for warnings here comes before the relevant inlining or constant propagation passes? 
And if so, would it be possible to change the ordering here? Perhaps it would be possible to have a warning for when the compiler optimises based on undefined behaviour, when the undefined behaviour comes from values that are known to the compiler at compile time? (When the values are not known at compile time, optimising on the assumption that the undefined behaviour doesn't happen is fair enough, and you can't warn there without lots of false positives.) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-11 12:51 ` David Brown @ 2019-03-12 15:40 ` Vincent Lefevre 2019-03-12 20:57 ` David Brown 0 siblings, 1 reply; 34+ messages in thread From: Vincent Lefevre @ 2019-03-12 15:40 UTC (permalink / raw) To: gcc On 2019-03-11 13:51:21 +0100, David Brown wrote: > On 11/03/2019 12:24, Vincent Lefevre wrote: > > It already does by default: > > > > -Wshift-count-negative > > Warn if shift count is negative. This warning is enabled > > by default. > > > > -Wshift-count-overflow > > Warn if shift count >= width of type. This warning is > > enabled by default. > > > > Of course, if the compiler cannot guess that there will be such > > an issue, it will not emit the warning. You certainly don't want > > a warning for each non-trivial shift just because the compiler > > cannot know whether the constraint on the shift count will be > > satisfied. > > While the compiler clearly can't give a warning on calculated shifts > without massive amounts of false positives, it is able to give warnings > when there is a shift by a compile-time known constant value that is > invalid. In the case of the OP's test function, inlining and constant > propagation means that the shift value /is/ known to the compiler - it > uses it for optimisation (in this case, it uses the undefined behaviour > to "simplify" the calculations). > > Am I right in thinking that this is because the pass that checks the > shift sizes for warnings here comes before the relevant inlining or > constant propagation passes? And if so, would it be possible to change > the ordering here? To generate a warning, the compiler would also have to make sure that the inlined code with an out-of-range shift is actually not dead code. In practice, it may happen that constraints are not satisfied on some platforms, but the reason is that on such platforms the code will never be executed (this is code written to take care of other platforms). 
-- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 34+ messages in thread
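The dead-code concern can be illustrated with a small portable helper. This is a sketch only; `high_bits` is a hypothetical name, and whether a particular GCC version warns about it is not something the example asserts. On a target where `unsigned long` is 32 bits wide, the shift by 32 is out of range, but it sits in a branch that can never execute there.

```c
#include <limits.h>

/* Returns the bits above bit 31, or 0 where unsigned long has no such bits. */
static unsigned long high_bits(unsigned long x)
{
    if (sizeof(unsigned long) * CHAR_BIT > 32)
        return x >> 32;   /* only reachable where unsigned long is wider than 32 bits */
    return 0ul;
}
```

A warning based purely on the constant shift count would flag this function on 32-bit targets even though the program is correct on every platform.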
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-12 15:40 ` Vincent Lefevre @ 2019-03-12 20:57 ` David Brown 2019-03-13 2:25 ` Vincent Lefevre 0 siblings, 1 reply; 34+ messages in thread From: David Brown @ 2019-03-12 20:57 UTC (permalink / raw) To: gcc On 12/03/2019 16:40, Vincent Lefevre wrote: > On 2019-03-11 13:51:21 +0100, David Brown wrote: >> On 11/03/2019 12:24, Vincent Lefevre wrote: >>> It already does by default: >>> >>> -Wshift-count-negative >>> Warn if shift count is negative. This warning is enabled >>> by default. >>> >>> -Wshift-count-overflow >>> Warn if shift count >= width of type. This warning is >>> enabled by default. >>> >>> Of course, if the compiler cannot guess that there will be such >>> an issue, it will not emit the warning. You certainly don't want >>> a warning for each non-trivial shift just because the compiler >>> cannot know whether the constraint on the shift count will be >>> satisfied. >> >> While the compiler clearly can't give a warning on calculated shifts >> without massive amounts of false positives, it is able to give warnings >> when there is a shift by a compile-time known constant value that is >> invalid. In the case of the OP's test function, inlining and constant >> propagation means that the shift value /is/ known to the compiler - it >> uses it for optimisation (in this case, it uses the undefined behaviour >> to "simplify" the calculations). >> >> Am I right in thinking that this is because the pass that checks the >> shift sizes for warnings here comes before the relevant inlining or >> constant propagation passes? And if so, would it be possible to change >> the ordering here? > > To generate a warning, the compiler would also have to make sure > that the inlined code with an out-of-range shift is actually not > dead code. 
In practice, it may happen that constraints are not > satisfied on some platforms, but the reason is that on such > platforms the code will never be executed (this is code written > to take care of other platforms). > I disagree. To generate an unconditional error (rejecting the program), the compiler would need such proof - such as by tracing execution from main(). But to generate a warning activated specifically by the user, there is no such requirement. It's fine to give a warning based on the code written, rather than on code that the compiler knows without doubt will be executed. (The warning here would be on the function "test" which calls the function with the shift, not on that function itself - since it is only when used in "test" that the compiler can see that there is undefined behaviour.) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-12 20:57 ` David Brown @ 2019-03-13 2:25 ` Vincent Lefevre 2019-03-13 10:18 ` David Brown 0 siblings, 1 reply; 34+ messages in thread From: Vincent Lefevre @ 2019-03-13 2:25 UTC (permalink / raw) To: gcc On 2019-03-12 21:56:59 +0100, David Brown wrote: > I disagree. To generate an unconditional error (rejecting the program), the > compiler would need such proof - such as by tracing execution from main(). > But to generate a warning activated specifically by the user, there is no > such requirement. It's fine to give a warning based on the code written, > rather than on code that the compiler knows without doubt will be executed. There's already a bug about spurious warnings on shift counts: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210 -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 34+ messages in thread
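A minimal sketch of the kind of code that can draw a spurious shift-count warning. It is not taken from the bug report, and `high_half` is a hypothetical name. The shift count is out of range only on targets where `unsigned long` has 32 bits, and on exactly those targets the branch is never taken.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: upper 32 bits of v, or 0 when unsigned long
   has no more than 32 bits. */
uint32_t high_half(unsigned long v)
{
    /* The condition is a compile-time constant. Where it is false, the
       shift below has an out-of-range count but is unreachable. */
    if (sizeof v * CHAR_BIT > 32)
        return (uint32_t)(v >> 32);
    return 0;
}
```

Whether a compiler stays quiet about the dead shift depends on whether the warning is issued before or after it has proven the branch unreachable.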
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-13 2:25 ` Vincent Lefevre @ 2019-03-13 10:18 ` David Brown 2019-03-26 22:51 ` Vincent Lefevre 0 siblings, 1 reply; 34+ messages in thread From: David Brown @ 2019-03-13 10:18 UTC (permalink / raw) To: gcc On 13/03/2019 03:25, Vincent Lefevre wrote: > On 2019-03-12 21:56:59 +0100, David Brown wrote: >> I disagree. To generate an unconditional error (rejecting the program), the >> compiler would need such proof - such as by tracing execution from main(). >> But to generate a warning activated specifically by the user, there is no >> such requirement. It's fine to give a warning based on the code written, >> rather than on code that the compiler knows without doubt will be executed. > > There's already a bug about spurious warnings on shift counts: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210 > You can divide code into three groups (with the exact divisions varying by compiler switches and version): 1. Code that the compiler knows for sure will run in every execution of the program, generally because it can track the code flow from main(). 2. Code that the compiler knows will /not/ run, due to things like constant propagation, inlining, etc. 3. Code that the compiler does not know if it will run or not. Code in group 1 here is usually quite small. Code in group 2 can be large, especially with C++ header libraries, templates, etc. The compiler will often eliminate such code and avoid generating any object code. gcc used to have a warning for when it found "group 2" code and eliminated it - that warning was removed as gcc got smarter, and the false positives were overwhelming. Most code is in group 3. I would say that if the compiler finds undefined behaviour in group 1 code, it should give an unconditional error message, almost regardless of compiler switches. (Many people will disagree with me here - that's okay. 
Fortunately for everyone else, I am not the one who decides these things in gcc!). Certainly that is standards-condoned behaviour. To be useful to the developer, warnings have to be applied to group 3 code. That does mean a risk of false positives - some code will be group 2 (never run) though the compiler doesn't know it. I am arguing here that a warning like this should be applied to group 3 code - you are suggesting it should only apply to group 1. The bug report you linked was for code in group 2 - code that the compiler can (or should be able to) see is never run. I can see it makes sense to disable or hide warnings from such code, but it may be useful to have them anyway. I expect people to have different preferences here. (I see in that bug report, solutions are complicated because C lets you "disable" a block by writing "if (0)", and then lets you jump into it from outside with labels and goto's. Perhaps that should automatically trigger a warning saying "Your code is made of spaghetti. Any other warnings may be unreliable with many false positives and false negatives".) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-13 10:18 ` David Brown @ 2019-03-26 22:51 ` Vincent Lefevre 0 siblings, 0 replies; 34+ messages in thread From: Vincent Lefevre @ 2019-03-26 22:51 UTC (permalink / raw) To: gcc On 2019-03-13 11:18:02 +0100, David Brown wrote: > On 13/03/2019 03:25, Vincent Lefevre wrote: > > On 2019-03-12 21:56:59 +0100, David Brown wrote: > >> I disagree. To generate an unconditional error (rejecting the > >> program), the compiler would need such proof - such as by tracing > >> execution from main(). But to generate a warning activated > >> specifically by the user, there is no such requirement. It's fine > >> to give a warning based on the code written, rather than on code > >> that the compiler knows without doubt will be executed. > > > > There's already a bug about spurious warnings on shift counts: > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210 > > > > You can divide code into three groups (with the exact divisions varying > by compiler switches and version): > > 1. Code that the compiler knows for sure will run in every execution of > the program, generally because it can track the code flow from main(). > > 2. Code that the compiler knows will /not/ run, due to things like > constant propagation, inlining, etc. > > 3. Code that the compiler does not know if it will run or not. Actually, more than whether the code will be run or not, what is important is the concept of reachability. This will give: 1. Code that the compiler knows for sure will run under some conditions (e.g. particular values of inputs). 2. Code that the compiler knows will never run (what I called dead code). 3. Code for which the compiler can't decide. > Code in group 1 here is usually quite small. Code in group 2 can be > large, especially with C++ header libraries, templates, etc. The > compiler will often eliminate such code and avoid generating any object > code.
gcc used to have a warning for when it found "group 2" code and > eliminated it - that warning was removed as gcc got smarter, and the > false positives were overwhelming. > > Most code is in group 3. It depends on how the code is written. The programmer could try to avoid group 3 by giving hints to the compiler, e.g. with __builtin_unreachable(). I wish this were standardized in C. > I am arguing here that a warning like this should be applied to group 3 > code - you are suggesting it should only apply to group 1. No, I was just suggesting that the compiler should be smart enough to detect dead code (when possible). In the past, there was a similar issue with -Wmaybe-uninitialized, which was rather useless (too many false positives in complex code). Fortunately, this has improved a lot. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 34+ messages in thread
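A minimal sketch of such a hint, assuming GCC or Clang (both provide `__builtin_unreachable()`); the helper name `shl_in_range` is hypothetical. The call marks the out-of-range path as one that never runs, so the compiler can treat the shift count as known to be below 32.

```c
#include <stdint.h>

/* Hypothetical helper: left shift whose caller guarantees count < 32. */
static inline uint32_t shl_in_range(uint32_t value, unsigned count)
{
    if (count >= 32u)
        __builtin_unreachable();  /* GCC/Clang builtin: this path never runs */
    return value << count;
}
```

The hint does not make an out-of-range count safe: reaching `__builtin_unreachable()` is itself undefined behavior, so it documents an assumption rather than checking it. C23 later added a standard `unreachable()` macro in `<stddef.h>` for the same purpose.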
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-11 9:14 ` Jakub Jelinek 2019-03-11 11:06 ` Moritz Strübe @ 2019-03-21 22:20 ` Allan Sandfeld Jensen 2019-03-21 22:31 ` Jakub Jelinek 2019-03-22 10:02 ` Andrew Haley 1 sibling, 2 replies; 34+ messages in thread From: Allan Sandfeld Jensen @ 2019-03-21 22:20 UTC (permalink / raw) To: gcc, Jakub Jelinek On Montag, 11. März 2019 10:14:49 CET Jakub Jelinek wrote: > On Mon, Mar 11, 2019 at 08:49:30AM +0000, Moritz Strübe wrote: > > Considering that C11 6.5.7#3 ("If the value of the right operand > > is negative or is greater than or equal to the width of the promoted > > left operand, the behavior is undefined.") is not very widely known, as > > it "normally" just works, inverting the intent is quite unexpected. > > > > Is there any option that would have helped me with this? > > You could build with -fsanitize=undefined, that would tell you at runtime > you have undefined behavior in your code (if the SingleDiff has bit ever > 0x20 set). > > The fact that negative or >= bit precision shifts are UB is widely known, > and even if it wouldn't, for the compiler all the UBs are just UBs, the > compiler optimizes on the assumption that UB does not happen, so when it > sees 32-bit int << (x & 32), it can assume x must be 0 at that point, > anything else is UB. > Hmm, I am curious. How strongly would gcc assume x is 0? What if you have some expression that is undefined if x is not zero, but x really isn't zero and the result is temporarily undefined, but then another statement or part of the expression fixes the final result to something defined regardless of the intermediate. Would the compiler make assumptions that the intermediate value is never undefined, and possibly carry that analysed information over into other expressions? 
From having fixed UBSAN warnings, I have seen many cases where undefined behavior was performed, but where the code was aware of it and the final result of the expression was well defined nonetheless. 'Allan ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-21 22:20 ` Allan Sandfeld Jensen @ 2019-03-21 22:31 ` Jakub Jelinek 2019-03-22 9:27 ` Allan Sandfeld Jensen 2019-03-22 10:02 ` Andrew Haley 1 sibling, 1 reply; 34+ messages in thread From: Jakub Jelinek @ 2019-03-21 22:31 UTC (permalink / raw) To: Allan Sandfeld Jensen; +Cc: gcc On Thu, Mar 21, 2019 at 11:19:54PM +0100, Allan Sandfeld Jensen wrote: > Hmm, I am curious. How strongly would gcc assume x is 0? If x is not 0, then it is undefined behavior and anything can happen, so yes, it can assume x is 0, sometimes gcc does that, sometimes not, it is not required to do that. > From having fixed UBSAN warnings, I have seen many cases where undefined > behavior was performed, but where the code was aware of it and the final Any program where it printed something (talking about -fsanitize=undefined, not the few sanitizers that go beyond what is required by the language) is undefined, period. It can happen to "work" as some users expect, it can crash, it can format your disk or anything else. There is no well defined after a process runs into UB. Jakub ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-21 22:31 ` Jakub Jelinek @ 2019-03-22 9:27 ` Allan Sandfeld Jensen 2019-03-22 9:50 ` Jakub Jelinek 0 siblings, 1 reply; 34+ messages in thread From: Allan Sandfeld Jensen @ 2019-03-22 9:27 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc On Donnerstag, 21. März 2019 23:31:48 CET Jakub Jelinek wrote: > On Thu, Mar 21, 2019 at 11:19:54PM +0100, Allan Sandfeld Jensen wrote: > > Hmm, I am curious. How strongly would gcc assume x is 0? > > If x is not 0, then it is undefined behavior and anything can happen, > so yes, it can assume x is 0, sometimes gcc does that, sometimes not, > it is not required to do that. > > > From having fixed UBSAN warnings, I have seen many cases where undefined > > behavior was performed, but where the code was aware of it and the final > > Any program where it printed something (talking about -fsanitize=undefined, > not the few sanitizers that go beyond what is required by the language) > is undefined, period. It can happen to "work" as some users expect, it can > crash, it can format your disk or anything else. There is no well defined > after a process runs into UB. > That's nonsense and you know it. There are plenty of things that are undefined by the C standard that we rely on anyway. But getting back to the question, will GCC carry such information further, and thus break code that otherwise behaves correctly on all known architectures, just because the C standard hasn't decided on one of two possible results? 'Allan ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 9:27 ` Allan Sandfeld Jensen @ 2019-03-22 9:50 ` Jakub Jelinek 0 siblings, 0 replies; 34+ messages in thread From: Jakub Jelinek @ 2019-03-22 9:50 UTC (permalink / raw) To: Allan Sandfeld Jensen; +Cc: gcc On Fri, Mar 22, 2019 at 10:27:38AM +0100, Allan Sandfeld Jensen wrote: > But getting back to the question, will GCC carry such information further, and > thus break code that otherwise behaves correctly on all known architectures, > just because the C standard hasn't decided on one of two possible results? Of course it will, as will any other optimizing compiler. An optimizing compiler optimizes on the assumption that undefined behavior does not happen. It is not done with the intent to punish those that write bad code, but with the intent to generate better code for valid code. Say if the standard says that signed integer overflow is undefined behavior, then not taking advantage of that means significant performance degradation of e.g. many loops with signed integer IVs or signed integer computations in them. You can compare performance of normal code vs. one built with additional -fwrapv. And in that case we provide a switch that makes it well defined behavior at the expense of making code slower. For out-of-bounds shifts, there is no option like -fout-of-bound-shift={zero,masked,undefined}, it isn't worth it. And no, out-of-bounds shifts don't have just two possible results even in HW, as I said, sometimes it is masked with a different mask from the bitmask of the type, at other times the architecture has multiple different instructions and some of them have one behavior and others have another behavior. Jakub ^ permalink raw reply [flat|nested] 34+ messages in thread
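Since no compiler switch defines out-of-range shifts, the source has to pick a behavior itself. A minimal sketch, assuming the intent of the code in the original report was that a count of 32 shifts every bit out and yields 0; the helper `shl32` is hypothetical and not part of the reported code.

```c
#include <stdint.h>

/* Left shift that is defined for every count: counts of 32 or more
   shift all bits out and yield 0 (an assumed policy, see above). */
static uint32_t shl32(uint32_t value, uint32_t count)
{
    return count < 32u ? value << count : 0u;
}

/* The function from the original report, with the shift routed
   through shl32 so that no count is undefined. */
void LL_ADC_SetChannelSingleDiff(uint32_t *val, uint32_t Channel,
                                 uint32_t SingleDiff)
{
    uint32_t ch = Channel & 0x7FFFFu;
    *val = (*val & ~ch) | (ch & shl32(0x7FFFFu, SingleDiff & 0x20u));
}
```

With this version a call such as `LL_ADC_SetChannelSingleDiff(testvar, 0x2, 0x7FU)` clears bit 1 of `*testvar`, and the compiler has no undefined case to optimize around.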
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-21 22:20 ` Allan Sandfeld Jensen 2019-03-21 22:31 ` Jakub Jelinek @ 2019-03-22 10:02 ` Andrew Haley 2019-03-22 10:20 ` Allan Sandfeld Jensen 1 sibling, 1 reply; 34+ messages in thread From: Andrew Haley @ 2019-03-22 10:02 UTC (permalink / raw) To: Allan Sandfeld Jensen, gcc, Jakub Jelinek On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote: > From having fixed UBSAN warnings, I have seen many cases where undefined > behavior was performed, but where the code was aware of it and the final > result of the expression was well defined nonetheless. Is this belief about undefined behaviour commonplace among C programmers? There's nothing in the standard to justify it: any expression which contains UB is undefined. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 10:02 ` Andrew Haley @ 2019-03-22 10:20 ` Allan Sandfeld Jensen 2019-03-22 12:28 ` David Brown 2019-03-22 13:38 ` Andrew Haley 0 siblings, 2 replies; 34+ messages in thread From: Allan Sandfeld Jensen @ 2019-03-22 10:20 UTC (permalink / raw) To: gcc; +Cc: Andrew Haley, Jakub Jelinek On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote: > On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote: > > From having fixed UBSAN warnings, I have seen many cases where undefined > > behavior was performed, but where the code was aware of it and the final > > result of the expression was well defined nonetheless. > > Is this belief about undefined behaviour commonplace among C programmers? > There's nothing in the standard to justify it: any expression which contains > UB is undefined. Yes, even GCC uses undefined behavior when it is considered defined for specific architecture, whether it be the result of unaligned access, negative shifts, etc. There is a lot of the warnings that UBSAN warns about that you will find both in GCC itself, the Linux kernel and many other places. 'Allan ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 10:20 ` Allan Sandfeld Jensen @ 2019-03-22 12:28 ` David Brown 2019-03-22 12:40 ` Jakub Jelinek 2019-03-22 13:38 ` Andrew Haley 1 sibling, 1 reply; 34+ messages in thread From: David Brown @ 2019-03-22 12:28 UTC (permalink / raw) To: Allan Sandfeld Jensen, gcc; +Cc: Andrew Haley, Jakub Jelinek On 22/03/2019 11:20, Allan Sandfeld Jensen wrote: > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote: >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote: >>> From having fixed UBSAN warnings, I have seen many cases where undefined >>> behavior was performed, but where the code was aware of it and the final >>> result of the expression was well defined nonetheless. >> >> Is this belief about undefined behaviour commonplace among C programmers? >> There's nothing in the standard to justify it: any expression which contains >> UB is undefined. > > Yes, even GCC uses undefined behavior when it is considered defined for > specific architecture, whether it be the result of unaligned access, negative > shifts, etc. There is a lot of the warnings that UBSAN warns about that you > will find both in GCC itself, the Linux kernel and many other places. > You are mixing up several things here. Behaviour can be undefined by the C standard, but defined elsewhere - in the implementation (i.e., compiler - with whatever flags are chosen), in the target ABI, in additional standards such as POSIX. If you compile C code with "gcc -fwrapv", then signed integer overflow is fully defined (as wrapping behaviour), regardless of what is written in the C standards. If you compile a "hello, world!" C program, then the C standards do not define how those words get on your screen. As far as /C/ is concerned, the workings of "printf" are undefined behaviour - because the C standard does not define them. Clearly, virtually any C program relies strongly on behaviour that is not defined by the C standards.
However, expressions such as an overflowing signed shift are generally not defined /anywhere/. (It is, of course, possible for a particular C compiler to define them.) Unfortunately, it is true that some C programmers think it is fine to rely on undefined behaviour if it all seems to work fine in their tests - even when they /know/ it is undefined and unsafe. All we can do here is try to educate them - teach them that this is not good practice. It is also unfortunately the case that with older and weaker compilers, you could rely on their undocumented treatment of some kinds of undefined behaviour, and that this was the only way to get efficient results for certain coding problems. Correct, fully defined code was much slower on such compilers, while on better compilers the safe code works efficiently. If you can't avoid such old compilers, then at least use conditional compilation and pre-processor checks to ensure that the bad code is only used on the poor tools. And of course there are also lots of programmers whose knowledge of C is imperfect, or who make mistakes (and very, very few who don't!) - some code accidentally relies on the results of undefined behaviour. Even the gcc and Linux authors are not immune to the occasional bug in their code. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 12:28 ` David Brown @ 2019-03-22 12:40 ` Jakub Jelinek 0 siblings, 0 replies; 34+ messages in thread From: Jakub Jelinek @ 2019-03-22 12:40 UTC (permalink / raw) To: David Brown; +Cc: Allan Sandfeld Jensen, gcc, Andrew Haley On Fri, Mar 22, 2019 at 01:28:39PM +0100, David Brown wrote: > If you compile a "hello, world!" C program, then the C standards do not > define how those words get on your screen. As far as /C/ is concerned, > the workings of "printf" are undefined behaviour - because the C > standard does not define them. No, printf behavior is not undefined behavior. It is defined in terms of behavior on the streams (e.g. C99 7.19.2) and the exact streams behavior has some well defined and some implementation defined parts. Undefined behavior is not something not specified in the C standard, it is when the C standard says that in such and such case the behavior is undefined. The glossary (C99) says: implementation-defined behavior unspecified behavior where each implementation documents how the choice is made EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right. undefined behavior behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). EXAMPLE An example of undefined behavior is the behavior on integer overflow. Jakub ^ permalink raw reply [flat|nested] 34+ messages in thread
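The glossary's example of undefined behavior, integer overflow, can be avoided by testing the operands before the operation. A minimal sketch; `checked_add` is a hypothetical name, and the range test uses only arithmetic that cannot itself overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true when the
   result fits in int; returns false, leaving *sum untouched, otherwise. */
bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

GCC and Clang also offer `__builtin_add_overflow` for the same job; the portable form above relies only on what the C standard defines.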
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 10:20 ` Allan Sandfeld Jensen 2019-03-22 12:28 ` David Brown @ 2019-03-22 13:38 ` Andrew Haley 2019-03-22 14:35 ` Allan Sandfeld Jensen 1 sibling, 1 reply; 34+ messages in thread From: Andrew Haley @ 2019-03-22 13:38 UTC (permalink / raw) To: Allan Sandfeld Jensen, gcc; +Cc: Jakub Jelinek On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote: > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote: >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote: >>> From having fixed UBSAN warnings, I have seen many cases where undefined >>> behavior was performed, but where the code was aware of it and the final >>> result of the expression was well defined nonetheless. >> >> Is this belief about undefined behaviour commonplace among C programmers? >> There's nothing in the standard to justify it: any expression which contains >> UB is undefined. > > Yes, even GCC uses undefined behavior when it is considered defined for > specific architecture, If it's defined for a specific architecture it's not undefined. Any compiler is entitled to do anything with UB, and "anything" includes extending the language to make it well defined. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 13:38 ` Andrew Haley @ 2019-03-22 14:35 ` Allan Sandfeld Jensen 2019-03-22 22:08 ` Andrew Pinski 0 siblings, 1 reply; 34+ messages in thread From: Allan Sandfeld Jensen @ 2019-03-22 14:35 UTC (permalink / raw) To: gcc; +Cc: Andrew Haley, Jakub Jelinek On Freitag, 22. März 2019 14:38:10 CET Andrew Haley wrote: > On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote: > > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote: > >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote: > >>> From having fixed UBSAN warnings, I have seen many cases where undefined > >>> behavior was performed, but where the code was aware of it and the final > >>> result of the expression was well defined nonetheless. > >> > >> Is this belief about undefined behaviour commonplace among C programmers? > >> There's nothing in the standard to justify it: any expression which > >> contains UB is undefined. > > > > Yes, even GCC uses undefined behavior when it is considered defined for > > specific architecture, > > If it's defined for a specific architecture it's not undefined. Any compiler > is entitled to do anything with UB, and "anything" includes extending the > language to make it well defined. True, but in the context of "things UBSAN warns about", that includes architecture specific details. And isn't unaligned access real undefined behavior that just happens to work on x86 (and newer ARM)? There are also stuff like type-punning unions which is not architecture specific, technically undefined, but which GCC explicitly tolerates (and needs to since some NEON intrinsics use it). 'Allan ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 14:35 ` Allan Sandfeld Jensen @ 2019-03-22 22:08 ` Andrew Pinski 2019-03-22 22:38 ` Andrew Pinski 2019-03-22 23:42 ` Joseph Myers 0 siblings, 2 replies; 34+ messages in thread From: Andrew Pinski @ 2019-03-22 22:08 UTC (permalink / raw) To: Allan Sandfeld Jensen; +Cc: GCC Mailing List, Andrew Haley, Jakub Jelinek On Fri, Mar 22, 2019 at 7:35 AM Allan Sandfeld Jensen <linux@carewolf.com> wrote: > > On Freitag, 22. März 2019 14:38:10 CET Andrew Haley wrote: > > On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote: > > > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote: > > >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote: > > >>> From having fixed UBSAN warnings, I have seen many cases where undefined > > >>> behavior was performed, but where the code was aware of it and the final > > >>> result of the expression was well defined nonetheless. > > >> > > >> Is this belief about undefined behaviour commonplace among C programmers? > > >> There's nothing in the standard to justify it: any expression which > > >> contains UB is undefined. > > > > > > Yes, even GCC uses undefined behavior when it is considered defined for > > > specific architecture, > > > > If it's defined for a specific architecture it's not undefined. Any compiler > > is entitled to do anything with UB, and "anything" includes extending the > > language to make it well defined. > > True, but in the context of "things UBSAN warns about", that includes > architecture specific details. > > And isn't unaligned access real undefined behavior that just happens to work > on x86 (and newer ARM)? > > There are also stuff like type-punning unions which is not architecture > specific, technically undefined, but which GCC explicitly tolerates (and needs > to since some NEON intrinsics use it). For type-punning unions, GCC goes out of its way and documents that it is defined.
This is documented at: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict-aliasing "The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type." Maybe it should be more explicit saying even though the C/C++ Languages make it undefined, GCC makes it defined. As for unaligned accesses, they don't always work even on x86 because GCC does assume (in some cases; since 4.0 at least) the alignment that is required for that type. For example, the auto-vectorizer uses that fact about the alignment and types. There have been some bug reports about those cases too. Even on x86, there are some instructions (mostly SSE) which take only aligned memory locations (in some micro-architectures, the aligned load/store instructions give better performance than the unaligned ones) and will cause a fault. Thanks, Andrew Pinski > > 'Allan > > > > ^ permalink raw reply [flat|nested] 34+ messages in thread
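A minimal sketch of reading a possibly misaligned 32-bit value without relying on the target tolerating unaligned access; `load_u32` is a hypothetical name. Copying through `memcpy` never dereferences a misaligned `uint32_t *`, so the compiler's alignment assumptions for the type are never violated.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads 4 bytes in native byte order from an
   address that may have any alignment. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);  /* byte-wise copy, no alignment requirement */
    return v;
}
```

Optimizing compilers commonly reduce the fixed-size `memcpy` to a single load on targets where that is safe, so the defined form need not cost anything.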
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 22:08 ` Andrew Pinski @ 2019-03-22 22:38 ` Andrew Pinski 2019-03-22 23:42 ` Joseph Myers 1 sibling, 0 replies; 34+ messages in thread From: Andrew Pinski @ 2019-03-22 22:38 UTC (permalink / raw) To: Allan Sandfeld Jensen; +Cc: GCC Mailing List, Andrew Haley, Jakub Jelinek On Fri, Mar 22, 2019 at 3:08 PM Andrew Pinski <pinskia@gmail.com> wrote: > > On Fri, Mar 22, 2019 at 7:35 AM Allan Sandfeld Jensen > <linux@carewolf.com> wrote: > > > > On Freitag, 22. März 2019 14:38:10 CET Andrew Haley wrote: > > > On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote: > > > > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote: > > > >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote: > > > >>> From having fixed UBSAN warnings, I have seen many cases where undefined > > > >>> behavior was performed, but where the code was aware of it and the final > > > >>> result of the expression was well defined nonetheless. > > > >> > > > >> Is this belief about undefined behaviour commonplace among C programmers? > > > >> There's nothing in the standard to justify it: any expression which > > > >> contains UB is undefined. > > > > > > > > Yes, even GCC uses undefined behavior when it is considered defined for > > > > specific architecture, > > > > > > If it's defined for a specific architecture it's not undefined. Any compiler > > > is entitled to do anything with UB, and "anything" includes extending the > > > language to make it well defined. > > > > True, but in the context of "things UBSAN warns about", that includes > > architecture specific details. > > > > And isn't unaligned access real undefined behavior that just happens to work > > on x86 (and newer ARM)? > > > > There are also stuff like type-punning unions which is not architecture > > specific, technically undefined, but which GCC explicitly tolerates (and needs > > to since some NEON intrinsics use it). 
> > For type-punning unions, GCC goes out of its way and documents that it is > defined. This is documented at: > https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict-aliasing > "The practice of reading from a different union member than the one > most recently written to (called “type-punning”) is common. Even with > -fstrict-aliasing, type-punning is allowed, provided the memory is > accessed through the union type." > Maybe it should be more explicit saying even though the C/C++ > Languages make it undefined, GCC makes it defined. It is also referenced in the implementation-defined section of the manual: https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html#Structures-unions-enumerations-and-bit-fields-implementation Oh and it is implementation defined in C90 (but undefined in C99 and C++) :). So it is more complex than what you pointed out. Thanks, Andrew > As for unaligned accesses, they don't always work even on x86 because > GCC does assume (in some cases; since 4.0 at least) the alignment that > is required for that type. For example, the auto-vectorizer uses > that fact about the alignment and types. There have been some bug > reports about those cases too. Even on x86, there are some > instructions (mostly SSE) which take only aligned memory locations (in > some micro-architectures, the aligned load/store instructions give > better performance than the unaligned ones) and will cause a fault. > > Thanks, > Andrew Pinski > > > > > 'Allan > > > > > > > > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: GCC turns &~ into | due to undefined bit-shift without warning 2019-03-22 22:08 ` Andrew Pinski 2019-03-22 22:38 ` Andrew Pinski @ 2019-03-22 23:42 ` Joseph Myers 1 sibling, 0 replies; 34+ messages in thread From: Joseph Myers @ 2019-03-22 23:42 UTC (permalink / raw) To: Andrew Pinski Cc: Allan Sandfeld Jensen, GCC Mailing List, Andrew Haley, Jakub Jelinek On Fri, 22 Mar 2019, Andrew Pinski wrote: > "The practice of reading from a different union member than the one > most recently written to (called “type-punning”) is common. Even with > -fstrict-aliasing, type-punning is allowed, provided the memory is > accessed through the union type." > Maybe it should be more explicit saying even though the C/C++ > Languages make it undefined, GCC makes it defined. Type-punning is defined in C99 TC3 and later (albeit only in a non-normative footnote, but the intent is clear). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 34+ messages in thread
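A minimal sketch of the union-based type-punning under discussion, assuming `float` is a 32-bit IEEE 754 type; `float_bits` is a hypothetical name. The value is written through one member and read through the other, with every access going through the union type as the GCC manual requires.

```c
#include <stdint.h>

/* The sketch assumes float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: returns the object representation of f. */
uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;     /* write one member ...        */
    return pun.u;  /* ... read the other (C only) */
}
```

This applies to C; as noted in the thread, C++ treats the same read as undefined, and `memcpy` into a `uint32_t` is the form that works in both languages.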