public inbox for gcc@gcc.gnu.org
* GCC turns &~ into | due to undefined bit-shift without warning
@ 2019-03-11  8:49 Moritz Strübe
  2019-03-11  9:14 ` Jakub Jelinek
  0 siblings, 1 reply; 34+ messages in thread
From: Moritz Strübe @ 2019-03-11  8:49 UTC (permalink / raw)
  To: gcc; +Cc: Nicolai Steinkamp

Hey,

I have the following code:

#include <stdint.h>

void LL_ADC_SetChannelSingleDiff(uint32_t * val, uint32_t Channel, 
uint32_t SingleDiff)
{
     *val = (*val & (~(Channel & 0x7FFFFU))) | ((Channel & 0x7FFFFU ) & 
(0x7FFFFU << (SingleDiff & 0x20U)));
}

void test(uint32_t * testvar) {
     LL_ADC_SetChannelSingleDiff(testvar, 0x2 ,0x7FU );
}

Starting with gcc 6 and -O2 this code produces an or-instruction instead 
of an and-not-instruction:

https://godbolt.org/z/kGtBfW

x86-64 -O1:
LL_ADC_SetChannelSingleDiff:
         and     esi, 524287
         or      DWORD PTR [rdi], esi
         ret
test:
         and     DWORD PTR [rdi], -3
         ret

x86-64 -O2:
LL_ADC_SetChannelSingleDiff:
         and     esi, 524287
         or      DWORD PTR [rdi], esi
         ret
test:
         or      DWORD PTR [rdi], 2
         ret




Considering that C11 6.5.7#3 ("If the value of the right operand is
negative or is greater than or equal to the width of the promoted left
operand, the behavior is undefined.") is not very widely known, as it
"normally" just works, inverting the intent is quite unexpected.

Is there any option that would have helped me with this?

Should this be a bug? I know, from the C standard point of view this is 
ok, but inverting the behavior without warning is really bad in terms of 
user experience.

Clang does the same, but IMO that does not make things any better.

Cheers
Morty


-- 
Redheads Ltd. Softwaredienstleistungen
Schillerstr. 14
90409 Nürnberg

Telefon: +49 (0)911 180778-50
E-Mail: moritz.struebe@redheads.de | Web: www.redheads.de

Geschäftsführer: Andreas Hanke
Sitz der Gesellschaft: Lauf
Amtsgericht Nürnberg HRB 22681
Ust-ID: DE 249436843



* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11  8:49 GCC turns &~ into | due to undefined bit-shift without warning Moritz Strübe
@ 2019-03-11  9:14 ` Jakub Jelinek
  2019-03-11 11:06   ` Moritz Strübe
  2019-03-21 22:20   ` Allan Sandfeld Jensen
  0 siblings, 2 replies; 34+ messages in thread
From: Jakub Jelinek @ 2019-03-11  9:14 UTC (permalink / raw)
  To: Moritz Strübe; +Cc: gcc, Nicolai Steinkamp

On Mon, Mar 11, 2019 at 08:49:30AM +0000, Moritz Strübe wrote:
> Considering that C11 6.5.7#3 ("If the value of the right operand is
> negative or is greater than or equal to the width of the promoted left
> operand, the behavior is undefined.") is not very widely known, as it
> "normally" just works, inverting the intent is quite unexpected.
> 
> Is there any option that would have helped me with this?

You could build with -fsanitize=undefined; that would tell you at runtime
that you have undefined behavior in your code (if SingleDiff ever has bit
0x20 set).

The fact that negative or >= bit precision shifts are UB is widely known,
and even if it weren't, for the compiler all UBs are just UBs. The
compiler optimizes on the assumption that UB does not happen, so when it
sees 32-bit int << (x & 32), it can assume x & 32 must be 0 at that point;
anything else is UB.

GCC has warnings for the simple cases, where one uses a negative or too
large constant shift count. A warning in cases like yours would be a false
positive for many people; there is nothing wrong with such code if x & 32
always results in 0.

	Jakub
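
To make the inference above concrete, here is a small sketch. It is not
GCC's transformation code, and the function names are made up for
illustration; the function also takes and returns the value instead of
using a pointer. For every SingleDiff whose bit 0x20 is clear, the shift
count is 0 and the posted expression equals a plain OR. Those are the only
inputs with defined behavior, so a compiler may use the simpler form
everywhere.

```c
#include <stdint.h>

/* The expression as posted; defined only when (single_diff & 0x20U) == 0. */
uint32_t set_as_written(uint32_t val, uint32_t channel, uint32_t single_diff)
{
    return (val & ~(channel & 0x7FFFFU))
         | ((channel & 0x7FFFFU) & (0x7FFFFU << (single_diff & 0x20U)));
}

/* What remains once the shift count is assumed to be 0:
   (val & ~ch) | (ch & 0x7FFFF) == val | ch, with ch = channel & 0x7FFFF. */
uint32_t set_simplified(uint32_t val, uint32_t channel)
{
    return val | (channel & 0x7FFFFU);
}
```

This matches the "and esi, 524287 / or DWORD PTR [rdi], esi" sequence in
the first message.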


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11  9:14 ` Jakub Jelinek
@ 2019-03-11 11:06   ` Moritz Strübe
  2019-03-11 11:17     ` Jakub Jelinek
  2019-03-11 11:24     ` Vincent Lefevre
  2019-03-21 22:20   ` Allan Sandfeld Jensen
  1 sibling, 2 replies; 34+ messages in thread
From: Moritz Strübe @ 2019-03-11 11:06 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, Nicolai Steinkamp

On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> You could build with -fsanitize=undefined, that would tell you at runtime you
> have undefined behavior in your code (if the SingleDiff has bit ever 0x20
> set).

Yes, that helps. Unfortunately I'm on an embedded system, thus the code 
size increase is just too big.

> The fact that negative or >= bit precision shifts are UB is widely known,
> and even if it wouldn't, for the compiler all the UBs are just UBs, the
> compiler optimizes on the assumption that UB does not happen, so when it
> sees 32-bit int << (x & 32), it can assume x must be 0 at that point,
> anything else is UB.

Thanks for that explanation. Nonetheless, a compile-time warning would
be nice, especially as this was caused by a library provided by ST. :(
Seems like we really need to add more sophisticated static analysis to
our CI.

Morty



* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11 11:06   ` Moritz Strübe
@ 2019-03-11 11:17     ` Jakub Jelinek
  2019-03-20 14:08       ` Moritz Strübe
  2019-03-11 11:24     ` Vincent Lefevre
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Jelinek @ 2019-03-11 11:17 UTC (permalink / raw)
  To: Moritz Strübe; +Cc: gcc, Nicolai Steinkamp

On Mon, Mar 11, 2019 at 11:06:37AM +0000, Moritz Strübe wrote:
> On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> > You could build with -fsanitize=undefined, that would tell you at runtime you
> > have undefined behavior in your code (if the SingleDiff has bit ever 0x20
> > set).
> 
> Yes, that helps. Unfortunately I'm on an embedded system, thus the code 
> size increase is just too big.

You can use -fsanitize-undefined-trap-on-error, which doesn't increase size
too much. It is less user-friendly, but should still catch the UB.

> > The fact that negative or >= bit precision shifts are UB is widely known,
> > and even if it wouldn't, for the compiler all the UBs are just UBs, the
> > compiler optimizes on the assumption that UB does not happen, so when it
> > sees 32-bit int << (x & 32), it can assume x must be 0 at that point,
> > anything else is UB.
> 
> Thanks for that explanation. None the less, a compile time warning would 
> be nice. Especially as I this was caused by a library provided by ST. :( 
> Seems like we really need to add more sophisticated static analysis to 
> our CI.

What do you think the code would do for int << 32?  The probable reason why
it is UB in the standard is that each CPU handles it differently: on some, a
left shift by a large count results in 0; on others, the shift count is taken
modulo the bitsize; on yet others, the shift count is also masked, but e.g. by
the word size even when the shifted operand is smaller (say int << 32 is 0,
but int << 64 is like int << 0).

A warning is generally a bad idea: we'd need to warn in all cases where the
shift count is not a compile-time constant, since all of those could be out
of bounds in theory.

	Jakub
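
As a sketch of how source code can pick one of these behaviors explicitly
instead of relying on the CPU, the helpers below (hypothetical names, not
part of any library) spell out two of the semantics described above. The
last function rewrites the posted code on the assumption that a count of
32 was meant to produce 0; that intent is a guess and not something the
thread confirms.

```c
#include <stdint.h>

/* Counts >= 32 give 0, like CPUs where a large left shift clears the value. */
uint32_t shl32_zero(uint32_t value, uint32_t count)
{
    return (count < 32U) ? (value << count) : 0U;
}

/* Count is reduced modulo 32, like CPUs that mask the shift count. */
uint32_t shl32_mod(uint32_t value, uint32_t count)
{
    return value << (count & 31U);
}

/* The posted function with a well-defined shift (assumed semantics: zero). */
void set_channel_single_diff(uint32_t *val, uint32_t channel,
                             uint32_t single_diff)
{
    uint32_t ch = channel & 0x7FFFFU;
    *val = (*val & ~ch) | (ch & shl32_zero(0x7FFFFU, single_diff & 0x20U));
}
```

With this version, the call from test() clears bit 1 for every compiler and
optimization level, because no out-of-range shift is ever evaluated.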


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11 11:06   ` Moritz Strübe
  2019-03-11 11:17     ` Jakub Jelinek
@ 2019-03-11 11:24     ` Vincent Lefevre
  2019-03-11 12:51       ` David Brown
  1 sibling, 1 reply; 34+ messages in thread
From: Vincent Lefevre @ 2019-03-11 11:24 UTC (permalink / raw)
  To: Moritz Strübe; +Cc: Jakub Jelinek, gcc, Nicolai Steinkamp

On 2019-03-11 11:06:37 +0000, Moritz Strübe wrote:
> On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> > The fact that negative or >= bit precision shifts are UB is widely known,
[...]

And even in the case where the compiler maps the shift directly to
the asm shift (without optimizations), the behavior may depend on
the processor.

> Thanks for that explanation. None the less, a compile time warning
> would be nice.

It already does by default:

       -Wshift-count-negative
           Warn if shift count is negative. This warning is enabled
           by default.

       -Wshift-count-overflow
           Warn if shift count >= width of type. This warning is
           enabled by default.

Of course, if the compiler cannot guess that there will be such
an issue, it will not emit the warning. You certainly don't want
a warning for each non-trivial shift just because the compiler
cannot know whether the constraint on the shift count will be
satisfied.
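
A short sketch of the difference, assuming a target where uint32_t is
32 bits wide after promotion (the function names are illustrative). The
constant case is left commented out so the file compiles cleanly.

```c
#include <stdint.h>

/* Constant count: if uncommented, this is the kind of shift that
   -Wshift-count-overflow is documented to report, since 32 >= the
   width of the type. */
/* uint32_t shift_const(uint32_t x) { return x << 32; } */

/* Computed count: the compiler sees nothing to report here. The shift
   is valid exactly when bit 0x20 of n is clear. */
uint32_t shift_computed(uint32_t x, uint32_t n)
{
    return x << (n & 0x20U);
}
```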

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11 11:24     ` Vincent Lefevre
@ 2019-03-11 12:51       ` David Brown
  2019-03-12 15:40         ` Vincent Lefevre
  0 siblings, 1 reply; 34+ messages in thread
From: David Brown @ 2019-03-11 12:51 UTC (permalink / raw)
  To: Moritz Strübe, Jakub Jelinek, gcc, Nicolai Steinkamp

On 11/03/2019 12:24, Vincent Lefevre wrote:
> On 2019-03-11 11:06:37 +0000, Moritz Strübe wrote:
>> On 11.03.2019 at 10:14 Jakub Jelinek wrote:
>>> The fact that negative or >= bit precision shifts are UB is widely known,
> [...]
> 
> And even in the case where the compiler maps the shift directly to
> the asm shift (without optimizations), the behavior may depend on
> the processor.
> 
>> Thanks for that explanation. None the less, a compile time warning
>> would be nice.
> 
> It already does by default:
> 
>        -Wshift-count-negative
>            Warn if shift count is negative. This warning is enabled
>            by default.
> 
>        -Wshift-count-overflow
>            Warn if shift count >= width of type. This warning is
>            enabled by default.
> 
> Of course, if the compiler cannot guess that there will be such
> an issue, it will not emit the warning. You certainly don't want
> a warning for each non-trivial shift just because the compiler
> cannot know whether the constraint on the shift count will be
> satisfied.
> 

While the compiler clearly can't give a warning on calculated shifts
without massive amounts of false positives, it is able to give warnings
when there is a shift by a compile-time known constant value that is
invalid.  In the case of the OP's test function, inlining and constant
propagation means that the shift value /is/ known to the compiler - it
uses it for optimisation (in this case, it uses the undefined behaviour
to "simplify" the calculations).

Am I right in thinking that this is because the pass that checks the
shift sizes for warnings here comes before the relevant inlining or
constant propagation passes?  And if so, would it be possible to change
the ordering here?

Perhaps it would be possible to have a warning for when the compiler
optimises based on undefined behaviour, when the undefined behaviour
comes from values that are known to the compiler at compile time?  (When
the values are not known at compile time, optimising on the assumption
that the undefined behaviour doesn't happen is fair enough, and you
can't warn there without lots of false positives.)


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11 12:51       ` David Brown
@ 2019-03-12 15:40         ` Vincent Lefevre
  2019-03-12 20:57           ` David Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Vincent Lefevre @ 2019-03-12 15:40 UTC (permalink / raw)
  To: gcc

On 2019-03-11 13:51:21 +0100, David Brown wrote:
> On 11/03/2019 12:24, Vincent Lefevre wrote:
> > It already does by default:
> > 
> >        -Wshift-count-negative
> >            Warn if shift count is negative. This warning is enabled
> >            by default.
> > 
> >        -Wshift-count-overflow
> >            Warn if shift count >= width of type. This warning is
> >            enabled by default.
> > 
> > Of course, if the compiler cannot guess that there will be such
> > an issue, it will not emit the warning. You certainly don't want
> > a warning for each non-trivial shift just because the compiler
> > cannot know whether the constraint on the shift count will be
> > satisfied.
> 
> While the compiler clearly can't give a warning on calculated shifts
> without massive amounts of false positives, it is able to give warnings
> when there is a shift by a compile-time known constant value that is
> invalid.  In the case of the OP's test function, inlining and constant
> propagation means that the shift value /is/ known to the compiler - it
> uses it for optimisation (in this case, it uses the undefined behaviour
> to "simplify" the calculations).
> 
> Am I right in thinking that this is because the pass that checks the
> shift sizes for warnings here comes before the relevant inlining or
> constant propagation passes?  And if so, would it be possible to change
> the ordering here?

To generate a warning, the compiler would also have to make sure
that the inlined code with an out-of-range shift is actually not
dead code. In practice, it may happen that constraints are not
satisfied on some platforms, but the reason is that on such
platforms the code will never be executed (this is code written
to take care of other platforms).
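
A minimal sketch of that situation, with a made-up function name: the
shift below is out of range on platforms where unsigned long has 32 bits,
but on those platforms the branch is never taken.

```c
#include <limits.h>

/* Returns the bits above bit 31, or 0 where unsigned long has none. */
unsigned long high_part(unsigned long x)
{
    if (sizeof(unsigned long) * CHAR_BIT > 32) {
        /* Reached only where unsigned long is wider than 32 bits.
           With a 32-bit unsigned long the count is out of range,
           but this branch is dead code there. */
        return x >> 32;
    }
    return 0UL;
}
```

A warning that fires on the shift without first checking reachability
would flag this code on the 32-bit platform even though nothing is wrong.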


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-12 15:40         ` Vincent Lefevre
@ 2019-03-12 20:57           ` David Brown
  2019-03-13  2:25             ` Vincent Lefevre
  0 siblings, 1 reply; 34+ messages in thread
From: David Brown @ 2019-03-12 20:57 UTC (permalink / raw)
  To: gcc

On 12/03/2019 16:40, Vincent Lefevre wrote:
> On 2019-03-11 13:51:21 +0100, David Brown wrote:
>> On 11/03/2019 12:24, Vincent Lefevre wrote:
>>> It already does by default:
>>>
>>>         -Wshift-count-negative
>>>             Warn if shift count is negative. This warning is enabled
>>>             by default.
>>>
>>>         -Wshift-count-overflow
>>>             Warn if shift count >= width of type. This warning is
>>>             enabled by default.
>>>
>>> Of course, if the compiler cannot guess that there will be such
>>> an issue, it will not emit the warning. You certainly don't want
>>> a warning for each non-trivial shift just because the compiler
>>> cannot know whether the constraint on the shift count will be
>>> satisfied.
>>
>> While the compiler clearly can't give a warning on calculated shifts
>> without massive amounts of false positives, it is able to give warnings
>> when there is a shift by a compile-time known constant value that is
>> invalid.  In the case of the OP's test function, inlining and constant
>> propagation means that the shift value /is/ known to the compiler - it
>> uses it for optimisation (in this case, it uses the undefined behaviour
>> to "simplify" the calculations).
>>
>> Am I right in thinking that this is because the pass that checks the
>> shift sizes for warnings here comes before the relevant inlining or
>> constant propagation passes?  And if so, would it be possible to change
>> the ordering here?
> 
> To generate a warning, the compiler would also have to make sure
> that the inlined code with an out-of-range shift is actually not
> dead code. In practice, it may happen that constraints are not
> satisfied on some platforms, but the reason is that on such
> platforms the code will never be executed (this is code written
> to take care of other platforms).
> 

I disagree.  To generate an unconditional error (rejecting the program), 
the compiler would need such proof - such as by tracing execution from 
main().  But to generate a warning activated specifically by the user, 
there is no such requirement.  It's fine to give a warning based on the 
code written, rather than on code that the compiler knows without doubt 
will be executed.

(The warning here would be on the function "test" which calls the 
function with the shift, not on that function itself - since it is only 
when used in "test" that the compiler can see that there is undefined 
behaviour.)



* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-12 20:57           ` David Brown
@ 2019-03-13  2:25             ` Vincent Lefevre
  2019-03-13 10:18               ` David Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Vincent Lefevre @ 2019-03-13  2:25 UTC (permalink / raw)
  To: gcc

On 2019-03-12 21:56:59 +0100, David Brown wrote:
> I disagree.  To generate an unconditional error (rejecting the program), the
> compiler would need such proof - such as by tracing execution from main().
> But to generate a warning activated specifically by the user, there is no
> such requirement.  It's fine to give a warning based on the code written,
> rather than on code that the compiler knows without doubt will be executed.

There's already a bug about spurious warnings on shift counts:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-13  2:25             ` Vincent Lefevre
@ 2019-03-13 10:18               ` David Brown
  2019-03-26 22:51                 ` Vincent Lefevre
  0 siblings, 1 reply; 34+ messages in thread
From: David Brown @ 2019-03-13 10:18 UTC (permalink / raw)
  To: gcc

On 13/03/2019 03:25, Vincent Lefevre wrote:
> On 2019-03-12 21:56:59 +0100, David Brown wrote:
>> I disagree.  To generate an unconditional error (rejecting the program), the
>> compiler would need such proof - such as by tracing execution from main().
>> But to generate a warning activated specifically by the user, there is no
>> such requirement.  It's fine to give a warning based on the code written,
>> rather than on code that the compiler knows without doubt will be executed.
> 
> There's already a bug about spurious warnings on shift counts:
> 
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210
> 

You can divide code into three groups (with the exact divisions varying
by compiler switches and version):

1. Code that the compiler knows for sure will run in every execution of
the program, generally because it can track the code flow from main().

2. Code that the compiler knows will /not/ run, due to things like
constant propagation, inlining, etc.

3. Code that the compiler does not know if it will run or not.



Code in group 1 here is usually quite small.  Code in group 2 can be
large, especially with C++ header libraries, templates, etc.  The
compiler will often eliminate such code and avoid generating any object
code.  gcc used to have a warning for when it found "group 2" code and
eliminated it - that warning was removed as gcc got smarter, and the
false positives were overwhelming.

Most code is in group 3.


I would say that if the compiler finds undefined behaviour in group 1
code, it should give an unconditional error message, almost regardless
of compiler switches.  (Many people will disagree with me here - that's
okay.  Fortunately for everyone else, I am not the one who decides these
things in gcc!).  Certainly that is standards-condoned behaviour.

To be useful to the developer, warnings have to be applied to group 3
code.  That does mean a risk of false positives - some code will be
group 2 (never run) though the compiler doesn't know it.

I am arguing here that a warning like this should be applied to group 3
code - you are suggesting it should only apply to group 1.

The bug report you linked was for code in group 2 - code that the
compiler can (or should be able to) see is never run.  I can see it
makes sense to disable or hide warnings from such code, but it may be
useful to have them anyway.  I expect people to have different
preferences here.


(I see in that bug report, solutions are complicated because C lets you
"disable" a block by writing "if (0)", and then lets you jump into it
from outside with labels and goto's.  Perhaps that should automatically
trigger a warning saying "Your code is made of spaghetti.  Any other
warnings may be unreliable with many false positives and false negatives".)


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11 11:17     ` Jakub Jelinek
@ 2019-03-20 14:08       ` Moritz Strübe
  2019-03-20 14:26         ` Christophe Lyon
                           ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Moritz Strübe @ 2019-03-20 14:08 UTC (permalink / raw)
  To: Jakub Jelinek, gcc

Hey.

On 11.03.2019 at 12:17, Jakub Jelinek wrote:
> On Mon, Mar 11, 2019 at 11:06:37AM +0000, Moritz Strübe wrote:
> > On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> > > You could build with -fsanitize=undefined, that would tell you at runtime you
> > > have undefined behavior in your code (if the SingleDiff has bit ever 0x20
> > > set).
> >
> > Yes, that helps. Unfortunately I'm on an embedded system, thus the code
> > size increase is just too big.
>
> You can -fsanitize-undefined-trap-on-error, which doesn't increase size too
> much, it is less user-friendly, but still should catch the UB.


Ok, I played around a bit. Interestingly, if I set -fsanitize=undefined and
-fsanitize-undefined-trap-on-error, the compiler detects that it will always
trap and optimizes the code accordingly (the code after the trap is
removed).*  Which kind of brings me to David's argument: Shouldn't the
compiler warn if there is undefined behavior it certainly knows of?
I do assume, though, that -fsanitize just injects the test code everywhere
and relies on the compiler to remove it in unnecessary places. Would be
nice, though. :)

Cheers
Morty

*After fixing the code, it got too big to fit.



* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-20 14:08       ` Moritz Strübe
@ 2019-03-20 14:26         ` Christophe Lyon
  2019-03-20 15:39           ` Moritz Strübe
  2019-03-20 15:49         ` Jakub Jelinek
  2019-03-20 17:36         ` Andrew Haley
  2 siblings, 1 reply; 34+ messages in thread
From: Christophe Lyon @ 2019-03-20 14:26 UTC (permalink / raw)
  To: gcc

On 20/03/2019 15:08, Moritz Strübe wrote:
> Hey.
> 
> On 11.03.2019 at 12:17, Jakub Jelinek wrote:
> > On Mon, Mar 11, 2019 at 11:06:37AM +0000, Moritz Strübe wrote:
> > > On 11.03.2019 at 10:14 Jakub Jelinek wrote:
> > > > You could build with -fsanitize=undefined, that would tell you at runtime you
> > > > have undefined behavior in your code (if the SingleDiff has bit ever 0x20
> > > > set).
> > >
> > > Yes, that helps. Unfortunately I'm on an embedded system, thus the code
> > > size increase is just too big.
> >
> > You can -fsanitize-undefined-trap-on-error, which doesn't increase size too
> > much, it is less user-friendly, but still should catch the UB.
> 
> 

Wouldn't this fail to link? I thought the sanitizers need some runtime libraries which are only available under linux/macos/android. What do you mean by embedded? Isn't it arm-eabi?

> 
> Ok, I played around a bit. Interestingly, if I set -fsanitize=undefined and -fsanitize-undefined-trap-on-error the compiler detects that it will always trap, and optimizes the code accordingly (the code after the trap is removed).*  Which kind of brings me to David's argument: Shouldn't the compiler warn if there is undefined behavior it certainly knows of?
> I do assume though that fsanitize just injects the test-code everywhere and relies on the compiler to remove it at unnecessary places. Would be nice, though. :)
> 

Could you confirm in which version of the ST libraries you noticed this bug?
I'm told it was fixed on 23-march-2018.

Thanks,

Christophe


> Cheers
> Morty
> 
> *After fixing the code, it got too big to fit.
> 
> 

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-20 14:26         ` Christophe Lyon
@ 2019-03-20 15:39           ` Moritz Strübe
  0 siblings, 0 replies; 34+ messages in thread
From: Moritz Strübe @ 2019-03-20 15:39 UTC (permalink / raw)
  To: gcc; +Cc: christophe.lyon

Hey.

On 20.03.2019 at 15:26, Christophe Lyon wrote:
> > You can -fsanitize-undefined-trap-on-error, which doesn't increase size too
> > much, it is less user-friendly, but still should catch the UB.
>
> Wouldn't this fail to link? I thought the sanitizers need some runtime
> libraries which are only available under linux/macos/android. What do you
> mean by embedded? Isn't it arm-eabi?


Nope. It inserts a trap, triggering a hard fault (as the manual says). Works just fine.

Moritz


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-20 14:08       ` Moritz Strübe
  2019-03-20 14:26         ` Christophe Lyon
@ 2019-03-20 15:49         ` Jakub Jelinek
  2019-03-20 17:36         ` Andrew Haley
  2 siblings, 0 replies; 34+ messages in thread
From: Jakub Jelinek @ 2019-03-20 15:49 UTC (permalink / raw)
  To: Moritz Strübe; +Cc: gcc

On Wed, Mar 20, 2019 at 02:08:09PM +0000, Moritz Strübe wrote:
> Ok, I played around a bit. Interestingly, if I set -fsanitize=undefined and -fsanitize-undefined-trap-on-error the compiler detects that it will always trap, and optimizes the code accordingly (the code after the trap is removed).*  Which kind of brings me to David's argument: Shouldn't the compiler warn if there is undefined behavior it certainly knows of?

What does "certainly knows of" mean?
The sanitization inserts (conditional) traps for all the constructs
that it sanitizes; you certainly don't want a warning for each of those.
Even if the compiler can simplify or optimize out some of the guarding
conditionals around the traps, that doesn't mean the trap isn't in dead
code that will never be executed.
The only safe warning might be if the compiler can prove that whenever main
is called, a trap will be executed later on, but that is not the case
in most programs, as one can't prove for most functions that they never
loop forever and always return to the caller instead of, say, exiting or
aborting (and even if main traps immediately, one could have work done in
constructors and exit from there).  Otherwise, would you like a warning if
there is an unconditional trap in some function?  That function might never
be called, or it could make some function calls before the trap that would
never return (exit, abort, throw exception, infinite loop).

	Jakub
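
For illustration only, a hand-written stand-in for such a conditional
trap around a shift might look like the sketch below. The helper name is
hypothetical and this is not the code GCC emits; __builtin_trap() is the
GCC/Clang builtin for abnormal termination.

```c
#include <stdint.h>

/* Guard the shift count, then shift. Comparable in spirit to a
   sanitized shift built with trap-on-error. */
uint32_t shl32_trapping(uint32_t value, uint32_t count)
{
    if (count >= 32U) {
        __builtin_trap();   /* out-of-range count: stop here */
    }
    return value << count;
}
```

If a caller passes a constant count of 32 and the helper is inlined, the
guard condition folds to true and everything after the trap becomes
unreachable, which is consistent with the removed code Moritz observed.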


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-20 14:08       ` Moritz Strübe
  2019-03-20 14:26         ` Christophe Lyon
  2019-03-20 15:49         ` Jakub Jelinek
@ 2019-03-20 17:36         ` Andrew Haley
  2019-03-21  8:17           ` Richard Biener
  2019-03-21  8:54           ` Moritz Strübe
  2 siblings, 2 replies; 34+ messages in thread
From: Andrew Haley @ 2019-03-20 17:36 UTC (permalink / raw)
  To: Moritz Strübe, Jakub Jelinek, gcc

On 3/20/19 2:08 PM, Moritz Strübe wrote:
> 
> Ok, I played around a bit. Interestingly, if I set
> -fsanitize=undefined and -fsanitize-undefined-trap-on-error the
> compiler detects that it will always trap, and optimizes the code
> accordingly (the code after the trap is removed).* Which kind of
> brings me to David's argument: Shouldn't the compiler warn if there
> is undefined behavior it certainly knows of?

Maybe an example would help.

Consider this code:

for (int i = start; i < limit; i++) {
  foo(i * 5);
}

Should GCC be entitled to turn it into

int limit_tmp = limit * 5;
for (int i = start * 5; i < limit_tmp; i += 5) {
  foo(i);
}

If you answered "Yes, GCC should be allowed to do this", would you
want a warning? And how many such warnings might there be in a typical
program?

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-20 17:36         ` Andrew Haley
@ 2019-03-21  8:17           ` Richard Biener
  2019-03-21  8:25             ` Alexander Monakov
  2019-03-21  8:54           ` Moritz Strübe
  1 sibling, 1 reply; 34+ messages in thread
From: Richard Biener @ 2019-03-21  8:17 UTC (permalink / raw)
  To: Andrew Haley; +Cc: Moritz Strübe, Jakub Jelinek, gcc

On Wed, Mar 20, 2019 at 6:36 PM Andrew Haley <aph@redhat.com> wrote:
>
> On 3/20/19 2:08 PM, Moritz Strübe wrote:
> >
> > Ok, I played around a bit. Interestingly, if I set
> > -fsanitize=undefined and -fsanitize-undefined-trap-on-error the
> > compiler detects that it will always trap, and optimizes the code
> > accordingly (the code after the trap is removed).* Which kind of
> > brings me to David's argument: Shouldn't the compiler warn if there
> > is undefined behavior it certainly knows of?
>
> Maybe an example would help.
>
> Consider this code:
>
> for (int i = start; i < limit; i++) {
>   foo(i * 5);
> }
>
> Should GCC be entitled to turn it into
>
> int limit_tmp = limit * 5;
> for (int i = start * 5; i < limit_tmp; i += 5) {
>   foo(i);
> }
>
> If you answered "Yes, GCC should be allowed to do this", would you
> want a warning? And how many such warnings might there be in a typical
> program?

I assume i is signed int.  Even then GCC may not do this unless it knows
the loop is entered (start < limit).

Richard.

>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-21  8:17           ` Richard Biener
@ 2019-03-21  8:25             ` Alexander Monakov
  2019-03-21  8:35               ` Richard Biener
  0 siblings, 1 reply; 34+ messages in thread
From: Alexander Monakov @ 2019-03-21  8:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Haley, Moritz Strübe, Jakub Jelinek, gcc

On Thu, 21 Mar 2019, Richard Biener wrote:
> > Maybe an example would help.
> >
> > Consider this code:
> >
> > for (int i = start; i < limit; i++) {
> >   foo(i * 5);
> > }
> >
> > Should GCC be entitled to turn it into
> >
> > int limit_tmp = i * 5;
> > for (int i = start * 5; i < limit_tmp; i += 5) {
> >   foo(i);
> > }
> >
> > If you answered "Yes, GCC should be allowed to do this", would you
> > want a warning? And how many such warnings might there be in a typical
> > program?
> 
> I assume i is signed int.  Even then GCC may not do this unless it knows
> the loop is entered (start < limit).

Additionally, the compiler needs to prove that 'foo' always returns normally 
(i.e. cannot invoke exit/longjmp or such).

Alexander

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-21  8:25             ` Alexander Monakov
@ 2019-03-21  8:35               ` Richard Biener
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Biener @ 2019-03-21  8:35 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Andrew Haley, Moritz Strübe, Jakub Jelinek, gcc

On Thu, Mar 21, 2019 at 9:25 AM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Thu, 21 Mar 2019, Richard Biener wrote:
> > > Maybe an example would help.
> > >
> > > Consider this code:
> > >
> > > for (int i = start; i < limit; i++) {
> > >   foo(i * 5);
> > > }
> > >
> > > Should GCC be entitled to turn it into
> > >
> > > int limit_tmp = i * 5;
> > > for (int i = start * 5; i < limit_tmp; i += 5) {
> > >   foo(i);
> > > }
> > >
> > > If you answered "Yes, GCC should be allowed to do this", would you
> > > want a warning? And how many such warnings might there be in a typical
> > > program?
> >
> > I assume i is signed int.  Even then GCC may not do this unless it knows
> > the loop is entered (start < limit).
>
> Additionally, the compiler needs to prove that 'foo' always returns normally
> (i.e. cannot invoke exit/longjmp or such).

Ah, yes.  Andrew's example probably meant limit_tmp = limit * 5, not i * 5.
Computing start * 5 is fine if the loop is entered.
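
A minimal sketch of the corrected transformation, for illustration only.
The recording foo() is a hypothetical stand-in, and the rewritten loop is
equivalent at the source level only under the assumptions that limit * 5
fits in an int and that foo() always returns normally:

```c
#include <stddef.h>

/* Hypothetical stand-in for foo(): records each argument it receives. */
static int calls[64];
static size_t ncalls;

static void foo(int x)
{
    if (ncalls < 64)
        calls[ncalls++] = x;
}

/* The loop as originally written. */
void loop_original(int start, int limit)
{
    for (int i = start; i < limit; i++)
        foo(i * 5);
}

/* Strength-reduced form with limit_tmp = limit * 5.  The guard ensures
   the multiplications are only evaluated when the loop is entered.
   Assumption: limit * 5 does not overflow. */
void loop_reduced(int start, int limit)
{
    if (start < limit) {
        int limit_tmp = limit * 5;
        for (int i = start * 5; i < limit_tmp; i += 5)
            foo(i);
    }
}
```

With start = 2 and limit = 5 both versions call foo with 10, 15 and 20;
with start >= limit neither version calls foo, and the guarded version
performs no multiplication at all.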

Richard.

>
> Alexander

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-20 17:36         ` Andrew Haley
  2019-03-21  8:17           ` Richard Biener
@ 2019-03-21  8:54           ` Moritz Strübe
  2019-03-21  9:52             ` Andrew Haley
  1 sibling, 1 reply; 34+ messages in thread
From: Moritz Strübe @ 2019-03-21  8:54 UTC (permalink / raw)
  To: Andrew Haley, Jakub Jelinek, gcc

Hey.

Am 20.03.2019 um 18:36 schrieb Andrew Haley:
> On 3/20/19 2:08 PM, Moritz Strübe wrote:
>> Ok, I played around a bit. Interestingly, if I set
>> -fsanitize=undefined and -fsanitize-undefined-trap-on-error the
>> compiler detects that it will always trap, and optimizes the code
>> accordingly (the code after the trap is removed).* Which kind of
>> brings me to David's argument: Shouldn't the compiler warn if there
>> is undefined behavior it certainly knows of?
> Maybe an example would help.
>
> Consider this code:
>
> for (int i = start; i < limit; i++) {
>    foo(i * 5);
> }
>
> Should GCC be entitled to turn it into
>
> int limit_tmp = i * 5;
> for (int i = start * 5; i < limit_tmp; i += 5) {
>    foo(i);
> }
>
> If you answered "Yes, GCC should be allowed to do this", would you
> want a warning? And how many such warnings might there be in a typical
> program?

Ok, let me see whether I get your point. I assume that should be "int 
limit_tmp = limit * 5;".
In the original version I have a potential integer overflow while
passing a parameter, while in the second version I have a potential
overflow in limit_tmp, and therefore the loop range and the number of
calls to foo are changed.
I think I am starting to get your point, but I nonetheless think it would
be really nice to have an option(!) to warn me about such things.
Use cases would be libraries (or at least their interfaces),
critical software, or simply support for finding potential bugs. Especially
with third-party libraries this could help find potential issues.
Would it be possible to annotate the inserted checks with a debug symbol 
or similar? That way one could compile using LTO and then search for the 
remaining symbols? That would allow static analysis tools to search for 
these symbols and annotate the code.

Cheers
Moritz


-- 
Redheads Ltd. Softwaredienstleistungen
Schillerstr. 14
90409 Nürnberg

Telefon: +49 (0)911 180778-50
E-Mail: moritz.struebe@redheads.de | Web: www.redheads.de

Geschäftsführer: Andreas Hanke
Sitz der Gesellschaft: Lauf
Amtsgericht Nürnberg HRB 22681
Ust-ID: DE 249436843


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-21  8:54           ` Moritz Strübe
@ 2019-03-21  9:52             ` Andrew Haley
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Haley @ 2019-03-21  9:52 UTC (permalink / raw)
  To: Moritz Strübe, Jakub Jelinek, gcc

On 3/21/19 8:53 AM, Moritz Strübe wrote:
> Hey.
> 
> Am 20.03.2019 um 18:36 schrieb Andrew Haley:
>> On 3/20/19 2:08 PM, Moritz Strübe wrote:
>>> Ok, I played around a bit. Interestingly, if I set
>>> -fsanitize=undefined and -fsanitize-undefined-trap-on-error the
>>> compiler detects that it will always trap, and optimizes the code
>>> accordingly (the code after the trap is removed).* Which kind of
>>> brings me to David's argument: Shouldn't the compiler warn if there
>>> is undefined behavior it certainly knows of?
>> Maybe an example would help.
>>
>> Consider this code:
>>
>> for (int i = start; i < limit; i++) {
>>    foo(i * 5);
>> }
>>
>> Should GCC be entitled to turn it into
>>
>> int limit_tmp = i * 5;
>> for (int i = start * 5; i < limit_tmp; i += 5) {
>>    foo(i);
>> }
>>
>> If you answered "Yes, GCC should be allowed to do this", would you
>> want a warning? And how many such warnings might there be in a typical
>> program?
> 
> Ok, let me see whether I get your point. I assume that should be "int 
> limit_tmp = limit * 5;".

Yes, sorry.

> In the original version I have a potential integer overflow while 
> passing a parameter. While in the second version, I have a potential 
> overflow in limit_tmp and therefore the loop range and number of calls 
> of foo is changed.

That's right.

> I think I start getting your point, but I none the less think it would 
> be really nice to have an option(!) to warn me about such things 
> nonetheless.

There aren't necessarily points in the compiler where GCC says "look,
this would be UB, so delete the code." Sometimes GCC simply assumes
that things like overflows cannot happen, so it ignores the
possibility. The code I provided is an example of that.

I suppose we could utilize the sanitize=undefined framework and emit a
warning everywhere a runtime check was inserted. That will at least
allow you to check in every case that the overflow, null pointer
exception, etc, cannot happen.

There would be a lot of warnings.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-11  9:14 ` Jakub Jelinek
  2019-03-11 11:06   ` Moritz Strübe
@ 2019-03-21 22:20   ` Allan Sandfeld Jensen
  2019-03-21 22:31     ` Jakub Jelinek
  2019-03-22 10:02     ` Andrew Haley
  1 sibling, 2 replies; 34+ messages in thread
From: Allan Sandfeld Jensen @ 2019-03-21 22:20 UTC (permalink / raw)
  To: gcc, Jakub Jelinek

On Montag, 11. März 2019 10:14:49 CET Jakub Jelinek wrote:
> On Mon, Mar 11, 2019 at 08:49:30AM +0000, Moritz Strübe wrote:
> > Considering that C11 6.5.7#3 ("If  the  value  of  the  right operand 
> > is  negative  or  is greater than or equal to the width of the promoted
> > left operand, the behavior is undefined.") is not very widely known, as
> > it "normally" just works, inverting the intent is quite unexpected.
> > 
> > Is there any option that would have helped me with this?
> 
> You could build with -fsanitize=undefined, that would tell you at runtime
> you have undefined behavior in your code (if the SingleDiff has bit ever
> 0x20 set).
> 
> The fact that negative or >= bit precision shifts are UB is widely known,
> and even if it wouldn't, for the compiler all the UBs are just UBs, the
> compiler optimizes on the assumption that UB does not happen, so when it
> sees 32-bit int << (x & 32), it can assume x must be 0 at that point,
> anything else is UB.
> 
Hmm, I am curious. How strongly would gcc assume x is 0?

What if you have some expression that is undefined if x is not zero, but x
really isn't zero and the result is temporarily undefined, but then another
statement or part of the expression fixes the final result to something
defined regardless of the intermediate? Would the compiler assume
that the intermediate value is never undefined, and possibly carry that
analysed information over into other expressions?

From having fixed UBSAN warnings, I have seen many cases where undefined 
behavior was performed, but where the code was aware of it and the final 
result of the expression was well defined nonetheless.

'Allan


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-21 22:20   ` Allan Sandfeld Jensen
@ 2019-03-21 22:31     ` Jakub Jelinek
  2019-03-22  9:27       ` Allan Sandfeld Jensen
  2019-03-22 10:02     ` Andrew Haley
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Jelinek @ 2019-03-21 22:31 UTC (permalink / raw)
  To: Allan Sandfeld Jensen; +Cc: gcc

On Thu, Mar 21, 2019 at 11:19:54PM +0100, Allan Sandfeld Jensen wrote:
> Hmm, I am curious. How strongly would gcc assume x is 0?

If x is not 0, then it is undefined behavior and anything can happen,
so yes, it can assume x is 0.  Sometimes gcc does that, sometimes not;
it is not required to do that.

> From having fixed UBSAN warnings, I have seen many cases where undefined 
> behavior was performed, but where the code was aware of it and the final 

Any program where it printed something (talking about -fsanitize=undefined,
not the few sanitizers that go beyond what is required by the language)
is undefined, period.  It can happen to "work" as some users expect, it can
crash, it can format your disk or anything else.  There is no well defined
after a process runs into UB.

	Jakub

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-21 22:31     ` Jakub Jelinek
@ 2019-03-22  9:27       ` Allan Sandfeld Jensen
  2019-03-22  9:50         ` Jakub Jelinek
  0 siblings, 1 reply; 34+ messages in thread
From: Allan Sandfeld Jensen @ 2019-03-22  9:27 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc

On Donnerstag, 21. März 2019 23:31:48 CET Jakub Jelinek wrote:
> On Thu, Mar 21, 2019 at 11:19:54PM +0100, Allan Sandfeld Jensen wrote:
> > Hmm, I am curious. How strongly would gcc assume x is 0?
> 
> If x is not 0, then it is undefined behavior and anything can happen,
> so yes, it can assume x is 0, sometimes gcc does that, sometimes not,
> it is not required to do that.
> 
> > From having fixed UBSAN warnings, I have seen many cases where undefined
> > behavior was performed, but where the code was aware of it and the final
> 
> Any program where it printed something (talking about -fsanitize=undefined,
> not the few sanitizers that go beyond what is required by the language)
> is undefined, period.  It can happen to "work" as some users expect, it can
> crash, it can format your disk or anything else.  There is no well defined
> after a process runs into UB.
> 
That's nonsense and you know it. There are plenty of things that are undefined 
by the C standard that we rely on anyway.

But getting back to the question, will GCC carry such information further, and
thus break code that otherwise behaves correctly on all known architectures,
just because the C standard hasn't decided on one of two possible results?

'Allan



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22  9:27       ` Allan Sandfeld Jensen
@ 2019-03-22  9:50         ` Jakub Jelinek
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Jelinek @ 2019-03-22  9:50 UTC (permalink / raw)
  To: Allan Sandfeld Jensen; +Cc: gcc

On Fri, Mar 22, 2019 at 10:27:38AM +0100, Allan Sandfeld Jensen wrote:
> But getting back to the question, will GCC carry such information further, and
> thus break code that otherwise behaves correctly on all known architectures,
> just because the C standard hasn't decided on one of two possible results?

Of course it will, as will any other optimizing compiler.

An optimizing compiler optimizes on the assumption that undefined behavior
does not happen.  It is not done with the intent to punish those that write
bad code, but with the intent to generate better code for valid code.

Say the standard says that signed integer overflow is undefined behavior;
then not taking advantage of that means significant performance degradation
of e.g. many loops with signed integer induction variables or signed integer
computations in them.  You can compare the performance of normal code vs.
code built with additional -fwrapv.  And in that case we provide a switch
that makes it well-defined behavior at the expense of making code slower.
For out-of-bounds shifts, there is no option like
-fout-of-bound-shift={zero,masked,undefined}; it isn't worth it.

And no, out-of-bounds shifts don't have just two possible results even in HW,
as I said, sometimes it is masked with a different mask from the bitmask
of the type, at other times the architecture has multiple different
instructions and some of them have one behavior and others have another
behavior.
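
Since there is no such switch, code that needs a particular result for an
oversized shift count has to spell it out. A minimal sketch, with
hypothetical helper names; treating a count of 32 as "result is 0" in
set_channel_single_diff is an assumption about what the code at the start
of this thread intended:

```c
#include <stdint.h>

/* Shift count >= 32 yields 0 (one possible choice of semantics). */
static inline uint32_t shl32_zero(uint32_t v, unsigned n)
{
    return n < 32 ? v << n : 0;
}

/* Shift count is reduced modulo 32 (another possible choice). */
static inline uint32_t shl32_masked(uint32_t v, unsigned n)
{
    return v << (n & 31u);
}

/* Same shape as the function from the original report, but the shift by
   (single_diff & 0x20), which is either 0 or 32, is now well defined. */
void set_channel_single_diff(uint32_t *val, uint32_t channel,
                             uint32_t single_diff)
{
    uint32_t ch = channel & 0x7FFFFu;
    *val = (*val & ~ch) | (ch & shl32_zero(0x7FFFFu, single_diff & 0x20u));
}
```

With channel = 0x2 and single_diff = 0x7F the channel bit is cleared; with
single_diff = 0 it is set. Neither case depends on undefined behavior.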

	Jakub

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-21 22:20   ` Allan Sandfeld Jensen
  2019-03-21 22:31     ` Jakub Jelinek
@ 2019-03-22 10:02     ` Andrew Haley
  2019-03-22 10:20       ` Allan Sandfeld Jensen
  1 sibling, 1 reply; 34+ messages in thread
From: Andrew Haley @ 2019-03-22 10:02 UTC (permalink / raw)
  To: Allan Sandfeld Jensen, gcc, Jakub Jelinek

On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote:
> From having fixed UBSAN warnings, I have seen many cases where undefined 
> behavior was performed, but where the code was aware of it and the final 
> result of the expression was well defined nonetheless.

Is this belief about undefined behaviour commonplace among C programmers?
There's nothing in the standard to justify it: any expression which contains
UB is undefined.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 10:02     ` Andrew Haley
@ 2019-03-22 10:20       ` Allan Sandfeld Jensen
  2019-03-22 12:28         ` David Brown
  2019-03-22 13:38         ` Andrew Haley
  0 siblings, 2 replies; 34+ messages in thread
From: Allan Sandfeld Jensen @ 2019-03-22 10:20 UTC (permalink / raw)
  To: gcc; +Cc: Andrew Haley, Jakub Jelinek

On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote:
> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote:
> > From having fixed UBSAN warnings, I have seen many cases where undefined
> > behavior was performed, but where the code was aware of it and the final
> > result of the expression was well defined nonetheless.
> 
> Is this belief about undefined behaviour commonplace among C programmers?
> There's nothing in the standard to justify it: any expression which contains
> UB is undefined.

Yes, even GCC uses undefined behavior when it is considered defined for a
specific architecture, whether it be the result of unaligned access, negative
shifts, etc. Many of the things UBSAN warns about can be found in GCC
itself, the Linux kernel and many other places.

'Allan



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 10:20       ` Allan Sandfeld Jensen
@ 2019-03-22 12:28         ` David Brown
  2019-03-22 12:40           ` Jakub Jelinek
  2019-03-22 13:38         ` Andrew Haley
  1 sibling, 1 reply; 34+ messages in thread
From: David Brown @ 2019-03-22 12:28 UTC (permalink / raw)
  To: Allan Sandfeld Jensen, gcc; +Cc: Andrew Haley, Jakub Jelinek

On 22/03/2019 11:20, Allan Sandfeld Jensen wrote:
> On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote:
>> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote:
>>> From having fixed UBSAN warnings, I have seen many cases where undefined
>>> behavior was performed, but where the code was aware of it and the final
>>> result of the expression was well defined nonetheless.
>>
>> Is this belief about undefined behaviour commonplace among C programmers?
>> There's nothing in the standard to justify it: any expression which contains
>> UB is undefined.
> 
> Yes, even GCC uses undefined behavior when it is considered defined for 
> specific architecture, whether it be the result of unaligned access, negative 
> shifts, etc. There is a lot of the warnings that UBSAN warns about that you 
> will find both in GCC itself, the Linux kernel and many other places.
> 

You are mixing up several things here.

Behaviour can be undefined by the C standard, but defined elsewhere - in
the implementation (i.e., compiler - with whatever flags are chosen),
in the target ABI, in additional standards such as POSIX.

If you compile C code with "gcc -fwrapv", then signed integer overflow
is fully defined (as wrapping behaviour), regardless of what is written
in the C standards.

If you compile a "hello, world!" C program, then the C standards do not
define how those words get on your screen.  As far as /C/ is concerned,
the workings of "printf" are undefined behaviour - because the C
standard does not define them.

Clearly, virtually any C program relies strongly on behaviour that is
not defined by the C standards.


However, expressions such as an overflowing signed shift are generally
not defined /anywhere/.  (It is, of course, possible for a particular C
compiler to define them.)


Unfortunately, it is true that some C programmers think it is fine to
rely on undefined behaviour if it all seems to work fine in their tests
- even when they /know/ it is undefined and unsafe.  All we can do here
is try to educate them - teach them that this is not good practice.

It is also unfortunately the case that with older and weaker compilers,
you could rely on their undocumented treatment of some kinds of
undefined behaviour, and that this was the only way to get efficient
results for certain coding problems.  Correct, fully defined code was
much slower on such compilers, while on better compilers the safe code
works efficiently.  If you can't avoid such old compilers, then at least
use conditional compilation and pre-processor checks to ensure that the
bad code is only used on the poor tools.

And of course there are also lots of programmers whose knowledge of C is
imperfect, or who make mistakes (and very, very few who don't!) - some
code accidentally relies on the results of undefined behaviour.  Even
the gcc and Linux authors are not immune to the occasional bug in their
code.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 12:28         ` David Brown
@ 2019-03-22 12:40           ` Jakub Jelinek
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Jelinek @ 2019-03-22 12:40 UTC (permalink / raw)
  To: David Brown; +Cc: Allan Sandfeld Jensen, gcc, Andrew Haley

On Fri, Mar 22, 2019 at 01:28:39PM +0100, David Brown wrote:
> If you compile a "hello, world!" C program, then the C standards do not
> define how those words get on your screen.  As far as /C/ is concerned,
> the workings of "printf" are undefined behaviour - because the C
> standard does not define them.

No, printf behavior is not undefined behavior.  It is defined in terms of
behavior on the streams (e.g. C99 7.19.2) and the exact streams behavior has
some well defined and some implementation defined parts.

Undefined behavior is not something not specified in the C standard, it is
when the C standard says that in such and such case the behavior is
undefined.

The glossary (C99) says:

implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made

EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit
when a signed integer is shifted right.

undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements

NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable
results, to behaving during translation or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to terminating a translation or
execution (with the issuance of a diagnostic message).
EXAMPLE An example of undefined behavior is the behavior on integer overflow.

	Jakub

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 10:20       ` Allan Sandfeld Jensen
  2019-03-22 12:28         ` David Brown
@ 2019-03-22 13:38         ` Andrew Haley
  2019-03-22 14:35           ` Allan Sandfeld Jensen
  1 sibling, 1 reply; 34+ messages in thread
From: Andrew Haley @ 2019-03-22 13:38 UTC (permalink / raw)
  To: Allan Sandfeld Jensen, gcc; +Cc: Jakub Jelinek

On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote:
> On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote:
>> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote:
>>> From having fixed UBSAN warnings, I have seen many cases where undefined
>>> behavior was performed, but where the code was aware of it and the final
>>> result of the expression was well defined nonetheless.
>>
>> Is this belief about undefined behaviour commonplace among C programmers?
>> There's nothing in the standard to justify it: any expression which contains
>> UB is undefined.
> 
> Yes, even GCC uses undefined behavior when it is considered defined for 
> specific architecture,

If it's defined for a specific architecture it's not undefined. Any compiler
is entitled to do anything with UB, and "anything" includes extending the
language to make it well defined.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 13:38         ` Andrew Haley
@ 2019-03-22 14:35           ` Allan Sandfeld Jensen
  2019-03-22 22:08             ` Andrew Pinski
  0 siblings, 1 reply; 34+ messages in thread
From: Allan Sandfeld Jensen @ 2019-03-22 14:35 UTC (permalink / raw)
  To: gcc; +Cc: Andrew Haley, Jakub Jelinek

On Freitag, 22. März 2019 14:38:10 CET Andrew Haley wrote:
> On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote:
> > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote:
> >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote:
> >>> From having fixed UBSAN warnings, I have seen many cases where undefined
> >>> behavior was performed, but where the code was aware of it and the final
> >>> result of the expression was well defined nonetheless.
> >> 
> >> Is this belief about undefined behaviour commonplace among C programmers?
> >> There's nothing in the standard to justify it: any expression which
> >> contains UB is undefined.
> > 
> > Yes, even GCC uses undefined behavior when it is considered defined for
> > specific architecture,
> 
> If it's defined for a specific architecture it's not undefined. Any compiler
> is entitled to do anything with UB, and "anything" includes extending the
> language to make it well defined.

True, but in the context of "things UBSAN warns about", that includes 
architecture specific details.

And isn't unaligned access real undefined behavior that just happens to work 
on x86 (and newer ARM)?

There is also stuff like type-punning unions, which is not architecture
specific, technically undefined, but which GCC explicitly tolerates (and needs
to since some NEON intrinsics use it).

'Allan




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 14:35           ` Allan Sandfeld Jensen
@ 2019-03-22 22:08             ` Andrew Pinski
  2019-03-22 22:38               ` Andrew Pinski
  2019-03-22 23:42               ` Joseph Myers
  0 siblings, 2 replies; 34+ messages in thread
From: Andrew Pinski @ 2019-03-22 22:08 UTC (permalink / raw)
  To: Allan Sandfeld Jensen; +Cc: GCC Mailing List, Andrew Haley, Jakub Jelinek

On Fri, Mar 22, 2019 at 7:35 AM Allan Sandfeld Jensen
<linux@carewolf.com> wrote:
>
> On Freitag, 22. März 2019 14:38:10 CET Andrew Haley wrote:
> > On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote:
> > > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote:
> > >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote:
> > >>> From having fixed UBSAN warnings, I have seen many cases where undefined
> > >>> behavior was performed, but where the code was aware of it and the final
> > >>> result of the expression was well defined nonetheless.
> > >>
> > >> Is this belief about undefined behaviour commonplace among C programmers?
> > >> There's nothing in the standard to justify it: any expression which
> > >> contains UB is undefined.
> > >
> > > Yes, even GCC uses undefined behavior when it is considered defined for
> > > specific architecture,
> >
> > If it's defined for a specific architecture it's not undefined. Any compiler
> > is entitled to do anything with UB, and "anything" includes extending the
> > language to make it well defined.
>
> True, but in the context of "things UBSAN warns about", that includes
> architecture specific details.
>
> And isn't unaligned access real undefined behavior that just happens to work
> on x86 (and newer ARM)?
>
> There are also stuff like type-punning unions which is not architecture
> specific, technically undefined, but which GCC explicitly tolerates (and needs
> to since some NEON intrinsics use it).

For type-punning unions, GCC goes out of its way and documents that it is
defined.  This is documented at:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict-aliasing
"The practice of reading from a different union member than the one
most recently written to (called “type-punning”) is common. Even with
-fstrict-aliasing, type-punning is allowed, provided the memory is
accessed through the union type."
Maybe it should be more explicit, saying that even though the C/C++
languages make it undefined, GCC makes it defined.
As for unaligned accesses, they don't always work even on x86 because
GCC does assume (in some cases; since 4.0 at least) the alignment that
is required for that type.  For example, the auto-vectorizer does
use that fact about the alignment and types.  There have been some bug
reports about those cases too.   Even on x86, there are some
instructions (mostly SSE) which take only aligned memory locations (in
some micro-architectures, the aligned load/store instructions give
better performance than the unaligned ones) and will cause a fault.
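
A common way to read a value from a possibly unaligned address without
making any alignment promise to the compiler is to go through memcpy.
This is only a sketch, and load_u32_unaligned is a hypothetical name:

```c
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t from p, which need not be suitably aligned.  memcpy
   works on bytes, so no aligned access is implied by the source code. */
uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The value is read in the host's native byte order.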

Thanks,
Andrew Pinski

>
> 'Allan
>
>
>
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 22:08             ` Andrew Pinski
@ 2019-03-22 22:38               ` Andrew Pinski
  2019-03-22 23:42               ` Joseph Myers
  1 sibling, 0 replies; 34+ messages in thread
From: Andrew Pinski @ 2019-03-22 22:38 UTC (permalink / raw)
  To: Allan Sandfeld Jensen; +Cc: GCC Mailing List, Andrew Haley, Jakub Jelinek

On Fri, Mar 22, 2019 at 3:08 PM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Fri, Mar 22, 2019 at 7:35 AM Allan Sandfeld Jensen
> <linux@carewolf.com> wrote:
> >
> > On Freitag, 22. März 2019 14:38:10 CET Andrew Haley wrote:
> > > On 3/22/19 10:20 AM, Allan Sandfeld Jensen wrote:
> > > > On Freitag, 22. März 2019 11:02:39 CET Andrew Haley wrote:
> > > >> On 3/21/19 10:19 PM, Allan Sandfeld Jensen wrote:
> > > >>> From having fixed UBSAN warnings, I have seen many cases where undefined
> > > >>> behavior was performed, but where the code was aware of it and the final
> > > >>> result of the expression was well defined nonetheless.
> > > >>
> > > >> Is this belief about undefined behaviour commonplace among C programmers?
> > > >> There's nothing in the standard to justify it: any expression which
> > > >> contains UB is undefined.
> > > >
> > > > Yes, even GCC uses undefined behavior when it is considered defined for
> > > > specific architecture,
> > >
> > > If it's defined for a specific architecture it's not undefined. Any compiler
> > > is entitled to do anything with UB, and "anything" includes extending the
> > > language to make it well defined.
> >
> > True, but in the context of "things UBSAN warns about", that includes
> > architecture specific details.
> >
> > And isn't unaligned access real undefined behavior that just happens to work
> > on x86 (and newer ARM)?
> >
> > There are also stuff like type-punning unions which is not architecture
> > specific, technically undefined, but which GCC explicitly tolerates (and needs
> > to since some NEON intrinsics use it).
>
> For type-punning unions, GCC goes out of its way and documents that is
> defined.  This documented at:
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict-aliasing
> "The practice of reading from a different union member than the one
> most recently written to (called “type-punning”) is common. Even with
> -fstrict-aliasing, type-punning is allowed, provided the memory is
> accessed through the union type."
> Maybe it should be more explicit saying even though the C/C++
> Langauges make it undefined, GCC makes it defined.

It is also referenced in the implementation-defined section of the manual:
https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html#Structures-unions-enumerations-and-bit-fields-implementation
Oh and it is implementation defined in C90 (but undefined in C99 and
C++) :).  So it is more complex than what you pointed out.

Thanks,
Andrew


> As for unaligned accesses, they don't always work even on x86 because
> GCC does assume (in some cases; since 4.0 at least) the alignment that
> is required for that type.  For an example the auto-vectorizer does
> use that fact about the alignment and types.  There has been some bug
> reports about those cases too.   Even on x86, there are some
> instructions (mostly SSE) which take only aligned memory locations (in
> some micro-architecture, the aligned load/store instructions give
> better performance than the unaligned ones) and will cause a fault.
>
> Thanks,
> Andrew Pinski
>
> >
> > 'Allan
> >
> >
> >
> >

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-22 22:08             ` Andrew Pinski
  2019-03-22 22:38               ` Andrew Pinski
@ 2019-03-22 23:42               ` Joseph Myers
  1 sibling, 0 replies; 34+ messages in thread
From: Joseph Myers @ 2019-03-22 23:42 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Allan Sandfeld Jensen, GCC Mailing List, Andrew Haley, Jakub Jelinek


On Fri, 22 Mar 2019, Andrew Pinski wrote:

> "The practice of reading from a different union member than the one
> most recently written to (called “type-punning”) is common. Even with
> -fstrict-aliasing, type-punning is allowed, provided the memory is
> accessed through the union type."
> Maybe it should be more explicit, saying that even though the C/C++
> languages make it undefined, GCC makes it defined.

Type-punning is defined in C99 TC3 and later (albeit only in a 
non-normative footnote, but the intent is clear).
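
To make the point concrete, here is a minimal sketch of the union
type-punning being discussed, next to the memcpy form that is defined in
every C version. The helper names float_bits_union and float_bits_memcpy
are hypothetical and only for illustration; the sketch assumes float and
uint32_t have the same size, which the static assertion checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float is 32 bits wide on the target. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Hypothetical helper: write one union member, read the other.
 * GCC documents this as allowed even with -fstrict-aliasing, as long as
 * the access goes through the union type. */
static uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* Hypothetical helper: the same reinterpretation via memcpy, which
 * copies the object representation byte by byte. */
static uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Both helpers should return the same bit pattern for any given input.
What would not be covered by the GCC text quoted above is casting a
float * to uint32_t * and dereferencing it, since that access does not
go through the union type.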

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: GCC turns &~ into | due to undefined bit-shift without warning
  2019-03-13 10:18               ` David Brown
@ 2019-03-26 22:51                 ` Vincent Lefevre
  0 siblings, 0 replies; 34+ messages in thread
From: Vincent Lefevre @ 2019-03-26 22:51 UTC (permalink / raw)
  To: gcc

On 2019-03-13 11:18:02 +0100, David Brown wrote:
> On 13/03/2019 03:25, Vincent Lefevre wrote:
> > On 2019-03-12 21:56:59 +0100, David Brown wrote:
> >> I disagree. To generate an unconditional error (rejecting the
> >> program), the compiler would need such proof - such as by tracing
> >> execution from main(). But to generate a warning activated
> >> specifically by the user, there is no such requirement. It's fine
> >> to give a warning based on the code written, rather than on code
> >> that the compiler knows without doubt will be executed.
> > 
> > There's already a bug about spurious warnings on shift counts:
> > 
> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210
> > 
> 
> You can divide code into three groups (with the exact divisions varying
> by compiler switches and version):
> 
> 1. Code that the compiler knows for sure will run in every execution of
> the program, generally because it can track the code flow from main().
> 
> 2. Code that the compiler knows will /not/ run, due to things like
> constant propagation, inlining, etc.
> 
> 3. Code that the compiler does not know if it will run or not.

Actually, more than whether the code will be run or not, what matters
is the concept of reachability. This gives:

1. Code that the compiler knows for sure will run under some
conditions (e.g. particular values of inputs).

2. Code that the compiler knows will never run (what I called dead code).

3. Code for which the compiler can't decide.

> Code in group 1 here is usually quite small.  Code in group 2 can be
> large, especially with C++ header libraries, templates, etc.  The
> compiler will often eliminate such code and avoid generating any object
> code.  gcc used to have a warning for when it found "group 2" code and
> eliminated it - that warning was removed as gcc got smarter, and the
> false positives were overwhelming.
> 
> Most code is in group 3.

It depends on how the code is written. The programmer could try
to avoid group 3 by giving hints to the compiler, e.g. with
__builtin_unreachable(). I wish this were standardized in C.
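
As a rough sketch of such a hint, applied to the shift-count case from
this thread: the function name shift_left_hinted is hypothetical, and
the code assumes a compiler that provides the GCC/Clang builtin
__builtin_unreachable(). Whether a particular warning is actually
silenced by the hint depends on the compiler version and options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shift left by a count the caller promises is
 * below 32.  The assert checks that promise in debug builds (it is
 * compiled out with NDEBUG); the builtin tells the optimizer that the
 * out-of-range path cannot be reached. */
static uint32_t shift_left_hinted(uint32_t value, unsigned count)
{
    assert(count < 32u);
    if (count >= 32u)
        __builtin_unreachable();  /* GCC/Clang extension, not ISO C99/C11 */
    return value << count;        /* count is known to be in 0..31 here */
}
```

Note that the hint is a promise, not a check: if a caller passes a count
of 32 or more in a build without the assert, reaching the builtin is
itself undefined behavior, just like the original shift.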

> I am arguing here that a warning like this should be applied to group 3
> code - you are suggesting it should only apply to group 1.

No, I was just suggesting that the compiler should be smart enough
to detect dead code (when possible). In the past, there was a similar
issue with -Wmaybe-uninitialized, which was rather useless (too many
false positives in complex code). Fortunately, this has improved a
lot.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


end of thread, other threads:[~2019-03-26 22:51 UTC | newest]

Thread overview: 34+ messages
2019-03-11  8:49 GCC turns &~ into | due to undefined bit-shift without warning Moritz Strübe
2019-03-11  9:14 ` Jakub Jelinek
2019-03-11 11:06   ` Moritz Strübe
2019-03-11 11:17     ` Jakub Jelinek
2019-03-20 14:08       ` Moritz Strübe
2019-03-20 14:26         ` Christophe Lyon
2019-03-20 15:39           ` Moritz Strübe
2019-03-20 15:49         ` Jakub Jelinek
2019-03-20 17:36         ` Andrew Haley
2019-03-21  8:17           ` Richard Biener
2019-03-21  8:25             ` Alexander Monakov
2019-03-21  8:35               ` Richard Biener
2019-03-21  8:54           ` Moritz Strübe
2019-03-21  9:52             ` Andrew Haley
2019-03-11 11:24     ` Vincent Lefevre
2019-03-11 12:51       ` David Brown
2019-03-12 15:40         ` Vincent Lefevre
2019-03-12 20:57           ` David Brown
2019-03-13  2:25             ` Vincent Lefevre
2019-03-13 10:18               ` David Brown
2019-03-26 22:51                 ` Vincent Lefevre
2019-03-21 22:20   ` Allan Sandfeld Jensen
2019-03-21 22:31     ` Jakub Jelinek
2019-03-22  9:27       ` Allan Sandfeld Jensen
2019-03-22  9:50         ` Jakub Jelinek
2019-03-22 10:02     ` Andrew Haley
2019-03-22 10:20       ` Allan Sandfeld Jensen
2019-03-22 12:28         ` David Brown
2019-03-22 12:40           ` Jakub Jelinek
2019-03-22 13:38         ` Andrew Haley
2019-03-22 14:35           ` Allan Sandfeld Jensen
2019-03-22 22:08             ` Andrew Pinski
2019-03-22 22:38               ` Andrew Pinski
2019-03-22 23:42               ` Joseph Myers
