public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result
@ 2024-05-20 10:03 raanan.lori at getnexar dot com
  2024-05-20 17:56 ` [Bug c++/115160] " jakub at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: raanan.lori at getnexar dot com @ 2024-05-20 10:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115160

            Bug ID: 115160
           Summary: Enabling undefined behaviour sanitizer causes or'ed
                    bit shift to report wrong result
           Product: gcc
           Version: 9.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: raanan.lori at getnexar dot com
  Target Milestone: ---

I've tried to enable gcc undefined behaviour sanitizer recently and one of my
tests encountered an issue where bit shifts in certain code snippet will report
a wrong result, i narrowed it down to this snippet.

I used compiler explorer to find that the same behaviour is also in the latest
release of gcc 14.4, tested with gcc 9.4 and 11.4

This was run under Unbuntu LTS 22.04 x86_64

My compile flags: `g++ -std=c++17 -O0  -fsanitize=undefined -Wall
-fno-sanitize-recover main.cpp"

The main.cpp code to replicate this issue:
```
#include <iostream>
#include <cstdint>
#include <vector>

int main() {
    std::vector<uint8_t> data;    
    data.emplace_back(1);
    data.emplace_back(0);
    data.emplace_back(0);
    data.emplace_back(0);
    auto b = data.begin();

    std::cout << (int)( (*b++)  | (*b++ << 8) | (*b++ << 16 ) | ( *b << 24 )
)<< std::endl;
    return 0;
}
```
expected output should be : 1
actual output: 256
note: no error or warning is produced for undefined behaviour

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c++/115160] Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result
  2024-05-20 10:03 [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result raanan.lori at getnexar dot com
@ 2024-05-20 17:56 ` jakub at gcc dot gnu.org
  2024-05-21  9:41 ` raanan.lori at getnexar dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-05-20 17:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115160

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
This is just bogus testcase, anything can happen.
If you use
  unsigned char data[] = { 1, 0, 0, 0 }, *b = data;
instead of the
    std::vector<uint8_t> data;    
    data.emplace_back(1);
    data.emplace_back(0);
    data.emplace_back(0);
    data.emplace_back(0);
    auto b = data.begin();
lines, g++ -Wall diagnoses it:
pr115160.C: In function ‘int main()’:
pr115160.C:12:50: warning: operation on ‘b’ may be undefined [-Wsequence-point]
   12 |   std::cout << (int)( (*b++)  | (*b++ << 8) | (*b++ << 16 ) | ( *b <<
24 ) )<< std::endl;
      |                                                 ~^~
pr115160.C:12:50: warning: operation on ‘b’ may be undefined [-Wsequence-point]
pr115160.C:12:50: warning: operation on ‘b’ may be undefined [-Wsequence-point]
pr115160.C:12:50: warning: operation on ‘b’ may be undefined [-Wsequence-point]

While https://wg21.link/p0145r3 in C++17 made evaluation order in some cases
well defined, it certainly didn't do that for non-overloaded operator| or
similar operators.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c++/115160] Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result
  2024-05-20 10:03 [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result raanan.lori at getnexar dot com
  2024-05-20 17:56 ` [Bug c++/115160] " jakub at gcc dot gnu.org
@ 2024-05-21  9:41 ` raanan.lori at getnexar dot com
  2024-05-21  9:53 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: raanan.lori at getnexar dot com @ 2024-05-21  9:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115160

--- Comment #2 from Ra'anan Lori <raanan.lori at getnexar dot com> ---
it's sad to see this response, 
since the major issue here is that a user that wants to use the undefined
behaviour sanitizer in such a case will not only get no warning but also still
have the UB and will get a changed value because of it


the "bogus" test case, is simplified example from a much bigger code base that
illustrates the exact issue.

i would expect that a detection for cpp code (instead of raw c) would also
catch the UB, but it seems it won't.

so i would at least expect the UB sanitizer to warn against not really working
when used with cpp code.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c++/115160] Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result
  2024-05-20 10:03 [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result raanan.lori at getnexar dot com
  2024-05-20 17:56 ` [Bug c++/115160] " jakub at gcc dot gnu.org
  2024-05-21  9:41 ` raanan.lori at getnexar dot com
@ 2024-05-21  9:53 ` jakub at gcc dot gnu.org
  2024-05-21 10:07 ` raanan.lori at getnexar dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-05-21  9:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115160

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
There is no sequence point analyzer in any compiler I'm aware of and I think it
would be extremely hard to implement that.
GCC has -Wsequence-point warning, which can handle lots of cases, but with the
C++ abstractions the front-end can't really know what the different function
calls will do and whether they have some undesirable side-effects or not.
With the pointers, the compiler sees the post-increment operations, with
overloaded operators it doesn't, it sees calls to some operator and it doesn't
know what the operator will do.  Those functions could be pure and not change
anything on the shared object, or they can change it.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c++/115160] Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result
  2024-05-20 10:03 [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result raanan.lori at getnexar dot com
                   ` (2 preceding siblings ...)
  2024-05-21  9:53 ` jakub at gcc dot gnu.org
@ 2024-05-21 10:07 ` raanan.lori at getnexar dot com
  2024-05-21 11:46 ` redi at gcc dot gnu.org
  2024-05-21 12:09 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: raanan.lori at getnexar dot com @ 2024-05-21 10:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115160

--- Comment #4 from Ra'anan Lori <raanan.lori at getnexar dot com> ---
okay, thanks for the explanation

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c++/115160] Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result
  2024-05-20 10:03 [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result raanan.lori at getnexar dot com
                   ` (3 preceding siblings ...)
  2024-05-21 10:07 ` raanan.lori at getnexar dot com
@ 2024-05-21 11:46 ` redi at gcc dot gnu.org
  2024-05-21 12:09 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2024-05-21 11:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115160

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Specifically, std::vector<uint8_t>::iterator::operator++(int) is "just a
function", so the compiler doesn't "know" that it modifies the iterator every
time it's called. To the compiler, the code looks like:

(deref(inc(b)) | (deref(inc(b)) << 8) | (deref(inc(b)) << 16) | (deref(inc(b))
<< 24)

And this isn't clear that this modifies b and that the result depends on the
unspecified order that each deref(inc(b)) subexpression is evaluated.

But that doesn't really matter anyway. I don't think there is actually any
undefined behaviour here at all, so UBsan should not give any errors.
Evaluation of each *b++ subexpression happens before the evaluation of the next
one. But the standard doesn't specify which one is "the next one". It's not
undefined, but it's unspecified. So it's valid for the compiler to evaluate it
left-to-right, but also valid to evaluate it right-to-left, or any other order
as long as the b++ evaluations do not interleave.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c++/115160] Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result
  2024-05-20 10:03 [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result raanan.lori at getnexar dot com
                   ` (4 preceding siblings ...)
  2024-05-21 11:46 ` redi at gcc dot gnu.org
@ 2024-05-21 12:09 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-05-21 12:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115160

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
There is no inc in the last term.

Anyway, simplified example would be
int deref (int *);
int *inc (int *&);

int
baz (int *p)
{
  return deref (inc (p)) | (deref (inc (p)) << 8) | (deref (inc (p)) << 16) |
(deref (p) << 24);
}
and while without the sanitization GCC chooses to evaluate this left to right,
with
sanitization it first turns it into essentially:
  return deref (inc (p)) | ((tmp1 = deref (inc (p))), check_shift (tmp1, 8),
tmp1 << 8) | ((tmp2 = deref (inc (p))), check_shift (tmp2, 16), tmp2 << 16) |
((tmp3 = deref (p)), check_shift (tmp3, 24), tmp3 << 24);
but then further optimizations move the comma expression left hand sides before
the whole |, so it first evaluates the second deref+inc + checks the shift by
8, then
the third deref+inc + checks the shift by 16 etc. and then only the |
expressions and this is still valid due to the unspecified order of | operand
evaluations.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-05-21 12:09 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-20 10:03 [Bug c++/115160] New: Enabling undefined behaviour sanitizer causes or'ed bit shift to report wrong result raanan.lori at getnexar dot com
2024-05-20 17:56 ` [Bug c++/115160] " jakub at gcc dot gnu.org
2024-05-21  9:41 ` raanan.lori at getnexar dot com
2024-05-21  9:53 ` jakub at gcc dot gnu.org
2024-05-21 10:07 ` raanan.lori at getnexar dot com
2024-05-21 11:46 ` redi at gcc dot gnu.org
2024-05-21 12:09 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).