public inbox for gcc-bugs@sourceware.org
* [Bug target/98731] New: s390x: Large classes of std::bitset and std::vector<bool> hash the same
From: doko at debian dot org @ 2021-01-18 14:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98731

            Bug ID: 98731
           Summary: s390x: Large classes of std::bitset and
                    std::vector<bool> hash the same
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: doko at debian dot org
  Target Milestone: ---

[forwarded from https://bugs.debian.org/977638]

Same behavior with GCC 8, 9 and 10.

On s390x, std::hash returns identical values for large classes of
std::bitset and std::vector<bool>:

    $ cat bug.cc
    #include <bitset>
    #include <functional>
    #include <iostream>
    #include <vector>

    int main() {
      std::bitset<2> a("00"), b("01");
      std::vector<bool> c = {false, true, false, true};
      std::vector<bool> d = {true, false, true, false};

      std::bitset<9> e("000000000"), f("010101010");
      std::vector<bool> g = {true, true, true, true, true,
                             true, true, true, true};
      std::vector<bool> h = {false, false, false, true, true,
                             false, false, false, false};

      std::hash<std::bitset<2>> h1;
      std::hash<std::bitset<9>> h2;
      std::hash<std::vector<bool>> h3;

      std::cout << h1(a) << '\n'
                << h1(b) << '\n'
                << h3(c) << '\n'
                << h3(d) << "\n\n"
                << h2(e) << '\n'
                << h2(f) << '\n'
                << h3(g) << '\n'
                << h3(h) << '\n';
    }
    $ g++ -o bug bug.cc
    $ ./bug
    7857072875483051545
    7857072875483051545
    7857072875483051545
    7857072875483051545

    4158372090644325695
    4158372090644325695
    4158372090644325695
    4158372090644325695

It appears that the hash value is completely dependent on the size of
the object in bytes. 1–8-bit values all hash to 7857072875483051545;
9–16-bit values all hash to 4158372090644325695; and though bug.cc
doesn’t demonstrate it, 17-bit values hash to 14756137038141193723.


* [Bug libstdc++/98731] s390x: Large classes of std::bitset and std::vector<bool> hash the same
From: rguenth at gcc dot gnu.org @ 2021-01-19  7:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98731

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |libstdc++

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I bet it's the default specialization.


* [Bug libstdc++/98731] s390x: Large classes of std::bitset and std::vector<bool> hash the same
From: redi at gcc dot gnu.org @ 2021-01-19 10:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98731

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-01-19
           Assignee|unassigned at gcc dot gnu.org      |redi at gcc dot gnu.org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I think we just hash the bytes of the storage, but assume the first bytes are
populated. For a bitset<2> we have an unsigned long where only the two
least significant bits are used. For LE they're in the first byte, but for BE
they're in the last byte.
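
The byte-placement point can be checked directly. The following is a minimal sketch, not libstdc++ code: the helper name low_byte_index is hypothetical, and it only reports which byte of an unsigned long holds the low-order bits on the machine it runs on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of libstdc++): returns the index, within the
// object representation of an unsigned long, of the byte that holds the
// least significant bits.
std::size_t low_byte_index() {
  const unsigned long word = 0x01UL;  // only the lowest bit is set
  unsigned char bytes[sizeof word];
  std::memcpy(bytes, &word, sizeof word);  // inspect the raw storage
  for (std::size_t i = 0; i < sizeof word; ++i)
    if (bytes[i] != 0)
      return i;
  assert(false && "unreachable: word is non-zero");
  return sizeof word;
}
```

On a little-endian target this should return 0; on a big-endian target such as s390x it should return sizeof(unsigned long) - 1, which is why hashing "the first byte" misses the value there.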


* [Bug libstdc++/98731] s390x: Large classes of std::bitset and std::vector<bool> hash the same
From: redi at gcc dot gnu.org @ 2021-01-19 18:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98731

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Yes, the problem is that a bitset<2> uses the two least significant bits of an
unsigned long, so we want to hash a single byte. But we take the address of the
unsigned long and then hash the first byte. For BE we need to hash the last
byte.

With a bitset of more than 64 bits we have a layout like this:

[76543210][...43210]

We assume the value bytes are contiguous and just hash N bytes from the address
of the first long, and then the next 5 bytes. We should be hashing the last 5
bytes of the last word instead.

I haven't looked at hash<vector<bool>> but it's probably the same bug.
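
The comment above suggests hashing the last bytes of the last word on big-endian targets. A different way to sidestep the layout question, shown here only as a sketch under stated assumptions, is to extract the value bytes with shifts so that the result does not depend on byte order. The name value_bytes and the use of std::vector for the words are hypothetical; this is not the libstdc++ implementation or a proposed patch.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical helper: words[0] holds the lowest-numbered bits, words[1] the
// next group, and so on. Returns ceil(nbits / CHAR_BIT) bytes, where byte k
// carries bits [k * CHAR_BIT, (k + 1) * CHAR_BIT) of the value.
std::vector<unsigned char> value_bytes(const std::vector<unsigned long>& words,
                                       std::size_t nbits) {
  const std::size_t bytes_per_word = sizeof(unsigned long);
  const std::size_t nbytes = (nbits + CHAR_BIT - 1) / CHAR_BIT;
  assert(nbytes <= words.size() * bytes_per_word);  // enough storage supplied

  std::vector<unsigned char> out(nbytes);
  for (std::size_t k = 0; k < nbytes; ++k) {
    const unsigned long w = words[k / bytes_per_word];
    const std::size_t shift = (k % bytes_per_word) * CHAR_BIT;
    // Shifts operate on the value, not the storage, so endianness is irrelevant.
    out[k] = static_cast<unsigned char>((w >> shift) & UCHAR_MAX);
  }
  return out;
}
```

Feeding these bytes to any byte-wise hash would give the same result on little-endian and big-endian machines, at the cost of a copy.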


* [Bug libstdc++/98731] s390x: Large classes of std::bitset and std::vector<bool> hash the same
From: redi at gcc dot gnu.org @ 2021-01-21 11:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98731

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|s390x-linux-gnu             |s390x-linux-gnu,
                   |                            |powerpc-*-*, powerpc64-*-*
           Keywords|                            |ABI

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Matthias Klose from comment #0)
> It appears that the hash value is completely dependent on the size of
> the object in bytes.

It's not *completely* dependent on the size. Only the last x.size()%64 bits are
hashed incorrectly. For x.size() < 32 it's completely dependent, because we
never look at the right bits. For larger numbers of bits we look at *some* of
them.
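
As a rough model of that arithmetic for a bitset that fits in one word: assume an 8-byte big-endian word, with the value in the last n bytes and the hash reading the first n bytes, where n = ceil(nbits / 8). This model is taken from the description in this thread, not from the libstdc++ source, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: how many of the bytes that hold the value are actually
// read by the hash, for a bitset of nbits (1..64) in one 8-byte big-endian word.
std::size_t value_bytes_hashed(std::size_t nbits) {
  assert(nbits >= 1 && nbits <= 64);
  const std::size_t word_bytes = 8;
  const std::size_t n = (nbits + 7) / 8;  // bytes needed for nbits
  // Hashed range is [0, n); value range is [word_bytes - n, word_bytes).
  return 2 * n > word_bytes ? 2 * n - word_bytes : 0;
}
```

Under this model no value byte is read up to 32 bits, some are read from 33 bits upward, and all 8 are read at 64 bits.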

Fixing it is an ABI break though, because it would mean that the same value
produces a different hash when compiled with an old GCC or a fixed GCC.
Inserting an element into an unordered_map in one TU and then looking it up in
another TU could fail.
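
The lookup failure can be illustrated with a toy bucket table. Everything here is a stand-in: old_hash and fixed_hash are hypothetical functions that only mimic "ignores the value" versus "uses the value", and Table is not how unordered_map is implemented.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

using Hash = std::size_t (*)(unsigned);

// Stand-in hashes: the "old" one ignores the value, the "fixed" one uses it.
std::size_t old_hash(unsigned) { return 0; }
std::size_t fixed_hash(unsigned v) { return v; }

// Toy hash table with a fixed number of buckets.
struct Table {
  std::array<std::vector<unsigned>, 8> buckets;

  void insert(unsigned key, Hash h) {
    buckets[h(key) % buckets.size()].push_back(key);
  }
  bool contains(unsigned key, Hash h) const {
    const auto& b = buckets[h(key) % buckets.size()];
    return std::find(b.begin(), b.end(), key) != b.end();
  }
};

// Inserts with one hash and looks up with the other, as two TUs built with
// different compiler versions might do on a shared container.
bool found_after_hash_change(unsigned key) {
  Table t;
  t.insert(key, old_hash);
  return t.contains(key, fixed_hash);
}
```

With key 5, the element lands in bucket 0 on insertion but the lookup searches bucket 5, so it is not found even though it is present.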


* [Bug libstdc++/98731] s390x: Large classes of std::bitset and std::vector<bool> hash the same
From: redi at gcc dot gnu.org @ 2021-04-19 10:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98731

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|redi at gcc dot gnu.org            |unassigned at gcc dot gnu.org


* [Bug libstdc++/98731] s390x: Large classes of std::bitset and std::vector<bool> hash the same
From: redi at gcc dot gnu.org @ 2021-09-29 18:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98731

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |miladfarca at gmail dot com

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
*** Bug 102531 has been marked as a duplicate of this bug. ***

