public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/59406] New: functions labelled FNV-1a in tr1/functional_hash.h are not FNV-1a
@ 2013-12-06 11:57 g1pi at libero dot it
  2013-12-06 15:39 ` [Bug libstdc++/59406] " g1pi at libero dot it
  2015-03-31 15:17 ` redi at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: g1pi at libero dot it @ 2013-12-06 11:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59406

            Bug ID: 59406
           Summary: functions labelled FNV-1a in tr1/functional_hash.h are
                    not FNV-1a
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: g1pi at libero dot it

Created attachment 31390
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31390&action=edit
preprocessed program

As far as I know, the FNV and FNV-1a algorithms process octects, i.e. unsigned
chars.  (see e.g. http://tools.ietf.org/html/draft-eastlake-fnv-03#section-2 )

The implementations in tr1/functional_hash.h work
on chars, instead, and their output is wrong (in architectures where char is
signed) when the input byte sequence includes bytes with the MSB set.

For example
    std::tr1::_Fnv_hash_base<4>::hash("\x80", 1)
returns 83079839, instead of the correct value 2232128415.

The problem resides in the type cast:

    template<>
    struct _Fnv_hash_base<4>
    {
      template<typename _Tp>
        static size_t
        hash(const _Tp* __ptr, size_t __clength)
        {
          size_t __result = static_cast<size_t>(2166136261UL);
>>>       const char* __cptr = reinterpret_cast<const char*>(__ptr);    <<<
          for (; __clength; --__clength)
            {
              __result ^= static_cast<size_t>(*__cptr++);
              __result *= static_cast<size_t>(16777619UL);
            }
          return __result;
        }
    };

The highlighed line should read:
const unsigned char* __cptr = reinterpret_cast<const unsigned char*>(__ptr);

Since the FNV primes were carefully chosen in order to minimize
collision rates and maximize dispersion, there might be consequences in
the performance of data structures that build on tr1::hash template,
perhaps tr1::unordered_map and tr1::unordered_set.

Here is a short program exposing the bug:

#include <iostream>
#include <tr1/unordered_map>

int main()
{
    std::cout << std::tr1::_Fnv_hash_base<4>::hash("\x80", 1)
        << " computed, 2232128415 expected\n";

}

and here is the output of g++ -v ...:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i486-linux-gnu/4.7/lto-wrapper
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-5'
--with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs
--enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.7 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object
--enable-plugin --enable-objc-gc --enable-targets=all --with-arch-32=i586
--with-tune=generic --enable-checking=release --build=i486-linux-gnu
--host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.7.2 (Debian 4.7.2-5) 
COLLECT_GCC_OPTIONS='-O' '-Wextra' '-Wall' '-ansi' '-pedantic' '-v'
'-save-temps' '-fno-strict-aliasing' '-fwrapv' '-shared-libgcc'
'-mtune=generic' '-march=i586'
 /usr/lib/gcc/i486-linux-gnu/4.7/cc1plus -E -quiet -v -imultiarch
i386-linux-gnu -D_GNU_SOURCE fnv-bad.cc -mtune=generic -march=i586 -ansi
-Wextra -Wall -pedantic -fno-strict-aliasing -fwrapv -O -fpch-preprocess -o
fnv-bad.ii
ignoring nonexistent directory "/usr/local/include/i386-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/i486-linux-gnu/4.7/../../../../i486-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/4.7
 /usr/include/c++/4.7/i486-linux-gnu
 /usr/include/c++/4.7/backward
 /usr/lib/gcc/i486-linux-gnu/4.7/include
 /usr/local/include
 /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed
 /usr/include/i386-linux-gnu
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-O' '-Wextra' '-Wall' '-ansi' '-pedantic' '-v'
'-save-temps' '-fno-strict-aliasing' '-fwrapv' '-shared-libgcc'
'-mtune=generic' '-march=i586'
 /usr/lib/gcc/i486-linux-gnu/4.7/cc1plus -fpreprocessed fnv-bad.ii -quiet
-dumpbase fnv-bad.cc -mtune=generic -march=i586 -auxbase fnv-bad -O -Wextra
-Wall -pedantic -ansi -version -fno-strict-aliasing -fwrapv -o fnv-bad.s
GNU C++ (Debian 4.7.2-5) version 4.7.2 (i486-linux-gnu)
        compiled by GNU C version 4.7.2, GMP version 5.0.5, MPFR version
3.1.0-p10, MPC version 0.9
GGC heuristics: --param ggc-min-expand=63 --param ggc-min-heapsize=62264
GNU C++ (Debian 4.7.2-5) version 4.7.2 (i486-linux-gnu)
        compiled by GNU C version 4.7.2, GMP version 5.0.5, MPFR version
3.1.0-p10, MPC version 0.9
GGC heuristics: --param ggc-min-expand=63 --param ggc-min-heapsize=62264
Compiler executable checksum: 62bfd556e00a93e3d7f66f6876d73826
COLLECT_GCC_OPTIONS='-O' '-Wextra' '-Wall' '-ansi' '-pedantic' '-v'
'-save-temps' '-fno-strict-aliasing' '-fwrapv' '-shared-libgcc'
'-mtune=generic' '-march=i586'
 as -v --32 -o fnv-bad.o fnv-bad.s
GNU assembler version 2.22 (i486-linux-gnu) using BFD version (GNU Binutils for
Debian) 2.22
COMPILER_PATH=/usr/lib/gcc/i486-linux-gnu/4.7/:/usr/lib/gcc/i486-linux-gnu/4.7/:/usr/lib/gcc/i486-linux-gnu/:/usr/lib/gcc/i486-linux-gnu/4.7/:/usr/lib/gcc/i486-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/i486-linux-gnu/4.7/:/usr/lib/gcc/i486-linux-gnu/4.7/../../../i386-linux-gnu/:/usr/lib/gcc/i486-linux-gnu/4.7/../../../../lib/:/lib/i386-linux-gnu/:/lib/../lib/:/usr/lib/i386-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/i486-linux-gnu/4.7/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O' '-Wextra' '-Wall' '-ansi' '-pedantic' '-v'
'-save-temps' '-fno-strict-aliasing' '-fwrapv' '-shared-libgcc'
'-mtune=generic' '-march=i586'
 /usr/lib/gcc/i486-linux-gnu/4.7/collect2 --sysroot=/ --build-id
--no-add-needed --eh-frame-hdr -m elf_i386 --hash-style=both -dynamic-linker
/lib/ld-linux.so.2
/usr/lib/gcc/i486-linux-gnu/4.7/../../../i386-linux-gnu/crt1.o
/usr/lib/gcc/i486-linux-gnu/4.7/../../../i386-linux-gnu/crti.o
/usr/lib/gcc/i486-linux-gnu/4.7/crtbegin.o -L/usr/lib/gcc/i486-linux-gnu/4.7
-L/usr/lib/gcc/i486-linux-gnu/4.7/../../../i386-linux-gnu
-L/usr/lib/gcc/i486-linux-gnu/4.7/../../../../lib -L/lib/i386-linux-gnu
-L/lib/../lib -L/usr/lib/i386-linux-gnu -L/usr/lib/../lib
-L/usr/lib/gcc/i486-linux-gnu/4.7/../../.. fnv-bad.o -lstdc++ -lm -lgcc_s -lgcc
-lc -lgcc_s -lgcc /usr/lib/gcc/i486-linux-gnu/4.7/crtend.o
/usr/lib/gcc/i486-linux-gnu/4.7/../../../i386-linux-gnu/crtn.o

In attachment, the preprocessed program (.ii).  This problem was also reported
to Debian (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=731330).

Best regards,
        g


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug libstdc++/59406] functions labelled FNV-1a in tr1/functional_hash.h are not FNV-1a
  2013-12-06 11:57 [Bug libstdc++/59406] New: functions labelled FNV-1a in tr1/functional_hash.h are not FNV-1a g1pi at libero dot it
@ 2013-12-06 15:39 ` g1pi at libero dot it
  2015-03-31 15:17 ` redi at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: g1pi at libero dot it @ 2013-12-06 15:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59406

--- Comment #2 from g1pi at libero dot it ---
(In reply to Jonathan Wakely from comment #1)
> We're not really making any non-critical changes to TR1 code now.

Yet std::_Fnv_hash_bytes in non-TR1 code has the same problem (lines 116 and
161 of svn://gcc.gnu.org/svn/gcc/trunk/libstdc++-v3/libsupc++/hash_bytes.cc at
revision 205748).  The following program emits 83079839 instead of the expected
FNV-1a result.

#include <iostream>
#include <unordered_map>
int main() { 
    std::cout << std::_Fnv_hash_bytes("\x80", 1, 2166136261UL) << '\n';
}

Don't know if those functions are used somewhere or are leftovers.  The
comments in the file suggest that Murmur hash is used in the std::hash template
instead of FNV-1a.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug libstdc++/59406] functions labelled FNV-1a in tr1/functional_hash.h are not FNV-1a
  2013-12-06 11:57 [Bug libstdc++/59406] New: functions labelled FNV-1a in tr1/functional_hash.h are not FNV-1a g1pi at libero dot it
  2013-12-06 15:39 ` [Bug libstdc++/59406] " g1pi at libero dot it
@ 2015-03-31 15:17 ` redi at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: redi at gcc dot gnu.org @ 2015-03-31 15:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59406

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-03-31
     Ever confirmed|0                           |1

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to g1 from comment #2)
> Don't know if those functions are used somewhere or are leftovers.  The
> comments in the file suggest that Murmur hash is used in the std::hash
> template instead of FNV-1a.

They're just leftovers retained for backwards compatibility in applications
that need the old hashing algorithm, so changing them is probably not a good
idea as it would not be backwards compatible.

I'll add a comment to libsupsc++/hash_bytes.cc saying it's not really FNV-1a
but I don't think we should change it now.

Thanks for pointing it out though.


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-03-31 14:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-06 11:57 [Bug libstdc++/59406] New: functions labelled FNV-1a in tr1/functional_hash.h are not FNV-1a g1pi at libero dot it
2013-12-06 15:39 ` [Bug libstdc++/59406] " g1pi at libero dot it
2015-03-31 15:17 ` redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).