public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Enhance std::hash for pointers
@ 2015-05-06 20:10 François Dumont
  2015-05-08  8:02 ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: François Dumont @ 2015-05-06 20:10 UTC (permalink / raw)
  To: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 982 bytes --]

Hi

     Following Marc Glisse comment #4 
on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to 
propose this enhancement to the hash functor for pointers. It simply 
gets rid of the irrelevant bits on pointers hash code based on memory 
alignment of the pointed type. The only drawback I can think of is that 
the type needs to be complete at std::hash instantiation time but is it 
really an issue ?

     IMO it is quite obvious that the resulting hash code will be better 
but if anyone has a good method to prove it I can try to implement it. 
The test I have added in quality.cc is very basic and just reflect 
enhancement following Marc's comment.

2015-05-05  François Dumont <fdumont@gcc.gnu.org>

     * include/bits/functional_hash.h
     (std::__detail::_Lowest_power_of_two<size_t>): New.
     (std::hash<_Tp*>::operator()): Use latter.
     * testsuite/20_util/hash/quality.cc (pointer_quality_test): New.

Tested under Linux x86_64.

François


[-- Attachment #2: pointer_hash.patch --]
[-- Type: text/x-patch, Size: 1782 bytes --]

diff --git a/libstdc++-v3/include/bits/functional_hash.h b/libstdc++-v3/include/bits/functional_hash.h
index d94843f..a217f8a 100644
--- a/libstdc++-v3/include/bits/functional_hash.h
+++ b/libstdc++-v3/include/bits/functional_hash.h
@@ -36,6 +36,29 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
+namespace __detail
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  // Compute highest power of 2 lower or equal to __n.
+  template<size_t __n>
+    struct _Lowest_power_of_two
+    {
+      static const size_t __val
+        = _Lowest_power_of_two< (__n >> 1) >::__val + 1;
+    };
+
+  template<>
+    struct _Lowest_power_of_two<1>
+    { static const size_t __val = 0; };
+
+  template<>
+    struct _Lowest_power_of_two<0>
+    { static const size_t __val = 0; };
+
+_GLIBCXX_END_NAMESPACE_VERSION
+}
+
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /** @defgroup hashes Hashes
@@ -63,7 +86,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     {
       size_t
       operator()(_Tp* __p) const noexcept
-      { return reinterpret_cast<size_t>(__p); }
+      {
+	return reinterpret_cast<size_t>(__p)
+	  >> __detail::_Lowest_power_of_two<alignof(_Tp)>::__val;
+      }
     };
 
   // Explicit specializations for integer types.
diff --git a/libstdc++-v3/testsuite/20_util/hash/quality.cc b/libstdc++-v3/testsuite/20_util/hash/quality.cc
index af417ed..d9c72c7 100644
--- a/libstdc++-v3/testsuite/20_util/hash/quality.cc
+++ b/libstdc++-v3/testsuite/20_util/hash/quality.cc
@@ -164,9 +164,20 @@ quality_test()
     }
 }
 
+void
+pointer_quality_test()
+{
+  bool test __attribute__((unused)) = true;
+
+  double d1, d2;
+  std::hash<double*> dh;
+  VERIFY( dh(&d1) % sizeof(double) != dh(&d2) % sizeof(double) );
+}
+
 int
 main()
 {
   quality_test();
+  pointer_quality_test();
   return 0;
 }



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Enhance std::hash for pointers
  2015-05-06 20:10 Enhance std::hash for pointers François Dumont
@ 2015-05-08  8:02 ` Richard Biener
  2015-05-08 20:18   ` François Dumont
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2015-05-08  8:02 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

On Wed, May 6, 2015 at 10:10 PM, François Dumont <frs.dumont@gmail.com> wrote:
> Hi
>
>     Following Marc Glisse comment #4
> on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to
> propose this enhancement to the hash functor for pointers. It simply gets
> rid of the irrelevant bits on pointers hash code based on memory alignment
> of the pointed type. The only drawback I can think of is that the type needs
> to be complete at std::hash instantiation time but is it really an issue ?
>
>     IMO it is quite obvious that the resulting hash code will be better but

If you use a real hashing function that's not true.  That is, something
else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }.

Richard.

> if anyone has a good method to prove it I can try to implement it. The test
> I have added in quality.cc is very basic and just reflect enhancement
> following Marc's comment.
>
> 2015-05-05  François Dumont <fdumont@gcc.gnu.org>
>
>     * include/bits/functional_hash.h
>     (std::__detail::_Lowest_power_of_two<size_t>): New.
>     (std::hash<_Tp*>::operator()): Use latter.
>     * testsuite/20_util/hash/quality.cc (pointer_quality_test): New.
>
> Tested under Linux x86_64.
>
> François
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Enhance std::hash for pointers
  2015-05-08  8:02 ` Richard Biener
@ 2015-05-08 20:18   ` François Dumont
  2015-05-08 22:11     ` Christopher Jefferson
  0 siblings, 1 reply; 5+ messages in thread
From: François Dumont @ 2015-05-08 20:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: libstdc++, gcc-patches

On 08/05/2015 10:02, Richard Biener wrote:
> On Wed, May 6, 2015 at 10:10 PM, François Dumont <frs.dumont@gmail.com> wrote:
>> Hi
>>
>>      Following Marc Glisse comment #4
>> on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to
>> propose this enhancement to the hash functor for pointers. It simply gets
>> rid of the irrelevant bits on pointers hash code based on memory alignment
>> of the pointed type. The only drawback I can think of is that the type needs
>> to be complete at std::hash instantiation time but is it really an issue ?
>>
>>      IMO it is quite obvious that the resulting hash code will be better but
> If you use a real hashing function that's not true.  That is, something
> else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }.

Sorry, I don't understand your remark. Do you mean that if someone is 
not using std::hash to hash its pointers he won't benefit from the 
enhancement ?

It is a good point however to see that gcc is using something similar 
internally.

François

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Enhance std::hash for pointers
  2015-05-08 20:18   ` François Dumont
@ 2015-05-08 22:11     ` Christopher Jefferson
  2015-05-11 21:06       ` François Dumont
  0 siblings, 1 reply; 5+ messages in thread
From: Christopher Jefferson @ 2015-05-08 22:11 UTC (permalink / raw)
  To: François Dumont; +Cc: Richard Biener, libstdc++, gcc-patches

My concern with accepting this patch is that many of libstdc++'s hash
functions are awful from a mixing point of view -- you would get
exactly the same problem from users who have integers which are always
a multiple of a power of 2 (which is in practice not uncommon). This
would give exactly the same problem.

Rather than try to "fix" one hash function like this, we should just
accept our hash functions might have low quality lower order bits.



On 8 May 2015 at 21:18, François Dumont <frs.dumont@gmail.com> wrote:
> On 08/05/2015 10:02, Richard Biener wrote:
>>
>> On Wed, May 6, 2015 at 10:10 PM, François Dumont <frs.dumont@gmail.com>
>> wrote:
>>>
>>> Hi
>>>
>>>      Following Marc Glisse comment #4
>>> on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to
>>> propose this enhancement to the hash functor for pointers. It simply gets
>>> rid of the irrelevant bits on pointers hash code based on memory
>>> alignment
>>> of the pointed type. The only drawback I can think of is that the type
>>> needs
>>> to be complete at std::hash instantiation time but is it really an issue
>>> ?
>>>
>>>      IMO it is quite obvious that the resulting hash code will be better
>>> but
>>
>> If you use a real hashing function that's not true.  That is, something
>> else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }.
>
>
> Sorry, I don't understand your remark. Do you mean that if someone is not
> using std::hash to hash its pointers he won't benefit from the enhancement ?
>
> It is a good point however to see that gcc is using something similar
> internally.
>
> François
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Enhance std::hash for pointers
  2015-05-08 22:11     ` Christopher Jefferson
@ 2015-05-11 21:06       ` François Dumont
  0 siblings, 0 replies; 5+ messages in thread
From: François Dumont @ 2015-05-11 21:06 UTC (permalink / raw)
  To: Christopher Jefferson; +Cc: Richard Biener, libstdc++, gcc-patches

     My proposal should be consider out of any context. We don't know 
what std::hash is used for in user code, this is why I am proposing this 
patch even if for the moment it doesn't make any difference considering 
only our usage of it.

     Your remark would make more sens if we were talking about changing 
std::unordered_xxx containers number of buckets policy to a power of 2

François


On 09/05/2015 00:10, Christopher Jefferson wrote:
> My concern with accepting this patch is that many of libstdc++'s hash
> functions are awful from a mixing point of view -- you would get
> exactly the same problem from users who have integers which are always
> a multiple of a power of 2 (which is in practice not uncommon). This
> would give exactly the same problem.
>
> Rather than try to "fix" one hash function like this, we should just
> accept our hash functions might have low quality lower order bits.
>
>
>
> On 8 May 2015 at 21:18, François Dumont<frs.dumont@gmail.com>  wrote:
>> On 08/05/2015 10:02, Richard Biener wrote:
>>> On Wed, May 6, 2015 at 10:10 PM, François Dumont<frs.dumont@gmail.com>
>>> wrote:
>>>> Hi
>>>>
>>>>       Following Marc Glisse comment #4
>>>> on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641  I would like to
>>>> propose this enhancement to the hash functor for pointers. It simply gets
>>>> rid of the irrelevant bits on pointers hash code based on memory
>>>> alignment
>>>> of the pointed type. The only drawback I can think of is that the type
>>>> needs
>>>> to be complete at std::hash instantiation time but is it really an issue
>>>> ?
>>>>
>>>>       IMO it is quite obvious that the resulting hash code will be better
>>>> but
>>> If you use a real hashing function that's not true.  That is, something
>>> else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }.
>> Sorry, I don't understand your remark. Do you mean that if someone is not
>> using std::hash to hash its pointers he won't benefit from the enhancement ?
>>
>> It is a good point however to see that gcc is using something similar
>> internally.
>>
>> François
>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-05-11 21:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-06 20:10 Enhance std::hash for pointers François Dumont
2015-05-08  8:02 ` Richard Biener
2015-05-08 20:18   ` François Dumont
2015-05-08 22:11     ` Christopher Jefferson
2015-05-11 21:06       ` François Dumont

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).