public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jonathan Wakely <jwakely@redhat.com>
To: "François Dumont" <frs.dumont@gmail.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>,
	gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization
Date: Fri, 17 Jul 2020 13:58:09 +0100	[thread overview]
Message-ID: <20200717125809.GC2827066@redhat.com> (raw)
In-Reply-To: <88d3aa39-2146-3ea5-8bb2-a4d9b70accbd@gmail.com>

On 17/11/19 22:31 +0100, François Dumont wrote:
>This is an implementation of PR 68303.
>
>I try to use this idea as much as possible to avoid computation of 
>hash codes.
>
>Note that tests are not showing any gain. I guess hash computation 
>must be quite bad to get a benefit from it. So I am only activating it 
>when hash code is not cached and/or when computation is not fast.

If the tests don't show any benefit, why bother making the change?

Does it help the example in the PR?

>    PR libstdc++/68303
>    * include/bits/hashtable_policy.h
>    (_Small_size_threshold<_Hash>): New.
>    (_Hashtable_traits<>): Add _Small_size_threshold std::size_t template
>    parameter, default to 0.
>    (_Hashtable_traits<>::__small_size_threshold): New.
>    (_Hash_code_base<>::_M_hash_code(const __node_type*)): New.
>    (_Equal_helper<>::_S_node_equals): New.
>    * include/bits/hashtable.h:
>    (__small_size_threshold_default<>): New template alias.
>    (_Hashtable<>::find): Add linear lookup when size is lower or equal to
>    _Small_size_threshold.
>    (_Hashtable<>::_M_emplace<_Args>(true_type, _Args&&...)): Add linear
>    lookup when size is lower or equal to _Small_size_threshold.
>    (_Hashtable<>::_M_insert<>(_Arg&&, const _NodeGenerator&, true_type,
>    size_type)): Likewise.
>    (_Hashtable<>::_M_compute_hash_code(const_iterator, const key_type&)):
>    New.
>    (_Hashtable<>::_M_emplace<_Args>(false_type, _Args&&...)): Use latter.
>    (_Hashtable<>::_M_insert(const_iterator, _Arg&&, const _NodeGenerator&,
>    false_type)): Likewise.
>    (_Hashtable<>::_M_find_before_node(const key_type&)): New.
>    (_Hashtable<>::_M_erase(true_type, const key_type&)): Use latter 
>if size
>    is lower or equal to _Small_size_threshold.
>    (_Hashtable<>::_M_erase(false_type, const key_type&)): Likewise.
>    * include/bits/unordered_map.h (__umaps_traits): Adapt using small size
>    threshold set to 20.
>    (__ummap_traits): Likewise.
>    * include/bits/unordered_set.h (__uset_traits, __ummset_traits): 
>Likewise.
>    * src/c++11/hashtable_c++0x.cc: Add <bits/functional_hash.h> include.

The implementation of this seems unnecessarily complicated, and it
changes the mangled name of the _Hashtable_traits type, which changes
the mangled name of _Hashtable too.

It seems to me that instead of __traits_type::__small_size_threshold
being a typedef inside the traits class, we could just use a constexpr
function, maybe as a member of the _Hashtable_traits class template:

--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -201,6 +201,14 @@ namespace __detail
        using __hash_cached = __bool_constant<_Cache_hash_code>;
        using __constant_iterators = __bool_constant<_Constant_iterators>;
        using __unique_keys = __bool_constant<_Unique_keys>;
+
+      template<typename _Hash>
+       static constexpr size_t
+       __small_size_threshold()
+       {
+         return _Cache_hash_code ? 0
+           : __detail::_Small_size_threshold<_Hash>::value;
+       }
      };
  
Regarding the new _Small_size_threshold class template, do we really
need yet another customization point? Can't we just use
__is_fast_hash? e.g.

          return _Cache_hash_code || __is_fast_hash ? 0 : 20;

As an aside, it seems like a terrible design that _Hashtable_traits
doesn't actually depend on the hash function, so any time we want to
extend it to detect some other property of the hash function we need
to add extra template parameters. Far too late to fix now, but we can
extend it by adding members, not by changing its type.



  reply	other threads:[~2020-07-17 12:58 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-17 21:31 François Dumont
2020-07-17 12:58 ` Jonathan Wakely [this message]
2021-08-16 19:03   ` François Dumont
2021-09-23  4:36     ` François Dumont
2021-12-21  6:07       ` François Dumont
2021-12-21  6:28         ` Daniel Krügler
2021-12-21 17:55           ` François Dumont
2021-12-23 12:43             ` Jonathan Wakely
     [not found]         ` <YcRzoSSc534Lg+/F@redhat.com>
2021-12-25 21:39           ` François Dumont
2022-01-05 17:07             ` Jonathan Wakely

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200717125809.GC2827066@redhat.com \
    --to=jwakely@redhat.com \
    --cc=frs.dumont@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).