From: "François Dumont" <frs.dumont@gmail.com>
To: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization
Date: Thu, 23 Sep 2021 06:36:10 +0200 [thread overview]
Message-ID: <1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com> (raw)
In-Reply-To: <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com>
Ping ?
On 16/08/21 9:03 pm, François Dumont wrote:
> On 17/07/20 2:58 pm, Jonathan Wakely wrote:
>> On 17/11/19 22:31 +0100, François Dumont wrote:
>>> This is an implementation of PR 68303.
>>>
>>> I try to use this idea as much as possible to avoid computation of
>>> hash codes.
>>>
>>> Note that tests are not showing any gain. I guess hash computation
>>> must be quite bad to get a benefit from it. So I am only activating
>>> it when hash code is not cached and/or when computation is not fast.
>>
>> If the tests don't show any benefit, why bother making the change?
>
> I eventually managed to demonstrate this optimization through a
> performance test case.
>
>>
>> Does it help the example in the PR?
>
> No, the code attached to the PR just show what the user has done to
> put in place this optim on his side.
>
> What I needed was a slow hash code computation compared to the equal
> operation. I realized that I had to use longer string to achieve this.
>
> Moreover making this optim dependant on
> _Hashtable_traits::__hash_cached was just wrong as we cannot use the
> cached hash code here as the input is a key instance, not a node.
>
> I introduce _Hashtable_hash_traits<_Hash> to offer a single
> customization point as this optim depends highly on the difference
> between a hash code computation and a comparison. Maybe I should put
> it at std namespace scope to ease partial specialization ?
>
> Performance test results before the patch:
>
> unordered_small_size.cc std::unordered_set<int>: 1st insert
> 40r 32u 8s 264000112mem 0pf
> unordered_small_size.cc std::unordered_set<int>: find/erase
> 22r 22u 0s -191999808mem 0pf
> unordered_small_size.cc std::unordered_set<int>: 2nd insert
> 36r 36u 0s 191999776mem 0pf
> unordered_small_size.cc std::unordered_set<int>: erase key
> 25r 25u 0s -191999808mem 0pf
> unordered_small_size.cc std::unordered_set<string>: 1st insert
> 404r 244u 156s -1989936256mem 0pf
> unordered_small_size.cc std::unordered_set<string>: find/erase
> 315r 315u 0s 2061942272mem 0pf
> unordered_small_size.cc std::unordered_set<string>: 2nd insert
> 233r 233u 0s -2061942208mem 0pf
> unordered_small_size.cc std::unordered_set<string>: erase key
> 299r 298u 0s 2061942208mem 0pf
>
> after the patch:
>
> unordered_small_size.cc std::unordered_set<int>: 1st insert
> 41r 33u 7s 264000112mem 0pf
> unordered_small_size.cc std::unordered_set<int>: find/erase
> 24r 25u 1s -191999808mem 0pf
> unordered_small_size.cc std::unordered_set<int>: 2nd insert
> 34r 34u 0s 191999776mem 0pf
> unordered_small_size.cc std::unordered_set<int>: erase key
> 25r 25u 0s -191999808mem 0pf
> unordered_small_size.cc std::unordered_set<string>: 1st insert
> 399r 232u 165s -1989936256mem 0pf
> unordered_small_size.cc std::unordered_set<string>: find/erase
> 196r 197u 0s 2061942272mem 0pf
> unordered_small_size.cc std::unordered_set<string>: 2nd insert
> 221r 222u 0s -2061942208mem 0pf
> unordered_small_size.cc std::unordered_set<string>: erase key
> 178r 178u 0s 2061942208mem 0pf
>
> libstdc++: Optimize operations on small size hashtable [PR 68303]
>
> When hasher is identified as slow and the number of elements is
> limited in the
> container use a brute-force loop on those elements to look for a
> given key using
> the key_equal functor. For the moment the default threshold below
> which the
> container is considered as small is 20.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/68303
> * include/bits/hashtable_policy.h
> (_Hashtable_hash_traits<_Hash>): New.
> (_Hash_code_base<>::_M_hash_code(const
> _Hash_node_value<>&)): New.
> (_Hashtable_base<>::_M_key_equals): New.
> (_Hashtable_base<>::_M_equals): Use latter.
> (_Hashtable_base<>::_M_key_equals_tr): New.
> (_Hashtable_base<>::_M_equals_tr): Use latter.
> * include/bits/hashtable.h
> (_Hashtable<>::__small_size_threshold()): New, use
> _Hashtable_hash_traits.
> (_Hashtable<>::find): Loop through elements to look for
> key if size is lower
> than __small_size_threshold().
> (_Hashtable<>::_M_emplace(true_type, _Args&&...)): Likewise.
> (_Hashtable<>::_M_insert_unique(_Kt&&, _Args&&, const
> _NodeGenerator&)): Likewise.
> (_Hashtable<>::_M_compute_hash_code(const_iterator, const key_type&)):
> New.
> (_Hashtable<>::_M_emplace(const_iterator, false_type,
> _Args&&...)): Use latter.
> (_Hashtable<>::_M_find_before_node(const key_type&)): New.
> (_Hashtable<>::_M_erase(true_type, const key_type&)): Use
> latter.
> (_Hashtable<>::_M_erase(false_type, const key_type&)):
> Likewise.
> * src/c++11/hashtable_c++0x.cc: Include
> <bits/functional_hash.h>.
> * testsuite/util/testsuite_performane.h
> (report_performance): Use 9 width to display memory.
> *
> testsuite/performance/23_containers/insert_erase/unordered_small_size.cc:
> New performance test case.
>
> Tested under Linux x86_64.
>
> Ok to commit ?
>
> François
>
next prev parent reply other threads:[~2021-09-23 4:36 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-17 21:31 François Dumont
2020-07-17 12:58 ` Jonathan Wakely
2021-08-16 19:03 ` François Dumont
2021-09-23 4:36 ` François Dumont [this message]
2021-12-21 6:07 ` François Dumont
2021-12-21 6:28 ` Daniel Krügler
2021-12-21 17:55 ` François Dumont
2021-12-23 12:43 ` Jonathan Wakely
[not found] ` <YcRzoSSc534Lg+/F@redhat.com>
2021-12-25 21:39 ` François Dumont
2022-01-05 17:07 ` Jonathan Wakely
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com \
--to=frs.dumont@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).