public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: "François Dumont" <frs.dumont@gmail.com>
To: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization
Date: Thu, 23 Sep 2021 06:36:10 +0200	[thread overview]
Message-ID: <1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com> (raw)
In-Reply-To: <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com>

Ping ?

On 16/08/21 9:03 pm, François Dumont wrote:
> On 17/07/20 2:58 pm, Jonathan Wakely wrote:
>> On 17/11/19 22:31 +0100, François Dumont wrote:
>>> This is an implementation of PR 68303.
>>>
>>> I try to use this idea as much as possible to avoid computation of 
>>> hash codes.
>>>
>>> Note that tests are not showing any gain. I guess hash computation 
>>> must be quite bad to get a benefit from it. So I am only activating 
>>> it when hash code is not cached and/or when computation is not fast.
>>
>> If the tests don't show any benefit, why bother making the change?
>
> I eventually managed to demonstrate this optimization through a 
> performance test case.
>
>>
>> Does it help the example in the PR?
>
> No, the code attached to the PR just show what the user has done to 
> put in place this optim on his side.
>
> What I needed was a slow hash code computation compared to the equal 
> operation. I realized that I had to use longer string to achieve this.
>
> Moreover making this optim dependant on 
> _Hashtable_traits::__hash_cached was just wrong as we cannot use the 
> cached hash code here as the input is a key instance, not a node.
>
> I introduce _Hashtable_hash_traits<_Hash> to offer a single 
> customization point as this optim depends highly on the difference 
> between a hash code computation and a comparison. Maybe I should put 
> it at std namespace scope to ease partial specialization ?
>
> Performance test results before the patch:
>
> unordered_small_size.cc      std::unordered_set<int>: 1st insert      
> 40r   32u    8s 264000112mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: find/erase      
> 22r   22u    0s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: 2nd insert      
> 36r   36u    0s 191999776mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: erase key       
> 25r   25u    0s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 1st insert    
>  404r  244u  156s -1989936256mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: find/erase    
>  315r  315u    0s 2061942272mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 2nd insert    
>  233r  233u    0s -2061942208mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: erase key     
>  299r  298u    0s 2061942208mem    0pf
>
> after the patch:
>
> unordered_small_size.cc      std::unordered_set<int>: 1st insert      
> 41r   33u    7s 264000112mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: find/erase      
> 24r   25u    1s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: 2nd insert      
> 34r   34u    0s 191999776mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: erase key       
> 25r   25u    0s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 1st insert    
>  399r  232u  165s -1989936256mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: find/erase    
>  196r  197u    0s 2061942272mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 2nd insert    
>  221r  222u    0s -2061942208mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: erase key     
>  178r  178u    0s 2061942208mem    0pf
>
>     libstdc++: Optimize operations on small size hashtable [PR 68303]
>
>     When hasher is identified as slow and the number of elements is 
> limited in the
>     container use a brute-force loop on those elements to look for a 
> given key using
>     the key_equal functor. For the moment the default threshold below 
> which the
>     container is considered as small is 20.
>
>     libstdc++-v3/ChangeLog:
>
>             PR libstdc++/68303
>             * include/bits/hashtable_policy.h
>             (_Hashtable_hash_traits<_Hash>): New.
>             (_Hash_code_base<>::_M_hash_code(const 
> _Hash_node_value<>&)): New.
>             (_Hashtable_base<>::_M_key_equals): New.
>             (_Hashtable_base<>::_M_equals): Use latter.
>             (_Hashtable_base<>::_M_key_equals_tr): New.
>             (_Hashtable_base<>::_M_equals_tr): Use latter.
>             * include/bits/hashtable.h
>             (_Hashtable<>::__small_size_threshold()): New, use 
> _Hashtable_hash_traits.
>             (_Hashtable<>::find): Loop through elements to look for 
> key if size is lower
>             than __small_size_threshold().
>             (_Hashtable<>::_M_emplace(true_type, _Args&&...)): Likewise.
>             (_Hashtable<>::_M_insert_unique(_Kt&&, _Args&&, const 
> _NodeGenerator&)): Likewise.
> (_Hashtable<>::_M_compute_hash_code(const_iterator, const key_type&)): 
> New.
>             (_Hashtable<>::_M_emplace(const_iterator, false_type, 
> _Args&&...)): Use latter.
>             (_Hashtable<>::_M_find_before_node(const key_type&)): New.
>             (_Hashtable<>::_M_erase(true_type, const key_type&)): Use 
> latter.
>             (_Hashtable<>::_M_erase(false_type, const key_type&)): 
> Likewise.
>             * src/c++11/hashtable_c++0x.cc: Include 
> <bits/functional_hash.h>.
>             * testsuite/util/testsuite_performane.h
>             (report_performance): Use 9 width to display memory.
>             * 
> testsuite/performance/23_containers/insert_erase/unordered_small_size.cc:
>             New performance test case.
>
> Tested under Linux x86_64.
>
> Ok to commit ?
>
> François
>


  reply	other threads:[~2021-09-23  4:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-17 21:31 François Dumont
2020-07-17 12:58 ` Jonathan Wakely
2021-08-16 19:03   ` François Dumont
2021-09-23  4:36     ` François Dumont [this message]
2021-12-21  6:07       ` François Dumont
2021-12-21  6:28         ` Daniel Krügler
2021-12-21 17:55           ` François Dumont
2021-12-23 12:43             ` Jonathan Wakely
     [not found]         ` <YcRzoSSc534Lg+/F@redhat.com>
2021-12-25 21:39           ` François Dumont
2022-01-05 17:07             ` Jonathan Wakely

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com \
    --to=frs.dumont@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).