From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <frs.dumont@gmail.com>
Received: from mail-ed1-x52b.google.com (mail-ed1-x52b.google.com
 [IPv6:2a00:1450:4864:20::52b])
 by sourceware.org (Postfix) with ESMTPS id 734C43858C3B;
 Thu, 23 Sep 2021 04:36:14 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 734C43858C3B
Received: by mail-ed1-x52b.google.com with SMTP id v24so18648521eda.3;
 Wed, 22 Sep 2021 21:36:14 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:subject:from:to:cc:references:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding
 :content-language;
 bh=UbrFjMjleD7Nb/RSpgET4nCWK0LWYMKTTkSU3Dq7Xc0=;
 b=IcS54hE2sUWN6AMz9q+w/j2TkUTUW4e9kUb7XjwobjkbHVe6EV8Exvq2J0DZzSyIlO
 jmIjrrLd23/KfdmcM9phq93O9nWc9HXd/rl6B5fCACIfTfQHwqrIcKo76Sra+iWvihQS
 jcEUXwV9QWbvRAuTvTCsx7bSEb2sqBOEE0qSVvjQjDA7cl7h+wHZHg6PP/nHxx+tZLSF
 BXiOxt3x0ECWMxkDNKL4JeNWoMHVpBNcGsmbnpyD9IYiZiYFu5MxEHf6OZr/cSxwK0C3
 YGzC7z2KkMRO+FhpTHBjnmxwcRv4T0v8CThymUmM54uUSqf2uvQEjm3PjDC/7deR1suQ
 PJ6A==
X-Gm-Message-State: AOAM532vUwKK+Gi4DZF9RdQ7zbJ61zpAuD7e+UDVSOdQUbu6rSll/nuj
 VUx8p5di5sfrgaGYf+OPij7/nn00Luc=
X-Google-Smtp-Source: ABdhPJytuHgOJAxcQRu+kwqGRhj5eBDctJLYOyMJY3tgRzF5MdgCKrW88IkkEDHpMg/9/m+/RjxJBw==
X-Received: by 2002:a17:906:6b0c:: with SMTP id
 q12mr2990960ejr.0.1632371772846; 
 Wed, 22 Sep 2021 21:36:12 -0700 (PDT)
Received: from [10.62.4.83] ([109.190.253.15])
 by smtp.googlemail.com with ESMTPSA id q14sm2169411ejc.93.2021.09.22.21.36.11
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Wed, 22 Sep 2021 21:36:12 -0700 (PDT)
Subject: Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization
From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= <frs.dumont@gmail.com>
To: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
References: <88d3aa39-2146-3ea5-8bb2-a4d9b70accbd@gmail.com>
 <20200717125809.GC2827066@redhat.com>
 <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com>
Message-ID: <1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com>
Date: Thu, 23 Sep 2021 06:36:10 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.13.0
MIME-Version: 1.0
In-Reply-To: <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: fr
X-Spam-Status: No, score=1.0 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM,
 NICE_REPLY_A, RCVD_IN_ABUSEAT, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: libstdc++@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libstdc++ mailing list <libstdc++.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/libstdc++/>
List-Post: <mailto:libstdc++@gcc.gnu.org>
List-Help: <mailto:libstdc++-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 23 Sep 2021 04:36:16 -0000

Ping ?

On 16/08/21 9:03 pm, François Dumont wrote:
> On 17/07/20 2:58 pm, Jonathan Wakely wrote:
>> On 17/11/19 22:31 +0100, FranÃ§ois Dumont wrote:
>>> This is an implementation of PR 68303.
>>>
>>> I try to use this idea as much as possible to avoid computation of 
>>> hash codes.
>>>
>>> Note that tests are not showing any gain. I guess hash computation 
>>> must be quite bad to get a benefit from it. So I am only activating 
>>> it when hash code is not cached and/or when computation is not fast.
>>
>> If the tests don't show any benefit, why bother making the change?
>
> I eventually managed to demonstrate this optimization through a 
> performance test case.
>
>>
>> Does it help the example in the PR?
>
> No, the code attached to the PR just show what the user has done to 
> put in place this optim on his side.
>
> What I needed was a slow hash code computation compared to the equal 
> operation. I realized that I had to use longer string to achieve this.
>
> Moreover making this optim dependant on 
> _Hashtable_traits::__hash_cached was just wrong as we cannot use the 
> cached hash code here as the input is a key instance, not a node.
>
> I introduce _Hashtable_hash_traits<_Hash> to offer a single 
> customization point as this optim depends highly on the difference 
> between a hash code computation and a comparison. Maybe I should put 
> it at std namespace scope to ease partial specialization ?
>
> Performance test results before the patch:
>
> unordered_small_size.cc      std::unordered_set<int>: 1st insert      
> 40r   32u    8s 264000112mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: find/erase      
> 22r   22u    0s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: 2nd insert      
> 36r   36u    0s 191999776mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: erase key       
> 25r   25u    0s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 1st insert    
>  404r  244u  156s -1989936256mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: find/erase    
>  315r  315u    0s 2061942272mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 2nd insert    
>  233r  233u    0s -2061942208mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: erase key     
>  299r  298u    0s 2061942208mem    0pf
>
> after the patch:
>
> unordered_small_size.cc      std::unordered_set<int>: 1st insert      
> 41r   33u    7s 264000112mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: find/erase      
> 24r   25u    1s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: 2nd insert      
> 34r   34u    0s 191999776mem    0pf
> unordered_small_size.cc      std::unordered_set<int>: erase key       
> 25r   25u    0s -191999808mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 1st insert    
>  399r  232u  165s -1989936256mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: find/erase    
>  196r  197u    0s 2061942272mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: 2nd insert    
>  221r  222u    0s -2061942208mem    0pf
> unordered_small_size.cc      std::unordered_set<string>: erase key     
>  178r  178u    0s 2061942208mem    0pf
>
>     libstdc++: Optimize operations on small size hashtable [PR 68303]
>
>     When hasher is identified as slow and the number of elements is 
> limited in the
>     container use a brute-force loop on those elements to look for a 
> given key using
>     the key_equal functor. For the moment the default threshold below 
> which the
>     container is considered as small is 20.
>
>     libstdc++-v3/ChangeLog:
>
>             PR libstdc++/68303
>             * include/bits/hashtable_policy.h
>             (_Hashtable_hash_traits<_Hash>): New.
>             (_Hash_code_base<>::_M_hash_code(const 
> _Hash_node_value<>&)): New.
>             (_Hashtable_base<>::_M_key_equals): New.
>             (_Hashtable_base<>::_M_equals): Use latter.
>             (_Hashtable_base<>::_M_key_equals_tr): New.
>             (_Hashtable_base<>::_M_equals_tr): Use latter.
>             * include/bits/hashtable.h
>             (_Hashtable<>::__small_size_threshold()): New, use 
> _Hashtable_hash_traits.
>             (_Hashtable<>::find): Loop through elements to look for 
> key if size is lower
>             than __small_size_threshold().
>             (_Hashtable<>::_M_emplace(true_type, _Args&&...)): Likewise.
>             (_Hashtable<>::_M_insert_unique(_Kt&&, _Args&&, const 
> _NodeGenerator&)): Likewise.
> (_Hashtable<>::_M_compute_hash_code(const_iterator, const key_type&)): 
> New.
>             (_Hashtable<>::_M_emplace(const_iterator, false_type, 
> _Args&&...)): Use latter.
>             (_Hashtable<>::_M_find_before_node(const key_type&)): New.
>             (_Hashtable<>::_M_erase(true_type, const key_type&)): Use 
> latter.
>             (_Hashtable<>::_M_erase(false_type, const key_type&)): 
> Likewise.
>             * src/c++11/hashtable_c++0x.cc: Include 
> <bits/functional_hash.h>.
>             * testsuite/util/testsuite_performane.h
>             (report_performance): Use 9 width to display memory.
>             * 
> testsuite/performance/23_containers/insert_erase/unordered_small_size.cc:
>             New performance test case.
>
> Tested under Linux x86_64.
>
> Ok to commit ?
>
> François
>