From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x52b.google.com (mail-ed1-x52b.google.com [IPv6:2a00:1450:4864:20::52b]) by sourceware.org (Postfix) with ESMTPS id 734C43858C3B; Thu, 23 Sep 2021 04:36:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 734C43858C3B Received: by mail-ed1-x52b.google.com with SMTP id v24so18648521eda.3; Wed, 22 Sep 2021 21:36:14 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:from:to:cc:references:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=UbrFjMjleD7Nb/RSpgET4nCWK0LWYMKTTkSU3Dq7Xc0=; b=IcS54hE2sUWN6AMz9q+w/j2TkUTUW4e9kUb7XjwobjkbHVe6EV8Exvq2J0DZzSyIlO jmIjrrLd23/KfdmcM9phq93O9nWc9HXd/rl6B5fCACIfTfQHwqrIcKo76Sra+iWvihQS jcEUXwV9QWbvRAuTvTCsx7bSEb2sqBOEE0qSVvjQjDA7cl7h+wHZHg6PP/nHxx+tZLSF BXiOxt3x0ECWMxkDNKL4JeNWoMHVpBNcGsmbnpyD9IYiZiYFu5MxEHf6OZr/cSxwK0C3 YGzC7z2KkMRO+FhpTHBjnmxwcRv4T0v8CThymUmM54uUSqf2uvQEjm3PjDC/7deR1suQ PJ6A== X-Gm-Message-State: AOAM532vUwKK+Gi4DZF9RdQ7zbJ61zpAuD7e+UDVSOdQUbu6rSll/nuj VUx8p5di5sfrgaGYf+OPij7/nn00Luc= X-Google-Smtp-Source: ABdhPJytuHgOJAxcQRu+kwqGRhj5eBDctJLYOyMJY3tgRzF5MdgCKrW88IkkEDHpMg/9/m+/RjxJBw== X-Received: by 2002:a17:906:6b0c:: with SMTP id q12mr2990960ejr.0.1632371772846; Wed, 22 Sep 2021 21:36:12 -0700 (PDT) Received: from [10.62.4.83] ([109.190.253.15]) by smtp.googlemail.com with ESMTPSA id q14sm2169411ejc.93.2021.09.22.21.36.11 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 22 Sep 2021 21:36:12 -0700 (PDT) Subject: Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= To: "libstdc++@gcc.gnu.org" Cc: gcc-patches References: <88d3aa39-2146-3ea5-8bb2-a4d9b70accbd@gmail.com> <20200717125809.GC2827066@redhat.com> <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com> Message-ID: <1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com> Date: Thu, 23 Sep 2021 06:36:10 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: fr X-Spam-Status: No, score=1.0 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, NICE_REPLY_A, RCVD_IN_ABUSEAT, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libstdc++@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libstdc++ mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Sep 2021 04:36:16 -0000 Ping ? On 16/08/21 9:03 pm, François Dumont wrote: > On 17/07/20 2:58 pm, Jonathan Wakely wrote: >> On 17/11/19 22:31 +0100, François Dumont wrote: >>> This is an implementation of PR 68303. >>> >>> I try to use this idea as much as possible to avoid computation of >>> hash codes. >>> >>> Note that tests are not showing any gain. I guess hash computation >>> must be quite bad to get a benefit from it. So I am only activating >>> it when hash code is not cached and/or when computation is not fast. >> >> If the tests don't show any benefit, why bother making the change? > > I eventually managed to demonstrate this optimization through a > performance test case. > >> >> Does it help the example in the PR? > > No, the code attached to the PR just show what the user has done to > put in place this optim on his side. > > What I needed was a slow hash code computation compared to the equal > operation. I realized that I had to use longer string to achieve this. > > Moreover making this optim dependant on > _Hashtable_traits::__hash_cached was just wrong as we cannot use the > cached hash code here as the input is a key instance, not a node. > > I introduce _Hashtable_hash_traits<_Hash> to offer a single > customization point as this optim depends highly on the difference > between a hash code computation and a comparison. Maybe I should put > it at std namespace scope to ease partial specialization ? > > Performance test results before the patch: > > unordered_small_size.cc      std::unordered_set: 1st insert      > 40r   32u    8s 264000112mem    0pf > unordered_small_size.cc      std::unordered_set: find/erase      > 22r   22u    0s -191999808mem    0pf > unordered_small_size.cc      std::unordered_set: 2nd insert      > 36r   36u    0s 191999776mem    0pf > unordered_small_size.cc      std::unordered_set: erase key       > 25r   25u    0s -191999808mem    0pf > unordered_small_size.cc      std::unordered_set: 1st insert    >  404r  244u  156s -1989936256mem    0pf > unordered_small_size.cc      std::unordered_set: find/erase    >  315r  315u    0s 2061942272mem    0pf > unordered_small_size.cc      std::unordered_set: 2nd insert    >  233r  233u    0s -2061942208mem    0pf > unordered_small_size.cc      std::unordered_set: erase key     >  299r  298u    0s 2061942208mem    0pf > > after the patch: > > unordered_small_size.cc      std::unordered_set: 1st insert      > 41r   33u    7s 264000112mem    0pf > unordered_small_size.cc      std::unordered_set: find/erase      > 24r   25u    1s -191999808mem    0pf > unordered_small_size.cc      std::unordered_set: 2nd insert      > 34r   34u    0s 191999776mem    0pf > unordered_small_size.cc      std::unordered_set: erase key       > 25r   25u    0s -191999808mem    0pf > unordered_small_size.cc      std::unordered_set: 1st insert    >  399r  232u  165s -1989936256mem    0pf > unordered_small_size.cc      std::unordered_set: find/erase    >  196r  197u    0s 2061942272mem    0pf > unordered_small_size.cc      std::unordered_set: 2nd insert    >  221r  222u    0s -2061942208mem    0pf > unordered_small_size.cc      std::unordered_set: erase key     >  178r  178u    0s 2061942208mem    0pf > >     libstdc++: Optimize operations on small size hashtable [PR 68303] > >     When hasher is identified as slow and the number of elements is > limited in the >     container use a brute-force loop on those elements to look for a > given key using >     the key_equal functor. For the moment the default threshold below > which the >     container is considered as small is 20. > >     libstdc++-v3/ChangeLog: > >             PR libstdc++/68303 >             * include/bits/hashtable_policy.h >             (_Hashtable_hash_traits<_Hash>): New. >             (_Hash_code_base<>::_M_hash_code(const > _Hash_node_value<>&)): New. >             (_Hashtable_base<>::_M_key_equals): New. >             (_Hashtable_base<>::_M_equals): Use latter. >             (_Hashtable_base<>::_M_key_equals_tr): New. >             (_Hashtable_base<>::_M_equals_tr): Use latter. >             * include/bits/hashtable.h >             (_Hashtable<>::__small_size_threshold()): New, use > _Hashtable_hash_traits. >             (_Hashtable<>::find): Loop through elements to look for > key if size is lower >             than __small_size_threshold(). >             (_Hashtable<>::_M_emplace(true_type, _Args&&...)): Likewise. >             (_Hashtable<>::_M_insert_unique(_Kt&&, _Args&&, const > _NodeGenerator&)): Likewise. > (_Hashtable<>::_M_compute_hash_code(const_iterator, const key_type&)): > New. >             (_Hashtable<>::_M_emplace(const_iterator, false_type, > _Args&&...)): Use latter. >             (_Hashtable<>::_M_find_before_node(const key_type&)): New. >             (_Hashtable<>::_M_erase(true_type, const key_type&)): Use > latter. >             (_Hashtable<>::_M_erase(false_type, const key_type&)): > Likewise. >             * src/c++11/hashtable_c++0x.cc: Include > . >             * testsuite/util/testsuite_performane.h >             (report_performance): Use 9 width to display memory. >             * > testsuite/performance/23_containers/insert_erase/unordered_small_size.cc: >             New performance test case. > > Tested under Linux x86_64. > > Ok to commit ? > > François >