From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by sourceware.org (Postfix) with ESMTPS id 85CE53858D20; Tue, 29 Aug 2023 19:51:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 85CE53858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wm1-x332.google.com with SMTP id 5b1f17b1804b1-3fef34c33d6so45522855e9.3; Tue, 29 Aug 2023 12:51:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1693338718; x=1693943518; darn=gcc.gnu.org; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:from:subject:user-agent:mime-version:date:message-id:from:to :cc:subject:date:message-id:reply-to; bh=gIczB46B/NlmlmSKZ31FzJqLpGU56i/0+S0jUpHT87g=; b=WaufwGli86jNTDH77gLJhGjOXcFqRFZ1E6WEPuGSpExG3fBsQDexWWUF9F7ljxeUjF mXA2LEjOP7NE+jNbV88PNSlm75D7UHRE1KAD9JtieOJwzSaZFDbonpwMjZmzOjYKahLi JcvQzjlcKdIiowh3oOvKPQMyQjv/QDsy++t31oClUvIbkvYPnIknaKlJYXCD+rQvCrYV n6cXznJ2McaFLJN+A0pL0zVhwFewYhcn7OgBUP7B6x7E8T2u8/NmmBrREOfBbhAxRHgS KiQXe9ttbTRa2Vtbt0bvbXaTVk1cA2DpjtrZh5Tt+8a7+0tx5/oLGA4W6snoH4TC73Yx oodA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693338718; x=1693943518; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:from:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gIczB46B/NlmlmSKZ31FzJqLpGU56i/0+S0jUpHT87g=; b=Vm0Bwxgc1Ec7PUEmTHaREaonVvI1v8DvKZM1ay3FLlBKDwDQYBb/xIeGyLISuEIqUR 08u66m64Zib4RO+jnWSRdHHI2UWWyp+N38Bvas22psiXzWXMKojhgHXWw3JhOzGu4oZt CFP4eAO3D2dAm8WH19wZ5W8bbaKR6hYBbkoQ6x7NWZBmPu/ywXAjDq3o/y0rhdTEXwT3 s3YV3u81S9QEsjqzQHbmxfak3AIJv3PzcjPrgMTnVxgpSFzT+Uw0sDCrofcoKNXHYYyO U5EAHBKKiNdDx96Ni7J9OMxxGsNRZoZHwIeLc1vc9t+K/BLy427uQ7KfD/z9jbuxxq9C v+FQ== X-Gm-Message-State: AOJu0YzBCvxadYRo3LKys9Bw+Bs+CGWgalWTakztSTQ8UF6+HSa75Dai +zJsnywhT9ezyqngMN/KnhoSgwsd/5k= X-Google-Smtp-Source: AGHT+IFaNB6dMRD1PbFNV+5jQLLc83Gj4CoXUFKM2ZnDCGQSI/agkJAomds+3/Fw40H67FVK4TVEmA== X-Received: by 2002:a05:600c:214d:b0:3fe:22fd:1b23 with SMTP id v13-20020a05600c214d00b003fe22fd1b23mr229992wml.34.1693338717460; Tue, 29 Aug 2023 12:51:57 -0700 (PDT) Received: from ?IPV6:2a01:e0a:1dc:b1c0:b08c:ed50:111a:ab17? ([2a01:e0a:1dc:b1c0:b08c:ed50:111a:ab17]) by smtp.gmail.com with ESMTPSA id o10-20020a1c750a000000b003fe29dc0ff2sm15085002wmc.21.2023.08.29.12.51.56 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 29 Aug 2023 12:51:56 -0700 (PDT) Message-ID: <8fe52f15-1acc-da0e-9353-35bdc9aa25ef@gmail.com> Date: Tue, 29 Aug 2023 21:51:56 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Subject: Re: [PATCH][Hashtable] Performance optimization through use of insertion hint From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= To: libstdc++ Cc: gcc-patches References: <6841b337-c7a0-55d7-a513-e655c99df01a@gmail.com> Content-Language: en-US In-Reply-To: <6841b337-c7a0-55d7-a513-e655c99df01a@gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Any feedback regarding this patch ? François On 24/07/2023 13:02, François Dumont wrote: > libstdc++: [_Hashtable] Make more use of insertion hint > > >     When inserting an element into an empty bucket we currently insert > the new node >     after the before-begin node so in first position. The drawback of > doing this is >     that we are forced to update the bucket that was containing this > before-begin >     node to point to the newly inserted node. To do so we need at best > to do a modulo >     to find this bucket and when hash code is not cached also compute it. > >     To avoid this side effect it is better to insert after the last > node. Adding >     a member to keep track of this last node would be an abi breaking > change. Still we >     can ask the user to maintain and provide this last node as an > insertion hint. > >     Adapt range insertion methods to try to detect the last node and > then use it as >     the insertion hint. > >     libstdc++-v3/ChangeLog: > >             * include/bits/hashtable_policy.h: >             (_Insert_base::try_emplace): Adapt to use hint. >             (_Insert_base::_M_insert_range): Try to detect container > last node and use it >             as hint for insertions. >             (_Hash_code_base::_M_hash_code(const _Hash&, const > _Hash_node_value<>&)): Remove. >             (_Hash_code_base::_M_hash_code<_H2>(const _H2&, const > _Hash_node_value<>&)): Remove. >             * include/bits/hashtable.h >             (_Hashtable<>::_M_insert_bucket_begin): Add hint parameter > and use it when inserting >             into an empty bucket if hint is the last container node. >             (_Hashtable<>::_InsertInfo): New struct. >             (_Hashtable<>::_M_get_insert_info): New, return latter. >             (_Hashtable<>::_M_insert_multi_node): Add _InsertInfo > parameter. >             (_Hashtable<>::_M_insert_unique_node): Add __node_ptr hint > parameter. >             (_Hashtable<>::_M_emplace_unique(__node_ptr, _Args&&...)): > New. >             (_Hashtable<>::_M_emplace_multi(__node_ptr, _Args&&...)): > New. >             (_Hashtable<>::_M_emplace()): Adapt to use latters. >             (_Hashtable<>::_M_insert_unique): Add __node_ptr parameter. >             (_Hashtable<>::_M_insert_unique_aux): Add __node_ptr > parameter. >             (_Hashtable<>::_M_insert(__node_ptr, _Arg&&, const > _NodeGenerator&, true_type)): >             Use latter. >             (_Hashtable<>::_M_reinsert_node(const_iterator, > node_type&&)): >             Add hint parameter, adapt to use it. >             (_Hashtable<>::_M_reinsert_node_multi): Use hint parameter > if available to extract >             hash code. >             (_Hashtable<>::_M_compute_hash_code(const _Hash&, > __node_ptr, __node_ptr, >             const key_type&)): New. > (_Hashtable<>::_M_compute_hash_code<_H2>(const _H2&, __node_ptr, > __node_ptr, >             const key_type&)): New. >             (_Hashtable<>::_M_merge_unique): Adapt to use latter. > Implement small size >             optimization. >             (_Hashtable<>::_M_get_insert_info(const _Hash&, > __node_ptr, __node_ptr, >             const key_type&)): New. > (_Hashtable<>::_M_get_insert_info<_H2>(const _H2&, __node_ptr, > __node_ptr, >             const key_type&)): New. >             (_Hashtable<>::_M_merge_multi): Adapt to use latter. >             * include/bits/unordered_map.h > (unordered_map<>::insert(node_type&&)): Pass cend as >             hint. >             (unordered_map<>::insert(const_iterator, node_type&&)): > Adapt to use hint. >             * include/bits/unordered_set.h > (unordered_set<>::insert(node_type&&)): Pass cend as >             hint. >             (unordered_set<>::insert(const_iterator, node_type&&)): > Adapt to use hint. >             * > testsuite/23_containers/unordered_multimap/insert/hint.cc: Adapt > implementation >             specific tests. > > Tested under linux x86_64. > > Here is the performance results of this patch, before: > > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > insertions w/o hint      94r   94u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > insertions with perfect hint      95r   94u    0s 191999712mem 0pf > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > insertions with bad hint      94r   94u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > range insertions      88r   88u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 > insertions w/o hint      91r   91u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 > insertions with perfect hint      92r   93u    0s 191999712mem 0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 > insertions with bad hint      93r   93u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 range > insertions      88r   88u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 > insertions w/o hint      94r   95u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 > insertions with perfect hint      94r   95u    0s 191999712mem 0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 > insertions with bad hint      94r   94u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 range > insertions      90r   90u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code cached 2000000 insertions > w/o hint      68r   68u    0s 223999264mem    0pf > unordered_set_hint.cc-thread    hash code cached 2000000 insertions > with perfect hint      67r   67u    0s 223999264mem 0pf > unordered_set_hint.cc-thread    hash code cached 2000000 insertions > with bad hint      68r   68u    0s 223999264mem    0pf > unordered_set_hint.cc-thread    hash code cached 2000000 range > insertions      62r   62u    0s 223999264mem    0pf > > After: > > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > insertions w/o hint      93r   93u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > insertions with perfect hint      58r   59u    0s 191999712mem 0pf > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > insertions with bad hint      93r   94u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code NOT cached 2000000 > range insertions      52r   51u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 > insertions w/o hint      96r   95u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 > insertions with perfect hint      61r   62u    0s 191999712mem 0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 > insertions with bad hint      94r   94u    0s 191999712mem    0pf > unordered_multiset_hint.cc-thread    hash code cached 2000000 range > insertions      52r   52u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 > insertions w/o hint      96r   95u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 > insertions with perfect hint      58r   59u    0s 191999712mem 0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 > insertions with bad hint      94r   94u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code NOT cached 2000000 range > insertions      52r   52u    0s 191999712mem    0pf > unordered_set_hint.cc-thread    hash code cached 2000000 insertions > w/o hint      70r   69u    0s 223999264mem    0pf > unordered_set_hint.cc-thread    hash code cached 2000000 insertions > with perfect hint      67r   67u    0s 223999264mem 0pf > unordered_set_hint.cc-thread    hash code cached 2000000 insertions > with bad hint      67r   67u    0s 223999264mem    0pf > unordered_set_hint.cc-thread    hash code cached 2000000 range > insertions      63r   62u    0s 223999264mem    0pf > > Ok to commit ? > > François