From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x636.google.com (mail-ej1-x636.google.com [IPv6:2a00:1450:4864:20::636]) by sourceware.org (Postfix) with ESMTPS id C65A03858D20; Thu, 7 Sep 2023 16:56:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C65A03858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-ej1-x636.google.com with SMTP id a640c23a62f3a-99c1c66876aso146413566b.2; Thu, 07 Sep 2023 09:56:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1694105767; x=1694710567; darn=gcc.gnu.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=iDc6X/kB7aFFxUVAPVUQZA8Es23TQR8w2xQOFnedkuE=; b=ahO2H1SBwgKc0iAuTqSnQIaYkNBV8ZOhK+ODgfIv/Wt5Bhk5LT/h0bSeoqUCJPaKPE AdQRjAd/RG9+PofVAsz/wDL9T6TfGj/NOum5nGhxIP1DGHmp16OTT9Y+VfJvS5cZfbhZ RiiadKHhBk9/1+fyDJYwQhqBFEZLHFgEV3elA7RKzg4pxlXsShvu54YSy5Wni4vEq6H3 fpu1XaynG7zv5BImDFXWCtowDCJn3TY89UgUrumbGpkkPBXV3XHZspwy6IzloAusa6lK uG/gYtK6EMQA2EN7k8GYFlvJxp/jw0mrZn3+MAK55EmqE854ie/F0ISsvToQAsbZeq/E rDrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1694105767; x=1694710567; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=iDc6X/kB7aFFxUVAPVUQZA8Es23TQR8w2xQOFnedkuE=; b=Opq6MMU2dI4+kp4eOmeHmus8USTTBV89W+J4BkeLnTZcCggwCndB9L440DpkxQmiFc Oe036kYOpIQMcnlGGf/QmPcO9c6fwyfl0ik/5pnde7IikPUMBovSSLt13Do9oY7SHxQX 9tOvL6x3gk3PblUK0zAmnYHegBs+4yIBSBaevfIz9gPt/pILMD5US2IgCU7SPQwjQE1t tsNmmxozB+LUpIuNSwsqJwP8qcLanuhTpzSO9iX+GWSgeY5DMgwZuoQCAd6P0vICaUBV NnURtoxVd5bx17dNz1j8+QrY40O1EV4P4ASN0xuxoLsGY4MGN/OV+Mt/4b6ws88YOmdb HZQA== X-Gm-Message-State: AOJu0Yy5SUQ+rbzyk/ETmQxC+3TQAHgOjgCgFEZvWKFqpUwjOJffUFoA 4k8Q/+xy5RmxAKMuWVQGKnv/tLdgD8Y= X-Google-Smtp-Source: AGHT+IHLiHMG26MKWYNG53IvFpqgGDSxO7yV7zGuciNDDm2JKaZ8yDZ0vgqT6YiGfVEkP2jcjY1WSQ== X-Received: by 2002:a17:906:3112:b0:9a9:d5f4:1a0d with SMTP id 18-20020a170906311200b009a9d5f41a0dmr3479069ejx.45.1694105767156; Thu, 07 Sep 2023 09:56:07 -0700 (PDT) Received: from [10.25.1.214] ([89.207.171.100]) by smtp.gmail.com with ESMTPSA id r22-20020a170906365600b0099b6becb107sm10699118ejb.95.2023.09.07.09.56.02 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 07 Sep 2023 09:56:06 -0700 (PDT) Message-ID: Date: Thu, 7 Sep 2023 18:55:32 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Subject: Re: [PATCH][Hashtable] Performance optimization through use of insertion hint To: Jonathan Wakely Cc: libstdc++ , gcc-patches References: <6841b337-c7a0-55d7-a513-e655c99df01a@gmail.com> <8fe52f15-1acc-da0e-9353-35bdc9aa25ef@gmail.com> Content-Language: en-US From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 01/09/2023 10:59, Jonathan Wakely wrote: > On Tue, 29 Aug 2023 at 20:52, François Dumont via Libstdc++ > wrote: >> Hi >> >> Any feedback regarding this patch ? > This is a fairly large patch I've decided to split it, at least in 2. So just ignore this one, I'll submit new ones once abi issue is fixed. > and before we make any more changes to > unordered containers we have an ABI break to fix: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111050 > > >> François >> >> On 24/07/2023 13:02, François Dumont wrote: >>> libstdc++: [_Hashtable] Make more use of insertion hint >>> >>> >>> When inserting an element into an empty bucket we currently insert >>> the new node >>> after the before-begin node so in first position. The drawback of >>> doing this is >>> that we are forced to update the bucket that was containing this >>> before-begin >>> node to point to the newly inserted node. To do so we need at best >>> to do a modulo >>> to find this bucket and when hash code is not cached also compute it. >>> >>> To avoid this side effect it is better to insert after the last >>> node. Adding >>> a member to keep track of this last node would be an abi breaking >>> change. Still we >>> can ask the user to maintain and provide this last node as an >>> insertion hint. >>> >>> Adapt range insertion methods to try to detect the last node and >>> then use it as >>> the insertion hint. >>> >>> libstdc++-v3/ChangeLog: >>> >>> * include/bits/hashtable_policy.h: >>> (_Insert_base::try_emplace): Adapt to use hint. >>> (_Insert_base::_M_insert_range): Try to detect container >>> last node and use it >>> as hint for insertions. >>> (_Hash_code_base::_M_hash_code(const _Hash&, const >>> _Hash_node_value<>&)): Remove. >>> (_Hash_code_base::_M_hash_code<_H2>(const _H2&, const >>> _Hash_node_value<>&)): Remove. >>> * include/bits/hashtable.h >>> (_Hashtable<>::_M_insert_bucket_begin): Add hint parameter >>> and use it when inserting >>> into an empty bucket if hint is the last container node. >>> (_Hashtable<>::_InsertInfo): New struct. >>> (_Hashtable<>::_M_get_insert_info): New, return latter. >>> (_Hashtable<>::_M_insert_multi_node): Add _InsertInfo >>> parameter. >>> (_Hashtable<>::_M_insert_unique_node): Add __node_ptr hint >>> parameter. >>> (_Hashtable<>::_M_emplace_unique(__node_ptr, _Args&&...)): >>> New. >>> (_Hashtable<>::_M_emplace_multi(__node_ptr, _Args&&...)): >>> New. >>> (_Hashtable<>::_M_emplace()): Adapt to use latters. >>> (_Hashtable<>::_M_insert_unique): Add __node_ptr parameter. >>> (_Hashtable<>::_M_insert_unique_aux): Add __node_ptr >>> parameter. >>> (_Hashtable<>::_M_insert(__node_ptr, _Arg&&, const >>> _NodeGenerator&, true_type)): >>> Use latter. >>> (_Hashtable<>::_M_reinsert_node(const_iterator, >>> node_type&&)): >>> Add hint parameter, adapt to use it. >>> (_Hashtable<>::_M_reinsert_node_multi): Use hint parameter >>> if available to extract >>> hash code. >>> (_Hashtable<>::_M_compute_hash_code(const _Hash&, >>> __node_ptr, __node_ptr, >>> const key_type&)): New. >>> (_Hashtable<>::_M_compute_hash_code<_H2>(const _H2&, __node_ptr, >>> __node_ptr, >>> const key_type&)): New. >>> (_Hashtable<>::_M_merge_unique): Adapt to use latter. >>> Implement small size >>> optimization. >>> (_Hashtable<>::_M_get_insert_info(const _Hash&, >>> __node_ptr, __node_ptr, >>> const key_type&)): New. >>> (_Hashtable<>::_M_get_insert_info<_H2>(const _H2&, __node_ptr, >>> __node_ptr, >>> const key_type&)): New. >>> (_Hashtable<>::_M_merge_multi): Adapt to use latter. >>> * include/bits/unordered_map.h >>> (unordered_map<>::insert(node_type&&)): Pass cend as >>> hint. >>> (unordered_map<>::insert(const_iterator, node_type&&)): >>> Adapt to use hint. >>> * include/bits/unordered_set.h >>> (unordered_set<>::insert(node_type&&)): Pass cend as >>> hint. >>> (unordered_set<>::insert(const_iterator, node_type&&)): >>> Adapt to use hint. >>> * >>> testsuite/23_containers/unordered_multimap/insert/hint.cc: Adapt >>> implementation >>> specific tests. >>> >>> Tested under linux x86_64. >>> >>> Here is the performance results of this patch, before: >>> >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> insertions w/o hint 94r 94u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> insertions with perfect hint 95r 94u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> insertions with bad hint 94r 94u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> range insertions 88r 88u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 >>> insertions w/o hint 91r 91u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 >>> insertions with perfect hint 92r 93u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 >>> insertions with bad hint 93r 93u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 range >>> insertions 88r 88u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 >>> insertions w/o hint 94r 95u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 >>> insertions with perfect hint 94r 95u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 >>> insertions with bad hint 94r 94u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 range >>> insertions 90r 90u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 insertions >>> w/o hint 68r 68u 0s 223999264mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 insertions >>> with perfect hint 67r 67u 0s 223999264mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 insertions >>> with bad hint 68r 68u 0s 223999264mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 range >>> insertions 62r 62u 0s 223999264mem 0pf >>> >>> After: >>> >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> insertions w/o hint 93r 93u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> insertions with perfect hint 58r 59u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> insertions with bad hint 93r 94u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code NOT cached 2000000 >>> range insertions 52r 51u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 >>> insertions w/o hint 96r 95u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 >>> insertions with perfect hint 61r 62u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 >>> insertions with bad hint 94r 94u 0s 191999712mem 0pf >>> unordered_multiset_hint.cc-thread hash code cached 2000000 range >>> insertions 52r 52u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 >>> insertions w/o hint 96r 95u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 >>> insertions with perfect hint 58r 59u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 >>> insertions with bad hint 94r 94u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code NOT cached 2000000 range >>> insertions 52r 52u 0s 191999712mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 insertions >>> w/o hint 70r 69u 0s 223999264mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 insertions >>> with perfect hint 67r 67u 0s 223999264mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 insertions >>> with bad hint 67r 67u 0s 223999264mem 0pf >>> unordered_set_hint.cc-thread hash code cached 2000000 range >>> insertions 63r 62u 0s 223999264mem 0pf >>> >>> Ok to commit ? >>> >>> François