From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <frs.dumont@gmail.com>
Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com
 [IPv6:2a00:1450:4864:20::42b])
 by sourceware.org (Postfix) with ESMTPS id 63FCB385841F;
 Tue, 21 Dec 2021 06:07:17 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 63FCB385841F
Received: by mail-wr1-x42b.google.com with SMTP id e5so24611272wrc.5;
 Mon, 20 Dec 2021 22:07:17 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:subject:from:to:cc:references:message-id:date
 :user-agent:mime-version:in-reply-to:content-language;
 bh=WJL/axG1131Z1Rf+NuZyfkK8Ut3G9N69n8H7KgPvsOc=;
 b=65nagTSagvhBBU/KnW5C+/pvoJB+Sl2HTVFTYCi2S7D0LyfLXu8rAIde3bHjIEuAE5
 l6KFb7Voz4272bdqOX8N0a+Nyq2oX7N1K6xuM5VF2M3TQmP4L7tt7u+J/5CnCA2iyWug
 uQVnI71nvuuTbt0MpW3rdWEJ7sUV5WqupSzc+zsj1r2xyLZ186SkQfGsqjkMc7w2n5kP
 v6pjLDA53b6fryRZl43O5q6AgACqA0jyTU44ivkpbMS+5Kih5G+0qDbMAKgu+b7j5ikX
 DgvfbltNMARGyLJAqXCb3o4i9s8xRW4dBsZGbAHYGO3QPudDyEqRUNB9OyAopP0fYcCZ
 xUxA==
X-Gm-Message-State: AOAM532JSK55EnpMTwNANk89Jrwg5OM4FJPyGxX9Lf1kCUwzDxiraFHb
 qZFzFP0hhnzmpWR+QAUSsS/crFMB+g0=
X-Google-Smtp-Source: ABdhPJw9L29cTPQuZSMkquZO4nmK20sSfOFql8iCvmCPMEAmpfv3iXFRwjrfwa4ez5ToWtvZxlMvSQ==
X-Received: by 2002:a05:6000:1684:: with SMTP id
 y4mr1225571wrd.26.1640066836135; 
 Mon, 20 Dec 2021 22:07:16 -0800 (PST)
Received: from [10.21.4.78] ([109.190.253.11])
 by smtp.googlemail.com with ESMTPSA id g3sm13943081wrp.79.2021.12.20.22.07.14
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Mon, 20 Dec 2021 22:07:15 -0800 (PST)
Subject: Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization
From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= <frs.dumont@gmail.com>
To: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
References: <88d3aa39-2146-3ea5-8bb2-a4d9b70accbd@gmail.com>
 <20200717125809.GC2827066@redhat.com>
 <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com>
 <1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com>
Message-ID: <73be2079-5046-c06f-f144-1af85052fc50@gmail.com>
Date: Tue, 21 Dec 2021 07:07:14 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.14.0
MIME-Version: 1.0
In-Reply-To: <1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com>
Content-Type: multipart/mixed; boundary="------------666919DA253E3739EE49B3EC"
Content-Language: fr
X-Spam-Status: No, score=-8.3 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM,
 GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_BARRACUDACENTRAL,
 RCVD_IN_DNSWL_NONE, RCVD_IN_SBL_CSS, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: libstdc++@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libstdc++ mailing list <libstdc++.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/libstdc++/>
List-Post: <mailto:libstdc++@gcc.gnu.org>
List-Help: <mailto:libstdc++-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 21 Dec 2021 06:07:21 -0000

This is a multi-part message in MIME format.
--------------666919DA253E3739EE49B3EC
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

Hi

     Is there a chance for this patch to be integrated for next gcc 
release ?

François


On 23/09/21 6:36 am, François Dumont wrote:
> Ping ?
>
> On 16/08/21 9:03 pm, François Dumont wrote:
>> On 17/07/20 2:58 pm, Jonathan Wakely wrote:
>>> On 17/11/19 22:31 +0100, FranÃ§ois Dumont wrote:
>>>> This is an implementation of PR 68303.
>>>>
>>>> I try to use this idea as much as possible to avoid computation of 
>>>> hash codes.
>>>>
>>>> Note that tests are not showing any gain. I guess hash computation 
>>>> must be quite bad to get a benefit from it. So I am only activating 
>>>> it when hash code is not cached and/or when computation is not fast.
>>>
>>> If the tests don't show any benefit, why bother making the change?
>>
>> I eventually managed to demonstrate this optimization through a 
>> performance test case.
>>
>>>
>>> Does it help the example in the PR?
>>
>> No, the code attached to the PR just show what the user has done to 
>> put in place this optim on his side.
>>
>> What I needed was a slow hash code computation compared to the equal 
>> operation. I realized that I had to use longer string to achieve this.
>>
>> Moreover making this optim dependant on 
>> _Hashtable_traits::__hash_cached was just wrong as we cannot use the 
>> cached hash code here as the input is a key instance, not a node.
>>
>> I introduce _Hashtable_hash_traits<_Hash> to offer a single 
>> customization point as this optim depends highly on the difference 
>> between a hash code computation and a comparison. Maybe I should put 
>> it at std namespace scope to ease partial specialization ?
>>
>> Performance test results before the patch:
>>
>> unordered_small_size.cc      std::unordered_set<int>: 1st insert      
>> 40r   32u    8s 264000112mem    0pf
>> unordered_small_size.cc      std::unordered_set<int>: find/erase      
>> 22r   22u    0s -191999808mem    0pf
>> unordered_small_size.cc      std::unordered_set<int>: 2nd insert      
>> 36r   36u    0s 191999776mem    0pf
>> unordered_small_size.cc      std::unordered_set<int>: erase key       
>> 25r   25u    0s -191999808mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: 1st 
>> insert     404r  244u  156s -1989936256mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: 
>> find/erase     315r  315u    0s 2061942272mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: 2nd 
>> insert     233r  233u    0s -2061942208mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: erase key 
>>      299r  298u    0s 2061942208mem    0pf
>>
>> after the patch:
>>
>> unordered_small_size.cc      std::unordered_set<int>: 1st insert      
>> 41r   33u    7s 264000112mem    0pf
>> unordered_small_size.cc      std::unordered_set<int>: find/erase      
>> 24r   25u    1s -191999808mem    0pf
>> unordered_small_size.cc      std::unordered_set<int>: 2nd insert      
>> 34r   34u    0s 191999776mem    0pf
>> unordered_small_size.cc      std::unordered_set<int>: erase key       
>> 25r   25u    0s -191999808mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: 1st 
>> insert     399r  232u  165s -1989936256mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: 
>> find/erase     196r  197u    0s 2061942272mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: 2nd 
>> insert     221r  222u    0s -2061942208mem    0pf
>> unordered_small_size.cc      std::unordered_set<string>: erase key 
>>      178r  178u    0s 2061942208mem    0pf
>>
>>     libstdc++: Optimize operations on small size hashtable [PR 68303]
>>
>>     When hasher is identified as slow and the number of elements is 
>> limited in the
>>     container use a brute-force loop on those elements to look for a 
>> given key using
>>     the key_equal functor. For the moment the default threshold below 
>> which the
>>     container is considered as small is 20.
>>
>>     libstdc++-v3/ChangeLog:
>>
>>             PR libstdc++/68303
>>             * include/bits/hashtable_policy.h
>>             (_Hashtable_hash_traits<_Hash>): New.
>>             (_Hash_code_base<>::_M_hash_code(const 
>> _Hash_node_value<>&)): New.
>>             (_Hashtable_base<>::_M_key_equals): New.
>>             (_Hashtable_base<>::_M_equals): Use latter.
>>             (_Hashtable_base<>::_M_key_equals_tr): New.
>>             (_Hashtable_base<>::_M_equals_tr): Use latter.
>>             * include/bits/hashtable.h
>>             (_Hashtable<>::__small_size_threshold()): New, use 
>> _Hashtable_hash_traits.
>>             (_Hashtable<>::find): Loop through elements to look for 
>> key if size is lower
>>             than __small_size_threshold().
>>             (_Hashtable<>::_M_emplace(true_type, _Args&&...)): Likewise.
>>             (_Hashtable<>::_M_insert_unique(_Kt&&, _Args&&, const 
>> _NodeGenerator&)): Likewise.
>> (_Hashtable<>::_M_compute_hash_code(const_iterator, const 
>> key_type&)): New.
>>             (_Hashtable<>::_M_emplace(const_iterator, false_type, 
>> _Args&&...)): Use latter.
>>             (_Hashtable<>::_M_find_before_node(const key_type&)): New.
>>             (_Hashtable<>::_M_erase(true_type, const key_type&)): Use 
>> latter.
>>             (_Hashtable<>::_M_erase(false_type, const key_type&)): 
>> Likewise.
>>             * src/c++11/hashtable_c++0x.cc: Include 
>> <bits/functional_hash.h>.
>>             * testsuite/util/testsuite_performane.h
>>             (report_performance): Use 9 width to display memory.
>>             * 
>> testsuite/performance/23_containers/insert_erase/unordered_small_size.cc:
>>             New performance test case.
>>
>> Tested under Linux x86_64.
>>
>> Ok to commit ?
>>
>> François
>>
>


--------------666919DA253E3739EE49B3EC
Content-Type: text/x-patch; charset=UTF-8;
 name="pr68303.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="pr68303.patch"

diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 6e2d4c10cfe..6b2c6b99fef 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -419,6 +419,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       _M_uses_single_bucket() const
       { return _M_uses_single_bucket(_M_buckets); }
 
+      static constexpr size_t
+      __small_size_threshold()
+      {
+	return
+	  __detail::_Hashtable_hash_traits<_Hash>::__small_size_threshold();
+      }
+
       __hashtable_alloc&
       _M_base_alloc() { return *this; }
 
@@ -788,6 +795,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       _M_bucket_index(__hash_code __c) const
       { return __hash_code_base::_M_bucket_index(__c, _M_bucket_count); }
 
+      __node_base_ptr
+      _M_find_before_node(const key_type&);
+
       // Find and insert helper functions and types
       // Find the node before the one matching the criteria.
       __node_base_ptr
@@ -831,6 +841,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       __node_base_ptr
       _M_get_previous_node(size_type __bkt, __node_ptr __n);
 
+      pair<const_iterator, __hash_code>
+      _M_compute_hash_code(const_iterator __hint, const key_type& __k) const;
+
       // Insert node __n with hash code __code, in bucket __bkt if no
       // rehash (assumes no element with same key already present).
       // Takes ownership of __n if insertion succeeds, throws otherwise.
@@ -1126,7 +1139,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       void _M_rehash(size_type __bkt_count, const __rehash_state& __state);
     };
 
-
   // Definitions of class template _Hashtable's out-of-line member functions.
   template<typename _Key, typename _Value, typename _Alloc,
 	   typename _ExtractKey, typename _Equal,
@@ -1628,6 +1640,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     find(const key_type& __k)
     -> iterator
     {
+      if (size() <= __small_size_threshold())
+	{
+	  for (auto __it = begin(); __it != end(); ++__it)
+	    if (this->_M_key_equals(__k, *__it._M_cur))
+	      return __it;
+	  return end();
+	}
+
       __hash_code __code = this->_M_hash_code(__k);
       std::size_t __bkt = _M_bucket_index(__code);
       return iterator(_M_find_node(__bkt, __k, __code));
@@ -1643,6 +1663,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     find(const key_type& __k) const
     -> const_iterator
     {
+      if (size() <= __small_size_threshold())
+	{
+	  for (auto __it = begin(); __it != end(); ++__it)
+	    if (this->_M_key_equals(__k, *__it._M_cur))
+	      return __it;
+	  return end();
+	}
+
       __hash_code __code = this->_M_hash_code(__k);
       std::size_t __bkt = _M_bucket_index(__code);
       return const_iterator(_M_find_node(__bkt, __k, __code));
@@ -1855,6 +1883,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       }
 #endif
 
+  // Find the node before the one whose key compares equal to k.
+  // Return nullptr if no node is found.
+  template<typename _Key, typename _Value, typename _Alloc,
+	   typename _ExtractKey, typename _Equal,
+	   typename _Hash, typename _RangeHash, typename _Unused,
+	   typename _RehashPolicy, typename _Traits>
+    auto
+    _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
+	       _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
+    _M_find_before_node(const key_type& __k)
+    -> __node_base_ptr
+    {
+      __node_base_ptr __prev_p = &_M_before_begin;
+      if (!__prev_p->_M_nxt)
+	return nullptr;
+
+      for (__node_ptr __p = static_cast<__node_ptr>(__prev_p->_M_nxt);
+	   __p != nullptr;
+	   __p = __p->_M_next())
+	{
+	  if (this->_M_key_equals(__k, *__p))
+	    return __prev_p;
+
+	  __prev_p = __p;
+	}
+
+      return nullptr;
+    }
+
   // Find the node before the one whose key compares equal to k in the bucket
   // bkt. Return nullptr if no node is found.
   template<typename _Key, typename _Value, typename _Alloc,
@@ -2003,11 +2060,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	// First build the node to get access to the hash code
 	_Scoped_node __node { this, std::forward<_Args>(__args)...  };
 	const key_type& __k = _ExtractKey{}(__node._M_node->_M_v());
+	if (size() <= __small_size_threshold())
+	  {
+	    for (auto __it = begin(); __it != end(); ++__it)
+	      if (this->_M_key_equals(__k, *__it._M_cur))
+		// There is already an equivalent node, no insertion
+		return { __it, false };
+	  }
+
 	__hash_code __code = this->_M_hash_code(__k);
 	size_type __bkt = _M_bucket_index(__code);
+	if (size() > __small_size_threshold())
 	  if (__node_ptr __p = _M_find_node(__bkt, __k, __code))
 	    // There is already an equivalent node, no insertion
-	  return std::make_pair(iterator(__p), false);
+	    return { iterator(__p), false };
 
 	// Insert the node
 	auto __pos = _M_insert_unique_node(__bkt, __code, __node._M_node);
@@ -2031,13 +2097,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	_Scoped_node __node { this, std::forward<_Args>(__args)...  };
 	const key_type& __k = _ExtractKey{}(__node._M_node->_M_v());
 
-	__hash_code __code = this->_M_hash_code(__k);
+	auto __res = this->_M_compute_hash_code(__hint, __k);
 	auto __pos
-	  = _M_insert_multi_node(__hint._M_cur, __code, __node._M_node);
+	  = _M_insert_multi_node(__res.first._M_cur, __res.second,
+				 __node._M_node);
 	__node._M_node = nullptr;
 	return __pos;
       }
 
+  template<typename _Key, typename _Value, typename _Alloc,
+	   typename _ExtractKey, typename _Equal,
+	   typename _Hash, typename _RangeHash, typename _Unused,
+	   typename _RehashPolicy, typename _Traits>
+    auto
+    _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
+	       _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
+    _M_compute_hash_code(const_iterator __hint, const key_type& __k) const
+    -> pair<const_iterator, __hash_code>
+    {
+      if (size() <= __small_size_threshold())
+	{
+	  if (__hint != cend())
+	    {
+	      for (auto __it = __hint; __it != cend(); ++__it)
+		if (this->_M_key_equals(__k, *__it._M_cur))
+		  return { __it, this->_M_hash_code(*__it._M_cur) };
+	    }
+
+	  for (auto __it = cbegin(); __it != __hint; ++__it)
+	    if (this->_M_key_equals(__k, *__it._M_cur))
+	      return { __it, this->_M_hash_code(*__it._M_cur) };
+	}
+
+      return { __hint, this->_M_hash_code(__k) };
+    }
+
   template<typename _Key, typename _Value, typename _Alloc,
 	   typename _ExtractKey, typename _Equal,
 	   typename _Hash, typename _RangeHash, typename _Unused,
@@ -2136,9 +2230,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		       const _NodeGenerator& __node_gen)
       -> pair<iterator, bool>
       {
+	if (size() <= __small_size_threshold())
+	  for (auto __it = begin(); __it != end(); ++__it)
+	    if (this->_M_key_equals_tr(__k, *__it._M_cur))
+	      return { __it, false };
+
 	__hash_code __code = this->_M_hash_code_tr(__k);
 	size_type __bkt = _M_bucket_index(__code);
 
+	if (size() > __small_size_threshold())
 	  if (__node_ptr __node = _M_find_node_tr(__bkt, __k, __code))
 	    return { iterator(__node), false };
 
@@ -2172,11 +2272,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	_Scoped_node __node{ __node_gen(std::forward<_Arg>(__v)), this };
 
 	// Second compute the hash code so that we don't rehash if it throws.
-	__hash_code __code
-	  = this->_M_hash_code(_ExtractKey{}(__node._M_node->_M_v()));
+	auto __res = this->_M_compute_hash_code(
+	  __hint, _ExtractKey{}(__node._M_node->_M_v()));
 
 	auto __pos
-	  = _M_insert_multi_node(__hint._M_cur, __code, __node._M_node);
+	  = _M_insert_multi_node(__res.first._M_cur, __res.second,
+				 __node._M_node);
 	__node._M_node = nullptr;
 	return __pos;
       }
@@ -2238,17 +2339,34 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	       _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
     _M_erase(true_type /* __uks */, const key_type& __k)
     -> size_type
+    {
+      __node_base_ptr __prev_n;
+      __node_ptr __n;
+      std::size_t __bkt;
+      if (size() <= __small_size_threshold())
+	{
+	  __prev_n = _M_find_before_node(__k);
+	  if (!__prev_n)
+	    return 0;
+
+	  // We found a matching node, erase it.
+	  __n = static_cast<__node_ptr>(__prev_n->_M_nxt);
+	  __bkt = _M_bucket_index(*__n);
+	}
+      else
 	{
 	  __hash_code __code = this->_M_hash_code(__k);
-      std::size_t __bkt = _M_bucket_index(__code);
+	  __bkt = _M_bucket_index(__code);
 
 	  // Look for the node before the first matching node.
-      __node_base_ptr __prev_n = _M_find_before_node(__bkt, __k, __code);
+	  __prev_n = _M_find_before_node(__bkt, __k, __code);
 	  if (!__prev_n)
 	    return 0;
 
 	  // We found a matching node, erase it.
-      __node_ptr __n = static_cast<__node_ptr>(__prev_n->_M_nxt);
+	  __n = static_cast<__node_ptr>(__prev_n->_M_nxt);
+	}
+
       _M_erase(__bkt, __prev_n, __n);
       return 1;
     }
@@ -2262,22 +2380,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	       _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
     _M_erase(false_type /* __uks */, const key_type& __k)
     -> size_type
+    {
+      std::size_t __bkt;
+      __node_base_ptr __prev_n;
+      __node_ptr __n;
+      if (size() <= __small_size_threshold())
+	{
+	  __prev_n = _M_find_before_node(__k);
+	  if (!__prev_n)
+	    return 0;
+
+	  // We found a matching node, erase it.
+	  __n = static_cast<__node_ptr>(__prev_n->_M_nxt);
+	  __bkt = _M_bucket_index(*__n);
+	}
+      else
 	{
 	  __hash_code __code = this->_M_hash_code(__k);
-      std::size_t __bkt = _M_bucket_index(__code);
+	  __bkt = _M_bucket_index(__code);
 
 	  // Look for the node before the first matching node.
-      __node_base_ptr __prev_n = _M_find_before_node(__bkt, __k, __code);
+	  __prev_n = _M_find_before_node(__bkt, __k, __code);
 	  if (!__prev_n)
 	    return 0;
 
+	  __n = static_cast<__node_ptr>(__prev_n->_M_nxt);
+	}
+
       // _GLIBCXX_RESOLVE_LIB_DEFECTS
       // 526. Is it undefined if a function in the standard changes
       // in parameters?
       // We use one loop to find all matching nodes and another to deallocate
       // them so that the key stays valid during the first loop. It might be
       // invalidated indirectly when destroying nodes.
-      __node_ptr __n = static_cast<__node_ptr>(__prev_n->_M_nxt);
       __node_ptr __n_last = __n->_M_next();
       while (__n_last && this->_M_node_equals(*__n, *__n_last))
 	__n_last = __n_last->_M_next();
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 0b5443fc187..b4a8af081ce 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -246,6 +246,20 @@ namespace __detail
       using __unique_keys = __bool_constant<_Unique_keys>;
     };
 
+  /**
+   *  struct _Hashtable_hash_traits
+   *
+   *  Important traits for hash tables depending on associated hasher.
+   *
+   */
+  template<typename _Hash>
+    struct _Hashtable_hash_traits
+    {
+      static constexpr std::size_t
+      __small_size_threshold()
+      { return std::__is_fast_hash<_Hash>::value ? 0 : 20; }
+    };
+
   /**
    *  struct _Hash_node_base
    *
@@ -1105,10 +1119,12 @@ namespace __detail
 			_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits,
 			true_type /* Has load factor */>
     {
+    private:
       using __hashtable = _Hashtable<_Key, _Value, _Alloc, _ExtractKey,
 				     _Equal, _Hash, _RangeHash, _Unused,
 				     _RehashPolicy, _Traits>;
 
+    public:
       float
       max_load_factor() const noexcept
       {
@@ -1263,6 +1279,14 @@ namespace __detail
 		const _Hash_node_value<_Value, __cache_hash_code>& __n) const
 	{ return _M_hash_code(_ExtractKey{}(__n._M_v())); }
 
+      __hash_code
+      _M_hash_code(const _Hash_node_value<_Value, false>& __n) const
+      { return _M_hash_code(_ExtractKey{}(__n._M_v())); }
+
+      __hash_code
+      _M_hash_code(const _Hash_node_value<_Value, true>& __n) const
+      { return __n._M_hash_code; }
+
       std::size_t
       _M_bucket_index(__hash_code __c, std::size_t __bkt_count) const
       { return _RangeHash{}(__c, __bkt_count); }
@@ -1273,17 +1297,14 @@ namespace __detail
 	noexcept( noexcept(declval<const _Hash&>()(declval<const _Key&>()))
 		  && noexcept(declval<const _RangeHash&>()((__hash_code)0,
 							   (std::size_t)0)) )
-      {
-	return _RangeHash{}(_M_hash_code(_ExtractKey{}(__n._M_v())),
-			    __bkt_count);
-      }
+      { return _M_bucket_index(_M_hash_code(__n), __bkt_count); }
 
       std::size_t
       _M_bucket_index(const _Hash_node_value<_Value, true>& __n,
 		      std::size_t __bkt_count) const
 	noexcept( noexcept(declval<const _RangeHash&>()((__hash_code)0,
 							(std::size_t)0)) )
-      { return _RangeHash{}(__n._M_hash_code, __bkt_count); }
+      { return _M_bucket_index(__n._M_hash_code, __bkt_count); }
 
       void
       _M_store_code(_Hash_node_code_cache<false>&, __hash_code) const
@@ -1641,18 +1662,19 @@ namespace __detail
       { }
 
       bool
-      _M_equals(const _Key& __k, __hash_code __c,
-		const _Hash_node_value<_Value, __hash_cached::value>& __n) const
+      _M_key_equals(const _Key& __k,
+		    const _Hash_node_value<_Value,
+					   __hash_cached::value>& __n) const
       {
 	static_assert(__is_invocable<const _Equal&, const _Key&, const _Key&>{},
 	  "key equality predicate must be invocable with two arguments of "
 	  "key type");
-	return _S_equals(__c, __n) && _M_eq()(__k, _ExtractKey{}(__n._M_v()));
+	return _M_eq()(__k, _ExtractKey{}(__n._M_v()));
       }
 
       template<typename _Kt>
 	bool
-	_M_equals_tr(const _Kt& __k, __hash_code __c,
+	_M_key_equals_tr(const _Kt& __k,
 			 const _Hash_node_value<_Value,
 					     __hash_cached::value>& __n) const
 	{
@@ -1660,16 +1682,28 @@ namespace __detail
 	    __is_invocable<const _Equal&, const _Kt&, const _Key&>{},
 	    "key equality predicate must be invocable with two arguments of "
 	    "key type");
-	  return _S_equals(__c, __n) && _M_eq()(__k, _ExtractKey{}(__n._M_v()));
+	  return _M_eq()(__k, _ExtractKey{}(__n._M_v()));
 	}
 
+      bool
+      _M_equals(const _Key& __k, __hash_code __c,
+		const _Hash_node_value<_Value, __hash_cached::value>& __n) const
+      { return _S_equals(__c, __n) && _M_key_equals(__k, __n); }
+
+      template<typename _Kt>
+	bool
+	_M_equals_tr(const _Kt& __k, __hash_code __c,
+		     const _Hash_node_value<_Value,
+					    __hash_cached::value>& __n) const
+	{ return _S_equals(__c, __n) && _M_key_equals_tr(__k, __n); }
+
       bool
       _M_node_equals(
 	const _Hash_node_value<_Value, __hash_cached::value>& __lhn,
 	const _Hash_node_value<_Value, __hash_cached::value>& __rhn) const
       {
 	return _S_node_equals(__lhn, __rhn)
-	  && _M_eq()(_ExtractKey{}(__lhn._M_v()), _ExtractKey{}(__rhn._M_v()));
+	  && _M_key_equals(_ExtractKey{}(__lhn._M_v()), __rhn);
       }
 
       void
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index b9dec61b2ea..116192dfcf1 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -30,6 +30,7 @@
 #include <tuple>
 #include <ext/aligned_buffer.h>
 #include <ext/alloc_traits.h>
+#include <bits/functional_hash.h>
 #include <bits/hashtable_policy.h>
 
 namespace std _GLIBCXX_VISIBILITY(default)
diff --git a/libstdc++-v3/testsuite/performance/23_containers/insert_erase/unordered_small_size.cc b/libstdc++-v3/testsuite/performance/23_containers/insert_erase/unordered_small_size.cc
new file mode 100644
index 00000000000..ae63c15b5da
--- /dev/null
+++ b/libstdc++-v3/testsuite/performance/23_containers/insert_erase/unordered_small_size.cc
@@ -0,0 +1,125 @@
+// { dg-do run { target c++11 } }
+
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <string>
+#include <sstream>
+#include <vector>
+#include <unordered_set>
+#include <testsuite_performance.h>
+
+namespace
+{
+  const int nb_elements = 20;
+  const int nb_insts = 150000;
+
+  template<typename _ElemType>
+    void bench(const char* desc, const std::vector<_ElemType>& elems)
+    {
+      using namespace __gnu_test;
+
+      time_counter time;
+      resource_counter resource;
+
+      std::vector<std::unordered_set<_ElemType>> insts(nb_insts);
+      for (int j = 0; j != nb_insts; ++j)
+	insts.emplace_back();
+
+      start_counters(time, resource);
+
+      for (auto& us : insts)
+	for (int i = 0; i != nb_elements; ++i)
+	  us.insert(elems[i]);
+
+      stop_counters(time, resource);
+
+      std::ostringstream ostr;
+      ostr << desc << " 1st insert";
+      report_performance(__FILE__, ostr.str().c_str(), time, resource);
+
+      start_counters(time, resource);
+
+      for (auto& us : insts)
+	for (int i = nb_elements - 1; i >= 0; --i)
+	  {
+	    auto it = us.find(elems[i]);
+	    if (it != us.end())
+	      us.erase(it);
+	  }
+
+      stop_counters(time, resource);
+
+      ostr.str("");
+      ostr << desc << " find/erase";
+      report_performance(__FILE__, ostr.str().c_str(), time, resource);
+
+      start_counters(time, resource);
+
+      for (auto& us : insts)
+	{
+	  us.insert(elems[0]);
+	  for (int i = nb_elements - 1; i >= 0; --i)
+	    us.insert(elems[i]);
+	}
+
+      stop_counters(time, resource);
+      ostr.str("");
+      ostr << desc << " 2nd insert";
+      report_performance(__FILE__, ostr.str().c_str(), time, resource);
+
+      start_counters(time, resource);
+
+      for (auto& us : insts)
+	for (int j = nb_elements - 1; j >= 0; --j)
+	  us.erase(elems[j]);
+
+      stop_counters(time, resource);
+      ostr.str("");
+      ostr << desc << " erase key ";
+      report_performance(__FILE__, ostr.str().c_str(), time, resource);
+    }
+}
+
+int main()
+{
+  {
+    std::vector<int> elems;
+    elems.reserve(nb_elements);
+    for (int i = 0; i != nb_elements; ++i)
+      elems.push_back(i);
+
+    bench("std::unordered_set<int>:    ", elems);
+  }
+
+  {
+    std::vector<std::string> elems;
+    {
+      elems.reserve(nb_elements);
+      for (int i = 0; i != nb_elements; ++i)
+	{
+	  std::ostringstream ostr;
+	  ostr << "string #" << i << ' ' << std::string(1000, 'a' + i);
+	  elems.push_back(ostr.str());
+	}
+    }
+
+    bench("std::unordered_set<string>: ", elems);
+  }
+
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/util/testsuite_performance.h b/libstdc++-v3/testsuite/util/testsuite_performance.h
index cba3a0d4b17..4ca15ab0e71 100644
--- a/libstdc++-v3/testsuite/util/testsuite_performance.h
+++ b/libstdc++-v3/testsuite/util/testsuite_performance.h
@@ -239,7 +239,7 @@ namespace __gnu_test
     out << std::setw(4) << t.real_time() << "r" << space;
     out << std::setw(4) << t.user_time() << "u" << space;
     out << std::setw(4) << t.system_time() << "s" << space;
-    out << std::setw(8) << r.allocated_memory() << "mem" << space;
+    out << std::setw(9) << r.allocated_memory() << "mem" << space;
     out << std::setw(4) << r.hard_page_fault() << "pf" << space;
 
     out << std::endl;

--------------666919DA253E3739EE49B3EC--