* [PATCH][Hashtable] Performance optimization through use of insertion hint
@ 2023-07-24 11:02 François Dumont
2023-08-29 19:51 ` François Dumont
0 siblings, 1 reply; 5+ messages in thread
From: François Dumont @ 2023-07-24 11:02 UTC (permalink / raw)
To: libstdc++; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 8819 bytes --]
libstdc++: [_Hashtable] Make more use of insertion hint

When inserting an element into an empty bucket we currently insert the
new node after the before-begin node, i.e. in first position. The
drawback of doing this is that we are forced to update the bucket that
was containing this before-begin node so that it points to the newly
inserted node. To do so we need at best a modulo operation to find this
bucket and, when the hash code is not cached, a hash code computation as
well.

To avoid this side effect it is better to insert after the last node.
Adding a member to keep track of this last node would be an ABI-breaking
change, but we can instead ask the user to maintain and provide this
last node as an insertion hint.

The range insertion methods are adapted to try to detect the container's
last node and then use it as the insertion hint.
libstdc++-v3/ChangeLog:

	* include/bits/hashtable_policy.h (_Insert_base::try_emplace):
	Adapt to use the hint.
	(_Insert_base::_M_insert_range): Try to detect the container
	last node and use it as hint for insertions.
	(_Hash_code_base::_M_hash_code(const _Hash&,
	const _Hash_node_value<>&)): Remove.
	(_Hash_code_base::_M_hash_code<_H2>(const _H2&,
	const _Hash_node_value<>&)): Remove.
	* include/bits/hashtable.h
	(_Hashtable<>::_M_insert_bucket_begin): Add hint parameter and
	use it when inserting into an empty bucket if the hint is the
	last container node.
	(_Hashtable<>::_InsertInfo): New struct.
	(_Hashtable<>::_M_get_insert_info): New, return the latter.
	(_Hashtable<>::_M_insert_multi_node): Add _InsertInfo parameter.
	(_Hashtable<>::_M_insert_unique_node): Add __node_ptr hint
	parameter.
	(_Hashtable<>::_M_emplace_unique(__node_ptr, _Args&&...)): New.
	(_Hashtable<>::_M_emplace_multi(__node_ptr, _Args&&...)): New.
	(_Hashtable<>::_M_emplace()): Adapt to use the latter.
	(_Hashtable<>::_M_insert_unique): Add __node_ptr parameter.
	(_Hashtable<>::_M_insert_unique_aux): Add __node_ptr parameter.
	(_Hashtable<>::_M_insert(__node_ptr, _Arg&&,
	const _NodeGenerator&, true_type)): Use the latter.
	(_Hashtable<>::_M_reinsert_node(const_iterator, node_type&&)):
	Add hint parameter, adapt to use it.
	(_Hashtable<>::_M_reinsert_node_multi): Use the hint parameter,
	if available, to extract the hash code.
	(_Hashtable<>::_M_compute_hash_code(const _Hash&, __node_ptr,
	__node_ptr, const key_type&)): New.
	(_Hashtable<>::_M_compute_hash_code<_H2>(const _H2&, __node_ptr,
	__node_ptr, const key_type&)): New.
	(_Hashtable<>::_M_merge_unique): Adapt to use the latter.
	Implement small size optimization.
	(_Hashtable<>::_M_get_insert_info(const _Hash&, __node_ptr,
	__node_ptr, const key_type&)): New.
	(_Hashtable<>::_M_get_insert_info<_H2>(const _H2&, __node_ptr,
	__node_ptr, const key_type&)): New.
	(_Hashtable<>::_M_merge_multi): Adapt to use the latter.
	* include/bits/unordered_map.h
	(unordered_map<>::insert(node_type&&)): Pass cend as hint.
	(unordered_map<>::insert(const_iterator, node_type&&)): Adapt to
	use the hint.
	* include/bits/unordered_set.h
	(unordered_set<>::insert(node_type&&)): Pass cend as hint.
	(unordered_set<>::insert(const_iterator, node_type&&)): Adapt to
	use the hint.
	* testsuite/23_containers/unordered_multimap/insert/hint.cc:
	Adapt implementation-specific tests.
Tested under Linux x86_64.

Here are the performance results of this patch. Before:
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 insertions w/o hint 94r 94u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 insertions with perfect hint 95r 94u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 insertions with bad hint 94r 94u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 range insertions 88r 88u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 insertions w/o hint 91r 91u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 insertions with perfect hint 92r 93u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 insertions with bad hint 93r 93u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 range insertions 88r 88u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 insertions w/o hint 94r 95u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 insertions with perfect hint 94r 95u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 insertions with bad hint 94r 94u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 range insertions 90r 90u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 insertions w/o hint 68r 68u 0s 223999264mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 insertions with perfect hint 67r 67u 0s 223999264mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 insertions with bad hint 68r 68u 0s 223999264mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 range insertions 62r 62u 0s 223999264mem 0pf
After:
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 insertions w/o hint 93r 93u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 insertions with perfect hint 58r 59u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 insertions with bad hint 93r 94u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code NOT cached 2000000 range insertions 52r 51u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 insertions w/o hint 96r 95u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 insertions with perfect hint 61r 62u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 insertions with bad hint 94r 94u 0s 191999712mem 0pf
unordered_multiset_hint.cc-thread hash code cached 2000000 range insertions 52r 52u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 insertions w/o hint 96r 95u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 insertions with perfect hint 58r 59u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 insertions with bad hint 94r 94u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code NOT cached 2000000 range insertions 52r 52u 0s 191999712mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 insertions w/o hint 70r 69u 0s 223999264mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 insertions with perfect hint 67r 67u 0s 223999264mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 insertions with bad hint 67r 67u 0s 223999264mem 0pf
unordered_set_hint.cc-thread hash code cached 2000000 range insertions 63r 62u 0s 223999264mem 0pf
OK to commit?
François
[-- Attachment #2: hashtable.patch --]
[-- Type: text/x-patch, Size: 50199 bytes --]
diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 954a1c7a58d..4b84b566aea 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -835,30 +835,45 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// Insert a node at the beginning of a bucket.
void
- _M_insert_bucket_begin(size_type __bkt, __node_ptr __node)
+ _M_insert_bucket_begin(__node_base_ptr __hint,
+ size_type __bkt, __node_ptr __node)
{
+ __node_base_ptr __prev;
if (_M_buckets[__bkt])
{
// Bucket is not empty, we just need to insert the new node
// after the bucket before begin.
- __node->_M_nxt = _M_buckets[__bkt]->_M_nxt;
- _M_buckets[__bkt]->_M_nxt = __node;
+ __prev = _M_buckets[__bkt];
}
else
{
- // The bucket is empty, the new node is inserted at the
- // beginning of the singly-linked list and the bucket will
- // contain _M_before_begin pointer.
- __node->_M_nxt = _M_before_begin._M_nxt;
- _M_before_begin._M_nxt = __node;
-
- if (__node->_M_nxt)
- // We must update former begin bucket that is pointing to
- // _M_before_begin.
- _M_buckets[_M_bucket_index(*__node->_M_next())] = __node;
-
- _M_buckets[__bkt] = &_M_before_begin;
+ // Based on benchmarks, using the last node is not interesting when
+ // the hash code is cached in the case of unique keys.
+ if (!(__unique_keys::value && __hash_cached::value)
+ && __hint && !__hint->_M_nxt)
+ __prev = __hint;
+ else
+ {
+ // The bucket is empty, the new node is inserted at the
+ // beginning of the singly-linked list and the bucket will
+ // contain _M_before_begin pointer.
+ __prev = &_M_before_begin;
+
+ if (__prev->_M_nxt)
+ {
+ // We must update former begin bucket that is pointing to
+ // _M_before_begin.
+ size_type __bb_bkt = _M_bucket_index(
+ *static_cast<__node_ptr>(__prev->_M_nxt));
+ _M_buckets[__bb_bkt] = __node;
+ }
+ }
+
+ _M_buckets[__bkt] = __prev;
}
+
+ __node->_M_nxt = __prev->_M_nxt;
+ __prev->_M_nxt = __node;
}
// Remove the bucket first node
@@ -884,44 +899,64 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__node_base_ptr
_M_get_previous_node(size_type __bkt, __node_ptr __n);
- pair<__node_ptr, __hash_code>
- _M_compute_hash_code(__node_ptr __hint, const key_type& __k) const;
+ struct _InsertInfo
+ {
+ __node_base_ptr _M_prev;
+ __node_ptr _M_equi_n;
+ __node_ptr _M_hint;
+ __hash_code _M_code;
+ };
+
+ _InsertInfo
+ _M_get_insert_info(__node_ptr __hint, const key_type& __k,
+ __node_ptr __n = nullptr) const;
// Insert node __n with hash code __code, in bucket __bkt if no
// rehash (assumes no element with same key already present).
// Takes ownership of __n if insertion succeeds, throws otherwise.
iterator
- _M_insert_unique_node(size_type __bkt, __hash_code,
+ _M_insert_unique_node(__node_ptr __hint, size_type __bkt, __hash_code,
__node_ptr __n, size_type __n_elt = 1);
- // Insert node __n with key __k and hash code __code.
+ // Insert node __n after __prev if any.
// Takes ownership of __n if insertion succeeds, throws otherwise.
iterator
- _M_insert_multi_node(__node_ptr __hint,
- __hash_code __code, __node_ptr __n);
+ _M_insert_multi_node(const _InsertInfo& __info, __node_ptr __n);
template<typename... _Args>
std::pair<iterator, bool>
- _M_emplace(true_type __uks, _Args&&... __args);
+ _M_emplace_unique(__node_ptr __hint, _Args&&... __args);
template<typename... _Args>
iterator
- _M_emplace(false_type __uks, _Args&&... __args)
- { return _M_emplace(cend(), __uks, std::forward<_Args>(__args)...); }
+ _M_emplace_multi(__node_ptr __hint, _Args&&... __args);
+
+ template<typename... _Args>
+ std::pair<iterator, bool>
+ _M_emplace(true_type /*__uks*/, _Args&&... __args)
+ { return _M_emplace_unique(nullptr, std::forward<_Args>(__args)...); }
+
+ template<typename... _Args>
+ iterator
+ _M_emplace(false_type /*__uks*/, _Args&&... __args)
+ { return _M_emplace_multi(nullptr, std::forward<_Args>(__args)...); }
- // Emplace with hint, useless when keys are unique.
template<typename... _Args>
iterator
- _M_emplace(const_iterator, true_type __uks, _Args&&... __args)
- { return _M_emplace(__uks, std::forward<_Args>(__args)...).first; }
+ _M_emplace(__node_ptr __hint, true_type /*__uks*/, _Args&&... __args)
+ {
+ return _M_emplace_unique(__hint,
+ std::forward<_Args>(__args)...).first;
+ }
template<typename... _Args>
iterator
- _M_emplace(const_iterator, false_type __uks, _Args&&... __args);
+ _M_emplace(__node_ptr __hint, false_type /*__uks*/, _Args&&... __args)
+ { return _M_emplace_multi(__hint, std::forward<_Args>(__args)...); }
template<typename _Kt, typename _Arg, typename _NodeGenerator>
std::pair<iterator, bool>
- _M_insert_unique(_Kt&&, _Arg&&, const _NodeGenerator&);
+ _M_insert_unique(__node_ptr, _Kt&&, _Arg&&, const _NodeGenerator&);
template<typename _Kt>
static __conditional_t<
@@ -941,9 +976,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
template<typename _Arg, typename _NodeGenerator>
std::pair<iterator, bool>
- _M_insert_unique_aux(_Arg&& __arg, const _NodeGenerator& __node_gen)
+ _M_insert_unique_aux(__node_ptr __hint,
+ _Arg&& __arg, const _NodeGenerator& __node_gen)
{
- return _M_insert_unique(
+ return _M_insert_unique(__hint,
_S_forward_key(_ExtractKey{}(std::forward<_Arg>(__arg))),
std::forward<_Arg>(__arg), __node_gen);
}
@@ -955,7 +991,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
using __to_value
= __detail::_ConvertToValueType<_ExtractKey, value_type>;
- return _M_insert_unique_aux(
+ return _M_insert_unique_aux(nullptr,
__to_value{}(std::forward<_Arg>(__arg)), __node_gen);
}
@@ -966,25 +1002,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
using __to_value
= __detail::_ConvertToValueType<_ExtractKey, value_type>;
- return _M_insert(cend(),
+ return _M_insert(nullptr,
__to_value{}(std::forward<_Arg>(__arg)), __node_gen, __uks);
}
- // Insert with hint, not used when keys are unique.
+ // Insert with hint when keys are unique.
template<typename _Arg, typename _NodeGenerator>
iterator
- _M_insert(const_iterator, _Arg&& __arg,
- const _NodeGenerator& __node_gen, true_type __uks)
+ _M_insert(__node_ptr __hint, _Arg&& __arg,
+ const _NodeGenerator& __node_gen, true_type /* __uks */)
{
- return
- _M_insert(std::forward<_Arg>(__arg), __node_gen, __uks).first;
+ using __to_value
+ = __detail::_ConvertToValueType<_ExtractKey, value_type>;
+ return _M_insert_unique_aux(__hint,
+ __to_value{}(std::forward<_Arg>(__arg)), __node_gen).first;
}
// Insert with hint when keys are not unique.
template<typename _Arg, typename _NodeGenerator>
iterator
- _M_insert(const_iterator, _Arg&&,
- const _NodeGenerator&, false_type __uks);
+ _M_insert(__node_ptr __hint, _Arg&& __arg,
+ const _NodeGenerator& __node_gen, false_type /* __uks */);
size_type
_M_erase(true_type __uks, const key_type&);
@@ -1006,7 +1044,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
iterator
emplace_hint(const_iterator __hint, _Args&&... __args)
{
- return _M_emplace(__hint, __unique_keys{},
+ return _M_emplace(__hint._M_cur, __unique_keys{},
std::forward<_Args>(__args)...);
}
@@ -1041,7 +1079,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
#if __cplusplus > 201402L
/// Re-insert an extracted node into a container with unique keys.
insert_return_type
- _M_reinsert_node(node_type&& __nh)
+ _M_reinsert_node(const_iterator __hint, node_type&& __nh)
{
insert_return_type __ret;
if (__nh.empty())
@@ -1062,7 +1100,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
else
{
__ret.position
- = _M_insert_unique_node(__bkt, __code, __nh._M_ptr);
+ = _M_insert_unique_node(__hint._M_cur,
+ __bkt, __code, __nh._M_ptr);
__nh._M_ptr = nullptr;
__ret.inserted = true;
}
@@ -1080,11 +1119,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__glibcxx_assert(get_allocator() == __nh.get_allocator());
const key_type& __k = __nh._M_key();
- auto __code = this->_M_hash_code(__k);
- auto __ret
- = _M_insert_multi_node(__hint._M_cur, __code, __nh._M_ptr);
+ auto __info = _M_get_insert_info(__hint._M_cur, __k);
+ auto __it = _M_insert_multi_node(__info, __nh._M_ptr);
__nh._M_ptr = nullptr;
- return __ret;
+ return __it;
}
private:
@@ -1130,6 +1168,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return __nh;
}
+ private:
+ __hash_code
+ _M_compute_hash_code(const _Hash&, __node_ptr __n, const key_type&)
+ { return this->_M_hash_code(*__n); }
+
+ template<typename _H2>
+ __hash_code
+ _M_compute_hash_code(const _H2&, __node_ptr, const key_type& __k)
+ { return this->_M_hash_code(__k); }
+
+ public:
/// Merge from a compatible container into one with unique keys.
template<typename _Compatible_Hashtable>
void
@@ -1139,18 +1188,48 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
node_type>, "Node types are compatible");
__glibcxx_assert(get_allocator() == __src.get_allocator());
+ __node_ptr __hint = nullptr;
auto __n_elt = __src.size();
for (auto __i = __src.cbegin(), __end = __src.cend(); __i != __end;)
{
auto __pos = __i++;
const key_type& __k = _ExtractKey{}(*__pos);
- __hash_code __code
- = this->_M_hash_code(__src.hash_function(), *__pos._M_cur);
+ if (size() <= __small_size_threshold())
+ {
+ __node_ptr __last_n = nullptr;
+ bool __found = false;
+ for (auto __n = _M_begin(); __n;
+ __last_n = __n, __n = __n->_M_next())
+ if (this->_M_key_equals(__k, *__n))
+ {
+ __found = true;
+ break;
+ }
+
+ if (__found)
+ {
+ if (__n_elt != 1)
+ --__n_elt;
+ continue;
+ }
+ else if (!__hint)
+ __hint = __last_n;
+ }
+
+ __hash_code __code =
+ _M_compute_hash_code(__src.hash_function(), __pos._M_cur, __k);
size_type __bkt = _M_bucket_index(__code);
- if (_M_find_node(__bkt, __k, __code) == nullptr)
+ if (size() <= __small_size_threshold()
+ || _M_find_node(__bkt, __k, __code) == nullptr)
{
auto __nh = __src.extract(__pos);
- _M_insert_unique_node(__bkt, __code, __nh._M_ptr, __n_elt);
+ auto __pos = _M_insert_unique_node(
+ __hint, __bkt, __code, __nh._M_ptr, __n_elt)._M_cur;
+
+ // Keep only the last node as an insertion hint.
+ if (!__pos->_M_nxt)
+ __hint = __pos;
+
__nh._M_ptr = nullptr;
__n_elt = 1;
}
@@ -1159,6 +1238,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
}
+ private:
+ _InsertInfo
+ _M_get_insert_info(const _Hash&, __node_ptr __hint, __node_ptr __n,
+ const key_type& __k)
+ {
+ // Same _Hash functor: we can reuse __n's hash code.
+ return _M_get_insert_info(__hint, __k, __n);
+ }
+
+ template<typename _H2>
+ _InsertInfo
+ _M_get_insert_info(const _H2&, __node_ptr __hint, __node_ptr,
+ const key_type& __k)
+ { return _M_get_insert_info(__hint, __k); }
+
+ public:
/// Merge from a compatible container into one with equivalent keys.
template<typename _Compatible_Hashtable>
void
@@ -1173,10 +1268,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
for (auto __i = __src.cbegin(), __end = __src.cend(); __i != __end;)
{
auto __pos = __i++;
- __hash_code __code
- = this->_M_hash_code(__src.hash_function(), *__pos._M_cur);
+ const key_type& __k = _ExtractKey{}(*__pos);
+ auto __info = _M_get_insert_info(__src.hash_function(),
+ __hint, __pos._M_cur, __k);
auto __nh = __src.extract(__pos);
- __hint = _M_insert_multi_node(__hint, __code, __nh._M_ptr)._M_cur;
+ auto __inserted_n =
+ _M_insert_multi_node(__info, __nh._M_ptr)._M_cur;
+
+ // Keep only the last node as an insertion hint.
+ if (!__inserted_n->_M_nxt)
+ __hint = __inserted_n;
+
__nh._M_ptr = nullptr;
}
}
@@ -1252,9 +1354,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_bucket_count = __bkt_count;
}
+ __node_ptr __hint = nullptr;
__alloc_node_gen_t __node_gen(*this);
for (; __f != __l; ++__f)
- _M_insert(*__f, __node_gen, __uks);
+ {
+ __node_ptr __insert_pos
+ = _M_insert(__hint, *__f, __node_gen, __uks)._M_cur;
+ if (!__insert_pos->_M_nxt)
+ __hint = __insert_pos;
+ }
}
template<typename _Key, typename _Value, typename _Alloc,
@@ -1681,9 +1789,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
if (size() <= __small_size_threshold())
{
- for (auto __it = _M_begin(); __it; __it = __it->_M_next())
- if (this->_M_key_equals(__k, *__it))
- return iterator(__it);
+ for (auto __n = _M_begin(); __n; __n = __n->_M_next())
+ if (this->_M_key_equals(__k, *__n))
+ return iterator(__n);
return end();
}
@@ -1704,9 +1812,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
if (size() <= __small_size_threshold())
{
- for (auto __it = _M_begin(); __it; __it = __it->_M_next())
- if (this->_M_key_equals(__k, *__it))
- return const_iterator(__it);
+ for (auto __n = _M_begin(); __n; __n = __n->_M_next())
+ if (this->_M_key_equals(__k, *__n))
+ return const_iterator(__n);
return end();
}
@@ -2036,7 +2144,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_emplace(true_type /* __uks */, _Args&&... __args)
+ _M_emplace_unique(__node_ptr __hint, _Args&&... __args)
-> pair<iterator, bool>
{
// First build the node to get access to the hash code
@@ -2044,10 +2152,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
const key_type& __k = _ExtractKey{}(__node._M_node->_M_v());
if (size() <= __small_size_threshold())
{
- for (auto __it = _M_begin(); __it; __it = __it->_M_next())
- if (this->_M_key_equals(__k, *__it))
+ __node_ptr __last_n = nullptr;
+ for (auto __n = _M_begin(); __n;
+ __last_n = __n, __n = __n->_M_next())
+ if (this->_M_key_equals(__k, *__n))
// There is already an equivalent node, no insertion
- return { iterator(__it), false };
+ return { iterator(__n), false };
+
+ __hint = __last_n;
}
__hash_code __code = this->_M_hash_code(__k);
@@ -2058,7 +2170,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return { iterator(__p), false };
// Insert the node
- auto __pos = _M_insert_unique_node(__bkt, __code, __node._M_node);
+ auto __pos = _M_insert_unique_node(__hint, __bkt, __code,
+ __node._M_node);
__node._M_node = nullptr;
return { __pos, true };
}
@@ -2071,17 +2184,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_emplace(const_iterator __hint, false_type /* __uks */,
- _Args&&... __args)
+ _M_emplace_multi(__node_ptr __hint, _Args&&... __args)
-> iterator
{
// First build the node to get its hash code.
_Scoped_node __node { this, std::forward<_Args>(__args)... };
const key_type& __k = _ExtractKey{}(__node._M_node->_M_v());
- auto __res = this->_M_compute_hash_code(__hint._M_cur, __k);
- auto __pos
- = _M_insert_multi_node(__res.first, __res.second, __node._M_node);
+ auto __info = _M_get_insert_info(__hint, __k);
+ auto __pos = _M_insert_multi_node(__info, __node._M_node);
__node._M_node = nullptr;
return __pos;
}
@@ -2093,26 +2204,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_compute_hash_code(__node_ptr __hint, const key_type& __k) const
- -> pair<__node_ptr, __hash_code>
+ _M_get_insert_info(__node_ptr __hint, const key_type& __k,
+ __node_ptr __equi_n) const
+ -> _InsertInfo
{
if (size() <= __small_size_threshold())
{
- if (__hint)
- {
- for (auto __it = __hint; __it; __it = __it->_M_next())
- if (this->_M_key_equals(__k, *__it))
- return { __it, this->_M_hash_code(*__it) };
- }
-
- for (auto __it = _M_begin(); __it != __hint; __it = __it->_M_next())
- if (this->_M_key_equals(__k, *__it))
- return { __it, this->_M_hash_code(*__it) };
+ __node_ptr __last_n = nullptr;
+ __node_base_ptr __prev
+ = const_cast<__node_base_ptr>(&_M_before_begin);
+ for (auto __n = static_cast<__node_ptr>(__prev->_M_nxt); __n;
+ __prev = __last_n = __n, __n = __n->_M_next())
+ if (this->_M_key_equals(__k, *__n))
+ return
+ {
+ __prev, __n, __hint,
+ __hash_cached::value ? this->_M_hash_code(*__n) : 0
+ };
- __hint = nullptr;
+ __hint = __last_n;
}
- return { __hint, this->_M_hash_code(__k) };
+ return
+ {
+ nullptr, nullptr, __hint,
+ __equi_n ? this->_M_hash_code(*__equi_n) : this->_M_hash_code(__k)
+ };
}
template<typename _Key, typename _Value, typename _Alloc,
@@ -2122,7 +2239,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_insert_unique_node(size_type __bkt, __hash_code __code,
+ _M_insert_unique_node(__node_ptr __hint,
+ size_type __bkt, __hash_code __code,
__node_ptr __node, size_type __n_elt)
-> iterator
{
@@ -2140,7 +2258,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
this->_M_store_code(*__node, __code);
// Always insert at the beginning of the bucket.
- _M_insert_bucket_begin(__bkt, __node);
+ _M_insert_bucket_begin(__hint, __bkt, __node);
++_M_element_count;
return iterator(__node);
}
@@ -2152,8 +2270,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_insert_multi_node(__node_ptr __hint,
- __hash_code __code, __node_ptr __node)
+ _M_insert_multi_node(const _InsertInfo& __info, __node_ptr __node)
-> iterator
{
const __rehash_state& __saved_state = _M_rehash_policy._M_state();
@@ -2163,39 +2280,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (__do_rehash.first)
_M_rehash(__do_rehash.second, __saved_state);
+ auto __code = __info._M_code;
this->_M_store_code(*__node, __code);
const key_type& __k = _ExtractKey{}(__node->_M_v());
- size_type __bkt = _M_bucket_index(__code);
+ auto __prev = __info._M_prev;
+ const bool __has_prev = __prev != nullptr;
+ size_type __bkt;
// Find the node before an equivalent one or use hint if it exists and
// if it is equivalent.
- __node_base_ptr __prev
- = __builtin_expect(__hint != nullptr, false)
- && this->_M_equals(__k, __code, *__hint)
- ? __hint
- : _M_find_before_node(__bkt, __k, __code);
+ if (!__has_prev)
+ {
+ __bkt = _M_bucket_index(__code);
+ __prev = _M_find_before_node(__bkt, __k, __code);
+ }
if (__prev)
{
// Insert after the node before the equivalent one.
__node->_M_nxt = __prev->_M_nxt;
__prev->_M_nxt = __node;
- if (__builtin_expect(__prev == __hint, false))
- // hint might be the last bucket node, in this case we need to
- // update next bucket.
- if (__node->_M_nxt
- && !this->_M_equals(__k, __code, *__node->_M_next()))
- {
- size_type __next_bkt = _M_bucket_index(*__node->_M_next());
- if (__next_bkt != __bkt)
- _M_buckets[__next_bkt] = __node;
- }
+
+ // __hint might be the last bucket node, in this case we need to
+ // update next bucket.
+ if (__has_prev && __node->_M_nxt != __info._M_equi_n)
+ {
+ const auto __nxt = __node->_M_next();
+ if (!this->_M_equals(__k, __code, *__nxt))
+ {
+ size_type __next_bkt = _M_bucket_index(*__nxt);
+ if (__next_bkt != _M_bucket_index(__code))
+ _M_buckets[__next_bkt] = __node;
+ }
+ }
}
else
// The inserted node has no equivalent in the hashtable. We must
// insert the new node at the beginning of the bucket to preserve
// equivalent elements' relative positions.
- _M_insert_bucket_begin(__bkt, __node);
+ _M_insert_bucket_begin(__info._M_hint, __bkt, __node);
+
++_M_element_count;
return iterator(__node);
}
@@ -2209,14 +2333,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_insert_unique(_Kt&& __k, _Arg&& __v,
+ _M_insert_unique(__node_ptr __hint, _Kt&& __k, _Arg&& __v,
const _NodeGenerator& __node_gen)
-> pair<iterator, bool>
{
if (size() <= __small_size_threshold())
- for (auto __it = _M_begin(); __it; __it = __it->_M_next())
- if (this->_M_key_equals_tr(__k, *__it))
- return { iterator(__it), false };
+ {
+ __node_ptr __last_n = nullptr;
+ for (auto __n = _M_begin(); __n;
+ __last_n = __n, __n = __n->_M_next())
+ if (this->_M_key_equals_tr(__k, *__n))
+ return { iterator(__n), false };
+
+ __hint = __last_n;
+ }
__hash_code __code = this->_M_hash_code_tr(__k);
size_type __bkt = _M_bucket_index(__code);
@@ -2231,8 +2361,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__node_gen),
this
};
+
auto __pos
- = _M_insert_unique_node(__bkt, __code, __node._M_node);
+ = _M_insert_unique_node(__hint, __bkt, __code, __node._M_node);
__node._M_node = nullptr;
return { __pos, true };
}
@@ -2246,20 +2377,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_insert(const_iterator __hint, _Arg&& __v,
- const _NodeGenerator& __node_gen,
+ _M_insert(__node_ptr __hint, _Arg&& __v, const _NodeGenerator& __node_gen,
false_type /* __uks */)
-> iterator
{
// First allocate new node so that we don't do anything if it throws.
- _Scoped_node __node{ __node_gen(std::forward<_Arg>(__v)), this };
+ _Scoped_node __node { __node_gen(std::forward<_Arg>(__v)), this };
// Second compute the hash code so that we don't rehash if it throws.
- auto __res = this->_M_compute_hash_code(
- __hint._M_cur, _ExtractKey{}(__node._M_node->_M_v()));
+ auto __info =
+ _M_get_insert_info(__hint, _ExtractKey{}(__node._M_node->_M_v()));
+ auto __pos = _M_insert_multi_node(__info, __node._M_node);
- auto __pos
- = _M_insert_multi_node(__res.first, __res.second, __node._M_node);
__node._M_node = nullptr;
return __pos;
}
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 347d468ea86..484d7a9eed2 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -818,7 +818,7 @@ namespace __detail
std::tuple<>()
};
auto __pos
- = __h->_M_insert_unique_node(__bkt, __code, __node._M_node);
+ = __h->_M_insert_unique_node(nullptr, __bkt, __code, __node._M_node);
__node._M_node = nullptr;
return __pos->second;
}
@@ -845,7 +845,7 @@ namespace __detail
std::tuple<>()
};
auto __pos
- = __h->_M_insert_unique_node(__bkt, __code, __node._M_node);
+ = __h->_M_insert_unique_node(nullptr, __bkt, __code, __node._M_node);
__node._M_node = nullptr;
return __pos->second;
}
@@ -933,13 +933,13 @@ namespace __detail
insert(const_iterator __hint, const value_type& __v)
{
__hashtable& __h = _M_conjure_hashtable();
- __node_gen_type __node_gen(__h);
- return __h._M_insert(__hint, __v, __node_gen, __unique_keys{});
+ __node_gen_type __node_gen(__h);
+ return __h._M_insert(__hint._M_cur, __v, __node_gen, __unique_keys{});
}
template<typename _KType, typename... _Args>
std::pair<iterator, bool>
- try_emplace(const_iterator, _KType&& __k, _Args&&... __args)
+ try_emplace(const_iterator __hint, _KType&& __k, _Args&&... __args)
{
__hashtable& __h = _M_conjure_hashtable();
auto __code = __h._M_hash_code(__k);
@@ -954,7 +954,8 @@ namespace __detail
std::forward_as_tuple(std::forward<_Args>(__args)...)
};
auto __it
- = __h._M_insert_unique_node(__bkt, __code, __node._M_node);
+ = __h._M_insert_unique_node(__hint._M_cur,
+ __bkt, __code, __node._M_node);
__node._M_node = nullptr;
return { __it, true };
}
@@ -985,9 +986,16 @@ namespace __detail
_M_insert_range(_InputIterator __first, _InputIterator __last,
const _NodeGetter& __node_gen, true_type __uks)
{
+ using __node_ptr = typename __hashtable::__node_ptr;
__hashtable& __h = _M_conjure_hashtable();
+ __node_ptr __hint = nullptr;
for (; __first != __last; ++__first)
- __h._M_insert(*__first, __node_gen, __uks);
+ {
+ __node_ptr __insert_pos
+ = __h._M_insert(__hint, *__first, __node_gen, __uks)._M_cur;
+ if (!__insert_pos->_M_nxt)
+ __hint = __insert_pos;
+ }
}
template<typename _Key, typename _Value, typename _Alloc,
@@ -1002,6 +1010,7 @@ namespace __detail
_M_insert_range(_InputIterator __first, _InputIterator __last,
const _NodeGetter& __node_gen, false_type __uks)
{
+ using __node_ptr = typename __hashtable::__node_ptr;
using __rehash_type = typename __hashtable::__rehash_type;
using __rehash_state = typename __hashtable::__rehash_state;
using pair_type = std::pair<bool, std::size_t>;
@@ -1020,8 +1029,14 @@ namespace __detail
if (__do_rehash.first)
__h._M_rehash(__do_rehash.second, __saved_state);
+ __node_ptr __hint = nullptr;
for (; __first != __last; ++__first)
- __h._M_insert(*__first, __node_gen, __uks);
+ {
+ __node_ptr __insert_pos
+ = __h._M_insert(__hint, *__first, __node_gen, __uks)._M_cur;
+ if (!__insert_pos->_M_nxt)
+ __hint = __insert_pos;
+ }
}
/**
@@ -1076,7 +1091,7 @@ namespace __detail
{
__hashtable& __h = this->_M_conjure_hashtable();
__node_gen_type __node_gen(__h);
- return __h._M_insert(__hint, std::move(__v), __node_gen,
+ return __h._M_insert(__hint._M_cur, std::move(__v), __node_gen,
__unique_keys{});
}
};
@@ -1126,7 +1141,7 @@ namespace __detail
insert(const_iterator __hint, _Pair&& __v)
{
__hashtable& __h = this->_M_conjure_hashtable();
- return __h._M_emplace(__hint, __unique_keys{},
+ return __h._M_emplace(__hint._M_cur, __unique_keys{},
std::forward<_Pair>(__v));
}
};
@@ -1315,19 +1330,6 @@ namespace __detail
return _M_hash()(__k);
}
- __hash_code
- _M_hash_code(const _Hash&,
- const _Hash_node_value<_Value, true>& __n) const
- { return __n._M_hash_code; }
-
- // Compute hash code using _Hash as __n _M_hash_code, if present, was
- // computed using _H2.
- template<typename _H2>
- __hash_code
- _M_hash_code(const _H2&,
- const _Hash_node_value<_Value, __cache_hash_code>& __n) const
- { return _M_hash_code(_ExtractKey{}(__n._M_v())); }
-
__hash_code
_M_hash_code(const _Hash_node_value<_Value, false>& __n) const
{ return _M_hash_code(_ExtractKey{}(__n._M_v())); }
@@ -1999,19 +2001,20 @@ namespace __detail
_Hashtable_alloc<_NodeAlloc>::_M_allocate_node(_Args&&... __args)
-> __node_ptr
{
- auto __nptr = __node_alloc_traits::allocate(_M_node_allocator(), 1);
+ auto& __alloc = _M_node_allocator();
+ auto __nptr = __node_alloc_traits::allocate(__alloc, 1);
+
__node_ptr __n = std::__to_address(__nptr);
__try
{
::new ((void*)__n) __node_type;
- __node_alloc_traits::construct(_M_node_allocator(),
- __n->_M_valptr(),
+ __node_alloc_traits::construct(__alloc, __n->_M_valptr(),
std::forward<_Args>(__args)...);
return __n;
}
__catch(...)
{
- __node_alloc_traits::deallocate(_M_node_allocator(), __nptr, 1);
+ __node_alloc_traits::deallocate(__alloc, __nptr, 1);
__throw_exception_again;
}
}
diff --git a/libstdc++-v3/include/bits/unordered_map.h b/libstdc++-v3/include/bits/unordered_map.h
index 2f63bc5f1fa..d6becb83ee8 100644
--- a/libstdc++-v3/include/bits/unordered_map.h
+++ b/libstdc++-v3/include/bits/unordered_map.h
@@ -443,12 +443,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
/// Re-insert an extracted node.
insert_return_type
insert(node_type&& __nh)
- { return _M_h._M_reinsert_node(std::move(__nh)); }
+ { return _M_h._M_reinsert_node(cend(), std::move(__nh)); }
/// Re-insert an extracted node.
iterator
- insert(const_iterator, node_type&& __nh)
- { return _M_h._M_reinsert_node(std::move(__nh)).position; }
+ insert(const_iterator __hint, node_type&& __nh)
+ { return _M_h._M_reinsert_node(__hint, std::move(__nh)).position; }
#define __cpp_lib_unordered_map_try_emplace 201411L
/**
diff --git a/libstdc++-v3/include/bits/unordered_set.h b/libstdc++-v3/include/bits/unordered_set.h
index f3b0c078baa..915cd4fb700 100644
--- a/libstdc++-v3/include/bits/unordered_set.h
+++ b/libstdc++-v3/include/bits/unordered_set.h
@@ -504,12 +504,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
/// Re-insert an extracted node.
insert_return_type
insert(node_type&& __nh)
- { return _M_h._M_reinsert_node(std::move(__nh)); }
+ { return _M_h._M_reinsert_node(cend(), std::move(__nh)); }
/// Re-insert an extracted node.
iterator
- insert(const_iterator, node_type&& __nh)
- { return _M_h._M_reinsert_node(std::move(__nh)).position; }
+ insert(const_iterator __hint, node_type&& __nh)
+ { return _M_h._M_reinsert_node(__hint, std::move(__nh)).position; }
#endif // C++17
///@{
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multimap/insert/hint.cc b/libstdc++-v3/testsuite/23_containers/unordered_multimap/insert/hint.cc
index d0ed60c6799..1db71b8f30b 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_multimap/insert/hint.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_multimap/insert/hint.cc
@@ -40,11 +40,12 @@ void test01()
VERIFY( it2 != m.end() );
VERIFY( it2 != it1 );
VERIFY( it2->second == 2 );
- VERIFY( it2 == std::next(it1) );
+ VERIFY( std::next(it2) == it1 );
+ it1 = it2;
Pair p(0, 1);
it2 = m.insert(it1, p);
- VERIFY( it2 == std::next(it1) );
+ VERIFY( std::next(it2) == it1 );
}
struct hasher
@@ -71,13 +72,13 @@ void test02()
VERIFY( m.bucket_size(m.bucket(it3->first)) == 1 );
auto it4 = m.insert(it1, Pair(0, 1));
- VERIFY( it4 == std::next(it1) );
+ VERIFY( std::next(it4) == it1 );
VERIFY( m.bucket_size(m.bucket(it1->first)) == 3 );
VERIFY( m.bucket_size(m.bucket(it3->first)) == 1 );
Pair p(1, 1);
it4 = m.insert(it2, p);
- VERIFY( it4 == std::next(it2) );
+ VERIFY( std::next(it4) == it2 );
VERIFY( m.bucket_size(m.bucket(it1->first)) == 4 );
auto range = m.equal_range(0);
VERIFY( std::distance(range.first, range.second) == 2 );
@@ -104,11 +105,12 @@ void test03()
VERIFY( it2 != m.end() );
VERIFY( it2 != it1 );
VERIFY( it2->second == 2 );
- VERIFY( it2 == std::next(it1) );
+ VERIFY( std::next(it2) == it1 );
+ it1 = it2;
Pair p(0, 1);
it2 = m.emplace_hint(it1, p);
- VERIFY( it2 == std::next(it1) );
+ VERIFY( std::next(it2) == it1 );
}
int main()
diff --git a/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_multiset_hint.cc b/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_multiset_hint.cc
index 125a8a615b7..1e2502209f1 100644
--- a/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_multiset_hint.cc
+++ b/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_multiset_hint.cc
@@ -27,310 +27,170 @@
namespace
{
const int sz = 2000000;
- const std::string pattern = "test string #";
- const int nb_copies = 100;
+ const std::string pattern = "long enough test string #";
- // Special std::string hasher based on std::hash<std::string> but not tag as
- // slow so that hash code is not cached. It is easier to show impact of
- // hinting in this context.
+ // Perfect std::string hasher that knows how the string instances have
+ // been generated. It is not tagged as slow, so the hash code is not
+ // cached; it is easier to show the impact of the hint in this context.
struct hasher
{
std::size_t
operator()(const std::string& str) const noexcept
{
- //std::istringstream istr(str.substr(pattern.size()));
- //std::size_t str_index;
- //istr >> str_index;
- //return str_index;
std::hash<std::string> std_hasher;
- return std_hasher(str);
+ auto hash = std_hasher(pattern);
+ std::istringstream isstr(str.substr(pattern.length()));
+ int idx;
+ isstr >> idx;
+ return (std::size_t)(hash / sz) * sz + idx;
}
};
- using ums_t = std::unordered_multiset<std::string, hasher>;
-
- void
- insert_with_perfect_hint1(const std::vector<std::string>& strs,
- ums_t& ms)
- {
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(str));
-
- for (int j = 1; j != nb_copies; ++j)
- for (std::size_t i = 0; i != strs.size(); ++i)
- ms.insert(hints[i], strs[i]);
- }
-
- void
- insert_with_perfect_hint2(const std::vector<std::string>& strs,
- ums_t& ms)
- {
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(str));
-
- for (std::size_t i = 0; i != strs.size(); ++i)
- for (int j = 1; j != nb_copies; ++j)
- ms.insert(hints[i], strs[i]);
- }
-
- // Always insert with the result of the previous insertion. The result of
- // the previous insertion will never be followed by an equivalent node
- // resulting in a re-computation of its hash code which is expensive.
- void
- insert_with_good_hint(const std::vector<std::string>& strs,
- ums_t& ms)
- {
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(str));
-
- for (int j = 1; j != nb_copies; ++j)
- for (std::size_t i = 0; i != strs.size(); ++i)
- hints[i] = ms.insert(hints[i], strs[i]);
- }
-
- // Note that the following use case is particularly bad especially compared to
- // the solution without hint because without hint the first insertion will put
- // it first in the bucket and following insertions will detect it and insert
- // just before. By giving a hint insertion will be done just after forcing to
- // check if it has no impact on the following bucket.
- void
- insert_with_bad_hint(const std::vector<std::string>& strs,
- ums_t& ms)
- {
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(str));
-
- for (std::size_t i = 0; i != strs.size(); ++i)
- for (int j = 1; j != nb_copies; ++j)
- hints[i] = ms.insert(hints[i], strs[i]);
- }
-
- void
- insert_without_hint1(const std::vector<std::string>& strs,
- ums_t& ms)
- {
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(str));
-
- for (int j = 1; j != nb_copies; ++j)
- for (std::size_t i = 0; i != strs.size(); ++i)
- hints[i] = ms.insert(strs[i]);
- }
-
- // This version is the best one if you insert all equivalent elements at the
- // same time. It demonstrates that most of the time it is better not to give
- // any hint unless you have written a benchmark for your application showing
- // that it does have a positive effect.
- void
- insert_without_hint2(const std::vector<std::string>& strs,
- ums_t& ms)
+ // Like the previous hasher, but tagged as slow.
+ struct slow_hasher
{
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(str));
-
- for (std::size_t i = 0; i != strs.size(); ++i)
- for (int j = 1; j != nb_copies; ++j)
- hints[i] = ms.insert(strs[i]);
- }
+ std::size_t
+ operator()(const std::string& str) const noexcept
+ { return hasher{}(str); }
+ };
- void
- insert_with_any_hint1(const std::vector<std::string>& strs,
- ums_t& ms)
- {
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(ms.begin(), str));
+ template<typename _Hash>
+ using ums_t = std::unordered_multiset<std::string, _Hash>;
- std::size_t offset = strs.size() / 2;
- for (int j = 1; j != nb_copies; ++j)
- for (std::size_t i = 0; i != strs.size(); ++i)
+ template<typename _Hash>
+ void
+ insert_with_perfect_hint(const std::vector<std::string>& strs,
+ ums_t<_Hash>& ms)
+ {
+ auto hint = ms.end();
+ for (auto str : strs)
{
- ms.insert(hints[(i + offset) % hints.size()], strs[i]);
- ++offset;
+ auto insert_pos = ms.insert(hint, str);
+ if (std::next(insert_pos) == ms.end())
+ hint = insert_pos;
}
- }
-
- void
- insert_with_any_hint2(const std::vector<std::string>& strs,
- ums_t& ms)
- {
- std::vector<typename ums_t::iterator> hints;
- hints.reserve(strs.size());
- for (auto str : strs)
- hints.push_back(ms.insert(ms.begin(), str));
+ }
- std::size_t offset = strs.size() / 2;
- for (std::size_t i = 0; i != strs.size(); ++i)
- for (int j = 1; j != nb_copies; ++j)
+ template<typename _Hash>
+ void
+ insert_with_bad_hint(const std::vector<std::string>& strs,
+ ums_t<_Hash>& ms)
+ {
+ auto hint = ms.begin();
+ for (auto str : strs)
{
- ms.insert(hints[(i + offset) % hints.size()], strs[i]);
- ++offset;
+ auto insert_pos = ms.insert(hint, str);
+ if (std::next(insert_pos) == hint)
+ hint = ms.begin();
}
- }
-}
-
-int main()
-{
- using namespace __gnu_test;
-
- const int nb_iter = 10;
-
- std::vector<std::string> strs;
- strs.reserve(sz / nb_copies);
+ }
- for (int i = 0; i != sz / nb_copies; ++i)
+ template<typename _Hash>
+ void
+ insert_without_hint(const std::vector<std::string>& strs,
+ ums_t<_Hash>& ms)
{
- std::ostringstream osstr;
- osstr << pattern << i;
- strs.push_back(osstr.str());
+ for (auto str : strs)
+ ms.insert(str);
}
- ums_t ms;
- // Use a large load factor to make the context ideal for using hint because we
- // will have many elements per bucket.
- ms.max_load_factor(10.0f);
- ms.reserve(sz);
+ template<typename _Hash>
+ void
+ insert_range(const std::vector<std::string>& strs,
+ ums_t<_Hash>& ms)
+ { ms.insert(strs.begin(), strs.end()); }
+}
- // Warm up.
+template<typename _Hash>
+ void bench(const char* ctx)
{
- for (auto str : strs)
- for (int j = 0; j != nb_copies; ++j)
- ms.insert(str);
- }
-
- time_counter time_no_hint1, time_any_hint1, time_bad_hint, time_perfect_hint1;
- time_counter time_no_hint2, time_any_hint2, time_good_hint, time_perfect_hint2;
- resource_counter resource_no_hint1, resource_any_hint1, resource_bad_hint,
- resource_perfect_hint1;
- resource_counter resource_no_hint2, resource_any_hint2, resource_good_hint,
- resource_perfect_hint2;
-
- for (int i = 0; i != nb_iter; ++i)
- {
- // Bad hint
- {
- ms.clear();
- start_counters(time_bad_hint, resource_bad_hint);
- insert_with_bad_hint(strs, ms);
- stop_counters(time_bad_hint, resource_bad_hint);
- }
+ using namespace __gnu_test;
- // No hint
- {
- ms.clear();
- start_counters(time_no_hint1, resource_no_hint1);
- insert_without_hint1(strs, ms);
- stop_counters(time_no_hint1, resource_no_hint1);
- }
-
- // Any hint
- {
- ms.clear();
- start_counters(time_any_hint1, resource_any_hint1);
- insert_with_any_hint1(strs, ms);
- stop_counters(time_any_hint1, resource_any_hint1);
- }
+ const int nb_iter = 10;
- // Good hint
- {
- ms.clear();
- start_counters(time_good_hint, resource_good_hint);
- insert_with_good_hint(strs, ms);
- stop_counters(time_good_hint, resource_good_hint);
- }
-
- // No hint
- {
- ms.clear();
- start_counters(time_no_hint2, resource_no_hint2);
- insert_without_hint2(strs, ms);
- stop_counters(time_no_hint2, resource_no_hint2);
- }
+ std::vector<std::string> strs;
+ strs.reserve(sz);
- // Perfect hint
+ for (int i = 0; i != sz; ++i)
{
- ms.clear();
- start_counters(time_perfect_hint2, resource_perfect_hint2);
- insert_with_perfect_hint2(strs, ms);
- stop_counters(time_perfect_hint2, resource_perfect_hint2);
+ std::ostringstream osstr;
+ osstr << pattern << i;
+ strs.push_back(osstr.str());
}
- // Any hint
- {
- ms.clear();
- start_counters(time_any_hint2, resource_any_hint2);
- insert_with_any_hint2(strs, ms);
- stop_counters(time_any_hint2, resource_any_hint2);
- }
+ ums_t<_Hash> ms;
+ ms.reserve(sz);
- // Perfect hint
- {
- ms.clear();
- start_counters(time_perfect_hint1, resource_perfect_hint1);
- insert_with_perfect_hint1(strs, ms);
- stop_counters(time_perfect_hint1, resource_perfect_hint1);
- }
+ // Warm up.
+ {
+ for (auto str : strs)
+ ms.insert(str);
}
- std::ostringstream ostr;
- ostr << "unordered_set " << nb_copies << " X " << sz / nb_copies
- << " insertions w/o hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_no_hint1, resource_no_hint1);
-
- ostr.str("");
- ostr << "unordered_set " << nb_copies << " X " << sz / nb_copies
- << " insertions with any hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_any_hint1, resource_any_hint1);
+ time_counter time_no_hint, time_bad_hint, time_perfect_hint,
+ time_range;
+ resource_counter resource_no_hint, resource_bad_hint, resource_perfect_hint,
+ resource_range;
- ostr.str("");
- ostr << "unordered_set " << nb_copies << " X " << sz / nb_copies
- << " insertions with good hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_good_hint, resource_good_hint);
+ for (int i = 0; i != nb_iter; ++i)
+ {
+ // Bad hint
+ {
+ ms.clear();
+ start_counters(time_bad_hint, resource_bad_hint);
+ insert_with_bad_hint(strs, ms);
+ stop_counters(time_bad_hint, resource_bad_hint);
+ }
- ostr.str("");
- ostr << "unordered_set " << nb_copies << " X " << sz / nb_copies
- << " insertions with perfect hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_perfect_hint1, resource_perfect_hint1);
+ // No hint
+ {
+ ms.clear();
+ start_counters(time_no_hint, resource_no_hint);
+ insert_without_hint(strs, ms);
+ stop_counters(time_no_hint, resource_no_hint);
+ }
- ostr.str("");
- ostr << "unordered_set " << sz / nb_copies << " X " << nb_copies
- << " insertions w/o hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_no_hint2, resource_no_hint2);
+ // Perfect hint
+ {
+ ms.clear();
+ start_counters(time_perfect_hint, resource_perfect_hint);
+ insert_with_perfect_hint(strs, ms);
+ stop_counters(time_perfect_hint, resource_perfect_hint);
+ }
- ostr.str("");
- ostr << "unordered_set " << sz / nb_copies << " X " << nb_copies
- << " insertions with any hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_any_hint2, resource_any_hint2);
+ // Range insert
+ {
+ ms.clear();
+ start_counters(time_range, resource_range);
+ insert_range(strs, ms);
+ stop_counters(time_range, resource_range);
+ }
+ }
- ostr.str("");
- ostr << "unordered_set " << sz / nb_copies << " X " << nb_copies
- << " insertions with bad hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_bad_hint, resource_bad_hint);
+ std::ostringstream ostr;
+ ostr << ctx << ' ' << sz << " insertions w/o hint";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_no_hint, resource_no_hint);
+
+ ostr.str("");
+ ostr << ctx << ' ' << sz << " insertions with perfect hint";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_perfect_hint, resource_perfect_hint);
+
+ ostr.str("");
+ ostr << ctx << ' ' << sz << " insertions with bad hint";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_bad_hint, resource_bad_hint);
+
+ ostr.str("");
+ ostr << ctx << ' ' << sz << " range insertions";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_range, resource_range);
+ }
- ostr.str("");
- ostr << "unordered_set " << sz / nb_copies << " X " << nb_copies
- << " insertions with perfect hint";
- report_performance(__FILE__, ostr.str().c_str(),
- time_perfect_hint2, resource_perfect_hint2);
+int main()
+{
+ bench<hasher>("hash code NOT cached");
+ bench<slow_hasher>("hash code cached");
return 0;
}
diff --git a/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_set_hint.cc b/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_set_hint.cc
new file mode 100644
index 00000000000..0197ba31b8b
--- /dev/null
+++ b/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_set_hint.cc
@@ -0,0 +1,186 @@
+// { dg-do run { target c++11 } }
+
+#include <testsuite_performance.h>
+
+#include <sstream>
+#include <string>
+#include <vector>
+#include <unordered_set>
+
+namespace
+{
+ const int sz = 2000000;
+ const std::string pattern = "long enough test string #";
+
+ // Perfect std::string hasher that knows how the string instances have
+ // been generated. It is not tagged as slow, so the hash code is not
+ // cached; it is easier to show the impact of the hint in this context.
+ struct hasher
+ {
+ std::size_t
+ operator()(const std::string& str) const noexcept
+ {
+ std::hash<std::string> std_hasher;
+ auto hash = std_hasher(pattern);
+ std::istringstream isstr(str.substr(pattern.length()));
+ int idx;
+ isstr >> idx;
+ return (std::size_t)(hash / sz) * sz + idx;
+ }
+ };
+
+ // Like the previous hasher, but tagged as slow.
+ struct slow_hasher
+ {
+ std::size_t
+ operator()(const std::string& str) const noexcept
+ { return hasher{}(str); }
+ };
+
+ template<typename _Hash>
+ using us_t = std::unordered_set<std::string, _Hash>;
+
+ template<typename _Hash>
+ void
+ insert_with_perfect_hint(const std::vector<std::string>& strs,
+ us_t<_Hash>& s)
+ {
+ auto hint = s.end();
+ for (auto str : strs)
+ {
+ auto insert_pos = s.insert(hint, str);
+ if (std::next(insert_pos) == s.end())
+ hint = insert_pos;
+ }
+ }
+
+ template<typename _Hash>
+ void
+ insert_with_bad_hint(const std::vector<std::string>& strs,
+ us_t<_Hash>& s)
+ {
+ auto hint = s.begin();
+ for (auto str : strs)
+ {
+ auto insert_pos = s.insert(hint, str);
+ if (std::next(insert_pos) == hint)
+ hint = s.begin();
+ }
+ }
+
+ template<typename _Hash>
+ void
+ insert_without_hint(const std::vector<std::string>& strs,
+ us_t<_Hash>& s)
+ {
+ for (auto str : strs)
+ s.insert(str);
+ }
+
+ template<typename _Hash>
+ void
+ insert_range(const std::vector<std::string>& strs,
+ us_t<_Hash>& s)
+ { s.insert(strs.begin(), strs.end()); }
+}
+
+template<typename _Hash>
+ void bench(const char* ctx)
+ {
+ using namespace __gnu_test;
+
+ const int nb_iter = 10;
+
+ std::vector<std::string> strs;
+ strs.reserve(sz);
+
+ for (int i = 0; i != sz; ++i)
+ {
+ std::ostringstream osstr;
+ osstr << pattern << i;
+ strs.push_back(osstr.str());
+ }
+
+ us_t<_Hash> s;
+ s.reserve(sz);
+
+ // Warm up.
+ {
+ for (auto str : strs)
+ s.insert(str);
+ }
+
+ time_counter time_no_hint, time_bad_hint, time_perfect_hint,
+ time_range;
+ resource_counter resource_no_hint, resource_bad_hint, resource_perfect_hint,
+ resource_range;
+
+ for (int i = 0; i != nb_iter; ++i)
+ {
+ // Bad hint
+ {
+ s.clear();
+ start_counters(time_bad_hint, resource_bad_hint);
+ insert_with_bad_hint(strs, s);
+ stop_counters(time_bad_hint, resource_bad_hint);
+ }
+
+ // No hint
+ {
+ s.clear();
+ start_counters(time_no_hint, resource_no_hint);
+ insert_without_hint(strs, s);
+ stop_counters(time_no_hint, resource_no_hint);
+ }
+
+ // Perfect hint
+ {
+ s.clear();
+ start_counters(time_perfect_hint, resource_perfect_hint);
+ insert_with_perfect_hint(strs, s);
+ stop_counters(time_perfect_hint, resource_perfect_hint);
+ }
+
+ // Range insert
+ {
+ s.clear();
+ start_counters(time_range, resource_range);
+ insert_range(strs, s);
+ stop_counters(time_range, resource_range);
+ }
+ }
+
+ std::ostringstream ostr;
+ ostr << ctx << ' ' << sz << " insertions w/o hint";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_no_hint, resource_no_hint);
+
+ ostr.str("");
+ ostr << ctx << ' ' << sz << " insertions with perfect hint";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_perfect_hint, resource_perfect_hint);
+
+ ostr.str("");
+ ostr << ctx << ' ' << sz << " insertions with bad hint";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_bad_hint, resource_bad_hint);
+
+ ostr.str("");
+ ostr << ctx << ' ' << sz << " range insertions";
+ report_performance(__FILE__, ostr.str().c_str(),
+ time_range, resource_range);
+ }
+
+namespace std
+{
+ template<>
+ struct __is_fast_hash<slow_hasher> : std::false_type
+ { };
+}
+
+int main()
+{
+ bench<hasher>("hash code NOT cached");
+ bench<slow_hasher>("hash code cached");
+ return 0;
+}
* Re: [PATCH][Hashtable] Performance optimization through use of insertion hint
2023-07-24 11:02 [PATCH][Hashtable] Performance optimization through use of insertion hint François Dumont
@ 2023-08-29 19:51 ` François Dumont
2023-09-01 8:59 ` Jonathan Wakely
0 siblings, 1 reply; 5+ messages in thread
From: François Dumont @ 2023-08-29 19:51 UTC (permalink / raw)
To: libstdc++; +Cc: gcc-patches
Hi,
Any feedback regarding this patch?
François
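For anyone wanting to try the pattern this patch optimizes, here is a minimal sketch of the "perfect hint" loop used by the benchmarks in the patch (the helper name is illustrative): keep the container's last node as the insertion hint, refreshing it only when the new element actually landed at the end. The pattern is correct with any conforming implementation; the patch only makes it faster.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <unordered_set>
#include <vector>

// Insert every string using the last container node as the hint.
// After each insertion, adopt the new position as the hint only if
// it is the last node, i.e. std::next(pos) == c.end().
template<typename _Container>
void
insert_all_with_hint(const std::vector<std::string>& strs, _Container& c)
{
  auto hint = c.end();
  for (const auto& str : strs)
    {
      auto insert_pos = c.insert(hint, str);
      if (std::next(insert_pos) == c.end())
	hint = insert_pos;
    }
}
```

With the patch applied, inserting into an empty bucket right after the hinted last node avoids the modulo (and, when the hash code is not cached, the extra hash computation) otherwise needed to fix up the bucket holding the before-begin node.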
On 24/07/2023 13:02, François Dumont wrote:
> libstdc++: [_Hashtable] Make more use of insertion hint
>
>
> When inserting an element into an empty bucket we currently insert
> the new node
> after the before-begin node so in first position. The drawback of
> doing this is
> that we are forced to update the bucket that was containing this
> before-begin
> node to point to the newly inserted node. To do so we need at best
> to do a modulo
> to find this bucket and when hash code is not cached also compute it.
>
> To avoid this side effect it is better to insert after the last
> node. Adding
> a member to keep track of this last node would be an abi breaking
> change. Still we
> can ask the user to maintain and provide this last node as an
> insertion hint.
>
> Adapt range insertion methods to try to detect the last node and
> then use it as
> the insertion hint.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/hashtable_policy.h:
> (_Insert_base::try_emplace): Adapt to use hint.
> (_Insert_base::_M_insert_range): Try to detect container
> last node and use it
> as hint for insertions.
> (_Hash_code_base::_M_hash_code(const _Hash&, const
> _Hash_node_value<>&)): Remove.
> (_Hash_code_base::_M_hash_code<_H2>(const _H2&, const
> _Hash_node_value<>&)): Remove.
> * include/bits/hashtable.h
> (_Hashtable<>::_M_insert_bucket_begin): Add hint parameter
> and use it when inserting
> into an empty bucket if hint is the last container node.
> (_Hashtable<>::_InsertInfo): New struct.
> (_Hashtable<>::_M_get_insert_info): New, return latter.
> (_Hashtable<>::_M_insert_multi_node): Add _InsertInfo
> parameter.
> (_Hashtable<>::_M_insert_unique_node): Add __node_ptr hint
> parameter.
> (_Hashtable<>::_M_emplace_unique(__node_ptr, _Args&&...)):
> New.
> (_Hashtable<>::_M_emplace_multi(__node_ptr, _Args&&...)):
> New.
> (_Hashtable<>::_M_emplace()): Adapt to use latters.
> (_Hashtable<>::_M_insert_unique): Add __node_ptr parameter.
> (_Hashtable<>::_M_insert_unique_aux): Add __node_ptr
> parameter.
> (_Hashtable<>::_M_insert(__node_ptr, _Arg&&, const
> _NodeGenerator&, true_type)):
> Use latter.
> (_Hashtable<>::_M_reinsert_node(const_iterator,
> node_type&&)):
> Add hint parameter, adapt to use it.
> (_Hashtable<>::_M_reinsert_node_multi): Use hint parameter
> if available to extract
> hash code.
> (_Hashtable<>::_M_compute_hash_code(const _Hash&,
> __node_ptr, __node_ptr,
> const key_type&)): New.
> (_Hashtable<>::_M_compute_hash_code<_H2>(const _H2&, __node_ptr,
> __node_ptr,
> const key_type&)): New.
> (_Hashtable<>::_M_merge_unique): Adapt to use latter.
> Implement small size
> optimization.
> (_Hashtable<>::_M_get_insert_info(const _Hash&,
> __node_ptr, __node_ptr,
> const key_type&)): New.
> (_Hashtable<>::_M_get_insert_info<_H2>(const _H2&, __node_ptr,
> __node_ptr,
> const key_type&)): New.
> (_Hashtable<>::_M_merge_multi): Adapt to use latter.
> * include/bits/unordered_map.h
> (unordered_map<>::insert(node_type&&)): Pass cend as
> hint.
> (unordered_map<>::insert(const_iterator, node_type&&)):
> Adapt to use hint.
> * include/bits/unordered_set.h
> (unordered_set<>::insert(node_type&&)): Pass cend as
> hint.
> (unordered_set<>::insert(const_iterator, node_type&&)):
> Adapt to use hint.
> *
> testsuite/23_containers/unordered_multimap/insert/hint.cc: Adapt
> implementation
> specific tests.
>
> Tested under linux x86_64.
>
> Here is the performance results of this patch, before:
>
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> insertions w/o hint 94r 94u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> insertions with perfect hint 95r 94u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> insertions with bad hint 94r 94u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> range insertions 88r 88u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000
> insertions w/o hint 91r 91u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000
> insertions with perfect hint 92r 93u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000
> insertions with bad hint 93r 93u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000 range
> insertions 88r 88u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000
> insertions w/o hint 94r 95u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000
> insertions with perfect hint 94r 95u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000
> insertions with bad hint 94r 94u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000 range
> insertions 90r 90u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 insertions
> w/o hint 68r 68u 0s 223999264mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 insertions
> with perfect hint 67r 67u 0s 223999264mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 insertions
> with bad hint 68r 68u 0s 223999264mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 range
> insertions 62r 62u 0s 223999264mem 0pf
>
> After:
>
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> insertions w/o hint 93r 93u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> insertions with perfect hint 58r 59u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> insertions with bad hint 93r 94u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> range insertions 52r 51u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000
> insertions w/o hint 96r 95u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000
> insertions with perfect hint 61r 62u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000
> insertions with bad hint 94r 94u 0s 191999712mem 0pf
> unordered_multiset_hint.cc-thread hash code cached 2000000 range
> insertions 52r 52u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000
> insertions w/o hint 96r 95u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000
> insertions with perfect hint 58r 59u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000
> insertions with bad hint 94r 94u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code NOT cached 2000000 range
> insertions 52r 52u 0s 191999712mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 insertions
> w/o hint 70r 69u 0s 223999264mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 insertions
> with perfect hint 67r 67u 0s 223999264mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 insertions
> with bad hint 67r 67u 0s 223999264mem 0pf
> unordered_set_hint.cc-thread hash code cached 2000000 range
> insertions 63r 62u 0s 223999264mem 0pf
>
> Ok to commit ?
>
> François
* Re: [PATCH][Hashtable] Performance optimization through use of insertion hint
2023-08-29 19:51 ` François Dumont
@ 2023-09-01 8:59 ` Jonathan Wakely
2023-09-01 9:00 ` Jonathan Wakely
2023-09-07 16:55 ` François Dumont
0 siblings, 2 replies; 5+ messages in thread
From: Jonathan Wakely @ 2023-09-01 8:59 UTC (permalink / raw)
To: François Dumont; +Cc: libstdc++, gcc-patches
On Tue, 29 Aug 2023 at 20:52, François Dumont via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
> Hi
>
> Any feedback regarding this patch ?
This is a fairly large patch, and before we make any more changes to
unordered containers we have an ABI break to fix:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111050
>
> François
>
> On 24/07/2023 13:02, François Dumont wrote:
> > libstdc++: [_Hashtable] Make more use of insertion hint
> >
> >
> > When inserting an element into an empty bucket we currently insert
> > the new node
> > after the before-begin node so in first position. The drawback of
> > doing this is
> > that we are forced to update the bucket that was containing this
> > before-begin
> > node to point to the newly inserted node. To do so we need at best
> > to do a modulo
> > to find this bucket and when hash code is not cached also compute it.
> >
> > To avoid this side effect it is better to insert after the last
> > node. Adding
> > a member to keep track of this last node would be an abi breaking
> > change. Still we
> > can ask the user to maintain and provide this last node as an
> > insertion hint.
> >
> > Adapt range insertion methods to try to detect the last node and
> > then use it as
> > the insertion hint.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/bits/hashtable_policy.h:
> > (_Insert_base::try_emplace): Adapt to use hint.
> > (_Insert_base::_M_insert_range): Try to detect container
> > last node and use it
> > as hint for insertions.
> > (_Hash_code_base::_M_hash_code(const _Hash&, const
> > _Hash_node_value<>&)): Remove.
> > (_Hash_code_base::_M_hash_code<_H2>(const _H2&, const
> > _Hash_node_value<>&)): Remove.
> > * include/bits/hashtable.h
> > (_Hashtable<>::_M_insert_bucket_begin): Add hint parameter
> > and use it when inserting
> > into an empty bucket if hint is the last container node.
> > (_Hashtable<>::_InsertInfo): New struct.
> > (_Hashtable<>::_M_get_insert_info): New, return latter.
> > (_Hashtable<>::_M_insert_multi_node): Add _InsertInfo
> > parameter.
> > (_Hashtable<>::_M_insert_unique_node): Add __node_ptr hint
> > parameter.
> > (_Hashtable<>::_M_emplace_unique(__node_ptr, _Args&&...)):
> > New.
> > (_Hashtable<>::_M_emplace_multi(__node_ptr, _Args&&...)):
> > New.
> > (_Hashtable<>::_M_emplace()): Adapt to use latters.
> > (_Hashtable<>::_M_insert_unique): Add __node_ptr parameter.
> > (_Hashtable<>::_M_insert_unique_aux): Add __node_ptr
> > parameter.
> > (_Hashtable<>::_M_insert(__node_ptr, _Arg&&, const
> > _NodeGenerator&, true_type)):
> > Use latter.
> > (_Hashtable<>::_M_reinsert_node(const_iterator,
> > node_type&&)):
> > Add hint parameter, adapt to use it.
> > (_Hashtable<>::_M_reinsert_node_multi): Use hint parameter
> > if available to extract
> > hash code.
> > (_Hashtable<>::_M_compute_hash_code(const _Hash&,
> > __node_ptr, __node_ptr,
> > const key_type&)): New.
> > (_Hashtable<>::_M_compute_hash_code<_H2>(const _H2&, __node_ptr,
> > __node_ptr,
> > const key_type&)): New.
> > (_Hashtable<>::_M_merge_unique): Adapt to use the latter.
> > Implement small-size
> > optimization.
> > (_Hashtable<>::_M_get_insert_info(const _Hash&,
> > __node_ptr, __node_ptr,
> > const key_type&)): New.
> > (_Hashtable<>::_M_get_insert_info<_H2>(const _H2&, __node_ptr,
> > __node_ptr,
> > const key_type&)): New.
> > (_Hashtable<>::_M_merge_multi): Adapt to use the latter.
> > * include/bits/unordered_map.h
> > (unordered_map<>::insert(node_type&&)): Pass cend as
> > hint.
> > (unordered_map<>::insert(const_iterator, node_type&&)):
> > Adapt to use hint.
> > * include/bits/unordered_set.h
> > (unordered_set<>::insert(node_type&&)): Pass cend as
> > hint.
> > (unordered_set<>::insert(const_iterator, node_type&&)):
> > Adapt to use hint.
> > * testsuite/23_containers/unordered_multimap/insert/hint.cc:
> > Adapt implementation-specific tests.
> >
> > Tested under Linux x86_64.
> >
> > Here are the performance results of this patch, before:
> >
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > insertions w/o hint 94r 94u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > insertions with perfect hint 95r 94u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > insertions with bad hint 94r 94u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > range insertions 88r 88u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000
> > insertions w/o hint 91r 91u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000
> > insertions with perfect hint 92r 93u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000
> > insertions with bad hint 93r 93u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000 range
> > insertions 88r 88u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000
> > insertions w/o hint 94r 95u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000
> > insertions with perfect hint 94r 95u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000
> > insertions with bad hint 94r 94u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000 range
> > insertions 90r 90u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 insertions
> > w/o hint 68r 68u 0s 223999264mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 insertions
> > with perfect hint 67r 67u 0s 223999264mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 insertions
> > with bad hint 68r 68u 0s 223999264mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 range
> > insertions 62r 62u 0s 223999264mem 0pf
> >
> > After:
> >
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > insertions w/o hint 93r 93u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > insertions with perfect hint 58r 59u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > insertions with bad hint 93r 94u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code NOT cached 2000000
> > range insertions 52r 51u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000
> > insertions w/o hint 96r 95u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000
> > insertions with perfect hint 61r 62u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000
> > insertions with bad hint 94r 94u 0s 191999712mem 0pf
> > unordered_multiset_hint.cc-thread hash code cached 2000000 range
> > insertions 52r 52u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000
> > insertions w/o hint 96r 95u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000
> > insertions with perfect hint 58r 59u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000
> > insertions with bad hint 94r 94u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code NOT cached 2000000 range
> > insertions 52r 52u 0s 191999712mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 insertions
> > w/o hint 70r 69u 0s 223999264mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 insertions
> > with perfect hint 67r 67u 0s 223999264mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 insertions
> > with bad hint 67r 67u 0s 223999264mem 0pf
> > unordered_set_hint.cc-thread hash code cached 2000000 range
> > insertions 63r 62u 0s 223999264mem 0pf
> >
> > OK to commit?
> >
> > François
>
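For reference, the "perfect hint" scenario measured in the benchmark above corresponds roughly to the loop below: each call to `emplace_hint` is handed the iterator returned by the previous insertion, so with the patch the new node can be appended after the last node instead of re-linking the bucket that held the before-begin node. This is only an illustrative sketch; the helper name `fill_with_hint` is not part of the patch.

```cpp
#include <cstddef>
#include <unordered_set>

// Insert the values [0, n) into an unordered_multiset, threading the
// iterator returned by each emplace_hint back in as the hint for the
// next insertion -- the "perfect hint" case from the benchmark.
std::size_t fill_with_hint(std::size_t n)
{
    std::unordered_multiset<int> ms;
    ms.reserve(n);            // avoid rehashing during the loop
    auto hint = ms.cbegin();  // equals cend() while the container is empty
    for (std::size_t i = 0; i < n; ++i)
        hint = ms.emplace_hint(hint, static_cast<int>(i));
    return ms.size();
}
```

Without the hint (plain `emplace`), every insertion into an empty bucket must update the bucket holding the previous first node, which costs a modulo and, when hash codes are not cached, a hash recomputation.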
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH][Hashtable] Performance optimization through use of insertion hint
2023-09-01 8:59 ` Jonathan Wakely
@ 2023-09-01 9:00 ` Jonathan Wakely
2023-09-07 16:55 ` François Dumont
1 sibling, 0 replies; 5+ messages in thread
From: Jonathan Wakely @ 2023-09-01 9:00 UTC (permalink / raw)
To: François Dumont; +Cc: libstdc++, gcc-patches
On Fri, 1 Sept 2023 at 09:59, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On Tue, 29 Aug 2023 at 20:52, François Dumont via Libstdc++
> <libstdc++@gcc.gnu.org> wrote:
> >
> > Hi
> >
> > Any feedback regarding this patch ?
>
> This is a fairly large patch, and before we make any more changes to
> unordered containers we have an ABI break to fix:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111050
Or at least decide what to do about it...
>
>
> >
> > François
> >
> > On 24/07/2023 13:02, François Dumont wrote:
> > > [full patch description, ChangeLog, and benchmark results quoted
> > > verbatim; snipped -- see the original message above]
> >
* Re: [PATCH][Hashtable] Performance optimization through use of insertion hint
2023-09-01 8:59 ` Jonathan Wakely
2023-09-01 9:00 ` Jonathan Wakely
@ 2023-09-07 16:55 ` François Dumont
1 sibling, 0 replies; 5+ messages in thread
From: François Dumont @ 2023-09-07 16:55 UTC (permalink / raw)
To: Jonathan Wakely; +Cc: libstdc++, gcc-patches
On 01/09/2023 10:59, Jonathan Wakely wrote:
> On Tue, 29 Aug 2023 at 20:52, François Dumont via Libstdc++
> <libstdc++@gcc.gnu.org> wrote:
>> Hi
>>
>> Any feedback regarding this patch ?
> This is a fairly large patch
I've decided to split it, at least in two.
So just ignore this one; I'll submit new ones once the ABI issue is fixed.
> and before we make any more changes to
> unordered containers we have an ABI break to fix:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111050
>
>
>> François
>>
>> On 24/07/2023 13:02, François Dumont wrote:
>>> [full patch description, ChangeLog, and benchmark results quoted
>>> verbatim; snipped -- see the original message above]
end of thread, other threads: [~2023-09-07 16:56 UTC | newest]
Thread overview: 5+ messages
2023-07-24 11:02 [PATCH][Hashtable] Performance optimization through use of insertion hint François Dumont
2023-08-29 19:51 ` François Dumont
2023-09-01 8:59 ` Jonathan Wakely
2023-09-01 9:00 ` Jonathan Wakely
2023-09-07 16:55 ` François Dumont
This is a public inbox.