public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] libstdc++/91223 Improve unordered containers == operator
@ 2020-01-10 17:58 François Dumont
  2020-01-10 22:10 ` Jonathan Wakely
  0 siblings, 1 reply; 12+ messages in thread
From: François Dumont @ 2020-01-10 17:58 UTC (permalink / raw)
  To: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2065 bytes --]

Hi

     Here is my attempt to improve == operator.

     There is a small optimization for the std::unordered_mutiXXX 
containers but the main enhancement rely on some partial template 
specialization of the _Equality type. I limit it to usage of unordered 
containers with std::equal_to to be sure that the container _Equal 
functor is like the key type ==.

     Do I need to also consider user partial template specialization of 
std::equal_to ? It is a well know bad practice so I hope the Standard 
says that such a partial specialization leads to undefined behavior.

     I saw that the _S_is_permutation has been done in 2012, before 
std::is_permutation has been added in 2013. I'll try to replace it in a 
future patch.

     PR libstdc++/91223
     * include/bits/hashtable_policy.h
     (_Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
     __detail::_Select1st, std::equal_to<_Key>, _H1, _H2, _Hash,
     _RehashPolicy, _Traits, true>): New partial template spetialization.
     (_Equality<_Value, _Value, _Alloc, __detail::_Identity,
     std::equal_to<_Value>, _H1, _H2, _Hash, _RehashPolicy, _Traits, true>):
     Likewise.
     (_Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
     __detail::_Select1st, std::equal_to<_Key>, _H1, _H2, _Hash,
     _RehashPolicy, _Traits, false>): Likewise.
     (_Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
     __detail::_Select1st, std::equal_to<_Key>, _H1, _H2, _Hash)
     (_RehashPolicy, _Traits, false>): Likewise.
     * src/c++11/hashtable_c++0x.cc: Include <bits/stl_function.h>.
     * testsuite/23_containers/unordered_multiset/operators/1.cc
     (Hash, Equal, test02, test03): New.
     * testsuite/23_containers/unordered_set/operators/1.cc
     (Hash, Equal, test02, test03): New.

unordered tests run under Linux x86_64.

Ok to commit after running all tests ?

François


[-- Attachment #2: pr91263.patch --]
[-- Type: text/x-patch, Size: 12891 bytes --]

diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 7bbfdfd375b..2ac3e959320 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -1959,11 +1959,18 @@ namespace __detail
 
       for (auto __itx = __this->begin(); __itx != __this->end();)
 	{
-	  const auto __xrange = __this->equal_range(_ExtractKey()(*__itx));
+	  std::size_t __x_count = 1;
+	  auto __itx_end = __itx;
+	  for (++__itx_end; __itx_end != __this->end()
+		 && __this->key_eq()(_ExtractKey()(*__itx_end),
+				     _ExtractKey()(*__itx));
+	       ++__itx_end)
+	    ++__x_count;
+
+	  const auto __xrange = std::make_pair(__itx, __itx_end);
 	  const auto __yrange = __other.equal_range(_ExtractKey()(*__itx));
 
-	  if (std::distance(__xrange.first, __xrange.second)
-	      != std::distance(__yrange.first, __yrange.second))
+	  if (__x_count != std::distance(__yrange.first, __yrange.second))
 	    return false;
 
 	  if (!_S_is_permutation(__xrange.first, __xrange.second,
@@ -1975,6 +1982,242 @@ namespace __detail
       return true;
     }
 
+  /// Specialization.
+  template<typename _Key, typename _Tp, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    struct _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
+		     __detail::_Select1st, std::equal_to<_Key>,
+		     _H1, _H2, _Hash, _RehashPolicy, _Traits, true>
+    {
+      using __hashtable = _Hashtable<_Key, std::pair<const _Key, _Tp>, _Alloc,
+				     __detail::_Select1st, std::equal_to<_Key>,
+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
+
+      bool
+      _M_equal(const __hashtable&) const;
+    };
+
+  template<typename _Key, typename _Tp, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    bool
+    _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc, __detail::_Select1st,
+	      std::equal_to<_Key>, _H1, _H2,
+	      _Hash, _RehashPolicy, _Traits, true>::
+    _M_equal(const __hashtable& __other) const
+    {
+      const __hashtable* __this = static_cast<const __hashtable*>(this);
+
+      if (__this->size() != __other.size())
+	return false;
+
+      for (auto __itx = __this->begin(); __itx != __this->end(); ++__itx)
+	{
+	  const auto __ity = __other.find(__itx->first);
+	  if (__ity == __other.end() || !bool(__ity->second == __itx->second))
+	    return false;
+	}
+      return true;
+    }
+
+  /// Specialization.
+  template<typename _Value, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+  struct _Equality<_Value, _Value, _Alloc, __detail::_Identity,
+		   std::equal_to<_Value>, _H1, _H2,
+		   _Hash, _RehashPolicy, _Traits, true>
+    {
+      using __hashtable = _Hashtable<_Value, _Value, _Alloc,
+				     __detail::_Identity, std::equal_to<_Value>,
+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
+
+      bool
+      _M_equal(const __hashtable&) const;
+    };
+
+  template<typename _Value, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    bool
+    _Equality<_Value, _Value, _Alloc, __detail::_Identity,
+	      std::equal_to<_Value>, _H1, _H2,
+	      _Hash, _RehashPolicy, _Traits, true>::
+    _M_equal(const __hashtable& __other) const
+    {
+      const __hashtable* __this = static_cast<const __hashtable*>(this);
+
+      if (__this->size() != __other.size())
+	return false;
+
+      for (auto __itx = __this->begin(); __itx != __this->end(); ++__itx)
+	if (__other.find(*__itx) == __other.end())
+	  return false;
+
+      return true;
+    }
+
+  /// Specialization.
+  template<typename _Key, typename _Tp, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    struct _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
+		     __detail::_Select1st, std::equal_to<_Key>,
+		     _H1, _H2, _Hash, _RehashPolicy, _Traits, false>
+    {
+      using __hashtable = _Hashtable<_Key, std::pair<const _Key, _Tp>, _Alloc,
+				     __detail::_Select1st, std::equal_to<_Key>,
+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
+
+      bool
+      _M_equal(const __hashtable&) const;
+
+    private:
+      using __hash_cached = typename _Traits::__hash_cached;
+      using __constant_iterators = typename _Traits::__constant_iterators;
+      using const_iterator =
+	__detail::_Node_const_iterator<std::pair<const _Key, _Tp>,
+				       __constant_iterators::value,
+				       __hash_cached::value>;
+
+      static bool
+      _S_is_permutation(const_iterator, const_iterator, const_iterator);
+    };
+
+  template<typename _Key, typename _Tp, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    bool
+    _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
+	      __detail::_Select1st, std::equal_to<_Key>,
+	      _H1, _H2, _Hash, _RehashPolicy, _Traits, false>::
+    _S_is_permutation(const_iterator __first1, const_iterator __last1,
+		      const_iterator __first2)
+    {
+      for (; __first1 != __last1; ++__first1, ++__first2)
+	if (!(__first1->second == __first2->second))
+	  break;
+
+      if (__first1 == __last1)
+	return true;
+
+      const_iterator __last2 = __first2;
+      std::advance(__last2, std::distance(__first1, __last1));
+
+      for (const_iterator __it1 = __first1; __it1 != __last1; ++__it1)
+	{
+	  const_iterator __tmp =  __first1;
+	  while (__tmp != __it1 && !bool(__tmp->second == __it1->second))
+	    ++__tmp;
+
+	  // We've seen this one before.
+	  if (__tmp != __it1)
+	    continue;
+
+	  std::ptrdiff_t __n2 = 0;
+	  for (__tmp = __first2; __tmp != __last2; ++__tmp)
+	    if (__tmp->second == __it1->second)
+	      ++__n2;
+
+	  if (!__n2)
+	    return false;
+
+	  std::ptrdiff_t __n1 = 0;
+	  for (__tmp = __it1; __tmp != __last1; ++__tmp)
+	    if (__tmp->second == __it1->second)
+	      ++__n1;
+
+	  if (__n1 != __n2)
+	    return false;
+	}
+      return true;
+    }
+
+  template<typename _Key, typename _Tp, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    bool
+    _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
+	      __detail::_Select1st, std::equal_to<_Key>,
+	      _H1, _H2, _Hash, _RehashPolicy, _Traits, false>::
+    _M_equal(const __hashtable& __other) const
+    {
+      const __hashtable* __this = static_cast<const __hashtable*>(this);
+
+      if (__this->size() != __other.size())
+	return false;
+
+      for (auto __itx = __this->begin(); __itx != __this->end();)
+	{
+	  std::size_t __x_count = 1;
+	  auto __itx_end = __itx;
+	  for (++__itx_end; __itx_end != __this->end()
+		 && __this->key_eq()(__itx_end->first, __itx->first);
+	       ++__itx_end)
+	    ++__x_count;
+
+	  const auto __xrange = make_pair(__itx, __itx_end);
+	  const auto __yrange = __other.equal_range(__itx->first);
+
+	  if (__x_count != std::distance(__yrange.first, __yrange.second))
+	    return false;
+
+	  if (!_S_is_permutation(__xrange.first, __xrange.second,
+				 __yrange.first))
+	    return false;
+
+	  __itx = __xrange.second;
+	}
+      return true;
+    }
+
+  /// Specialization.
+  template<typename _Value, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    struct _Equality<_Value, _Value, _Alloc, __detail::_Identity,
+		     std::equal_to<_Value>, _H1, _H2,
+		     _Hash, _RehashPolicy, _Traits, false>
+    {
+      using __hashtable = _Hashtable<_Value, _Value, _Alloc,
+				     __detail::_Identity, std::equal_to<_Value>,
+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
+
+      bool
+      _M_equal(const __hashtable&) const;
+    };
+
+  template<typename _Value, typename _Alloc,
+	   typename _H1, typename _H2, typename _Hash,
+	   typename _RehashPolicy, typename _Traits>
+    bool
+    _Equality<_Value, _Value, _Alloc, __detail::_Identity,
+	      std::equal_to<_Value>,
+	      _H1, _H2, _Hash, _RehashPolicy, _Traits, false>::
+    _M_equal(const __hashtable& __other) const
+    {
+      const __hashtable* __this = static_cast<const __hashtable*>(this);
+
+      if (__this->size() != __other.size())
+	return false;
+
+      for (auto __itx = __this->begin(); __itx != __this->end();)
+	{
+	  std::size_t __x_count = 1;
+	  auto __itx_end = __itx;
+	  for (++__itx_end; __itx_end != __this->end()
+		 && __this->key_eq()(*__itx_end, *__itx); ++__itx_end)
+	    ++__x_count;
+
+	  if (__x_count != __other.count(*__itx))
+	    return false;
+
+	  __itx = __itx_end;
+	}
+      return true;
+    }
+
   /**
    * This type deals with all allocation and keeps an allocator instance
    * through inheritance to benefit from EBO when possible.
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index de8e2c7cb91..2054791e13a 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -30,6 +30,7 @@
 #include <tuple>
 #include <ext/aligned_buffer.h>
 #include <ext/alloc_traits.h>
+#include <bits/stl_function.h> // equal_to, _Identity, _Select1st
 #include <bits/hashtable_policy.h>
 
 namespace std _GLIBCXX_VISIBILITY(default)
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
index 4b87f62b74a..7252cad29c2 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
@@ -99,8 +99,64 @@ void test01()
   VERIFY( !(ums1 != cums2) );
 }
 
+void test02()
+{
+  std::unordered_multiset<int> us1
+  { 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9 };
+  std::unordered_multiset<int> us2
+  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+
+  VERIFY( us1 == us2 );
+}
+
+struct Hash
+{
+  std::size_t
+  operator()(const std::pair<int, int>& p) const
+  { return p.first; }
+};
+
+struct Equal
+{
+  bool
+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
+  { return lhs.first == rhs.first; }
+};
+
+void test03()
+{
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us1
+  {
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 }
+  };
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us2
+  {
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
+  };
+
+  VERIFY( us1 == us2 );
+
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us3
+  {
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 1 }, { 8, 0 }, { 9, 0 },
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
+  };
+
+  VERIFY( us1 != us3 );
+}
+
 int main()
 {
   test01();
+  test02();
+  test03();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
index d841246e2c1..36a45dfa099 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
@@ -99,8 +99,56 @@ void test01()
   VERIFY( !(us1 != cus2) );
 }
 
+void test02()
+{
+  std::unordered_set<int> us1 { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+  std::unordered_set<int> us2 { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };
+
+  VERIFY( us1 == us2 );
+}
+
+struct Hash
+{
+  std::size_t
+  operator()(const std::pair<int, int>& p) const
+  { return p.first; }
+};
+
+struct Equal
+{
+  bool
+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
+  { return lhs.first == rhs.first; }
+};
+
+void test03()
+{
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us1
+  {
+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 },
+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 }
+  };
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us2
+  {
+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
+  };
+
+  VERIFY( us1 == us2 );
+
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us3
+  {
+    { 5, -5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
+    { 0, 0  }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
+  };
+
+  VERIFY( us1 != us3 );
+}
+
 int main()
 {
   test01();
+  test02();
+  test03();
   return 0;
 }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-10 17:58 [PATCH] libstdc++/91223 Improve unordered containers == operator François Dumont
@ 2020-01-10 22:10 ` Jonathan Wakely
  2020-01-10 22:31   ` Jonathan Wakely
  2020-01-13 22:01   ` François Dumont
  0 siblings, 2 replies; 12+ messages in thread
From: Jonathan Wakely @ 2020-01-10 22:10 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

On 10/01/20 18:54 +0100, François Dumont wrote:
>Hi
>
>    Here is my attempt to improve == operator.
>
>    There is a small optimization for the std::unordered_mutiXXX 
>containers but the main enhancement rely on some partial template 
>specialization of the _Equality type. I limit it to usage of unordered 
>containers with std::equal_to to be sure that the container _Equal 
>functor is like the key type ==.

I think we can assume that for any _Equal, not just std::equal_to:
http://eel.is/c++draft/unord.req#12.sentence-5

However ...

>    Do I need to also consider user partial template specialization of 
>std::equal_to ? It is a well know bad practice so I hope the Standard 
>says that such a partial specialization leads to undefined behavior.

It's certainly not undefined to specialize equal_to, and I'm not sure
how to make your optimisations valid in that case.

Consider:

struct X
{
   int i;
   int rounded() const { return i - (i % 10); }
   bool operator==(X x) const { return i == x.i; }
};

template<> struct std::equal_to<X>
{
   bool operator()(X l, X r) const
   { return l.rounded() == r.rounded(); }
};

template<> std::hash<X>
{
   bool operator()(X x) const { return hash<int>()(x.rounded()); }
};

std::unordered_multiset<X> u{ X{10}, X{11}, X{12} };
std::unordered_multiset<X> v{ X{15}, X{16}, X{17} };
bool b1 = u == v;
bool b2 = std::is_permutation(u.begin(), u.end(), v.begin());
assert(b1 == b2);

I think the last new specialization in your patch would be used for
this case, and because __x_count == v.count(*u.begin()) it will say
they're equal. But the is_permutation call says they're not.

So I think the assertion would fail, but the standard says it should
pass. Am I mistaken?


>    I saw that the _S_is_permutation has been done in 2012, before 
>std::is_permutation has been added in 2013. I'll try to replace it in 
>a future patch.

Yes, that seems like a good idea.

>    PR libstdc++/91223
>    * include/bits/hashtable_policy.h
>    (_Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
>    __detail::_Select1st, std::equal_to<_Key>, _H1, _H2, _Hash,
>    _RehashPolicy, _Traits, true>): New partial template spetialization.
>    (_Equality<_Value, _Value, _Alloc, __detail::_Identity,
>    std::equal_to<_Value>, _H1, _H2, _Hash, _RehashPolicy, _Traits, true>):
>    Likewise.
>    (_Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
>    __detail::_Select1st, std::equal_to<_Key>, _H1, _H2, _Hash,
>    _RehashPolicy, _Traits, false>): Likewise.
>    (_Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
>    __detail::_Select1st, std::equal_to<_Key>, _H1, _H2, _Hash)
>    (_RehashPolicy, _Traits, false>): Likewise.
>    * src/c++11/hashtable_c++0x.cc: Include <bits/stl_function.h>.
>    * testsuite/23_containers/unordered_multiset/operators/1.cc
>    (Hash, Equal, test02, test03): New.
>    * testsuite/23_containers/unordered_set/operators/1.cc
>    (Hash, Equal, test02, test03): New.
>
>unordered tests run under Linux x86_64.
>
>Ok to commit after running all tests ?
>
>François
>

>diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
>index 7bbfdfd375b..2ac3e959320 100644
>--- a/libstdc++-v3/include/bits/hashtable_policy.h
>+++ b/libstdc++-v3/include/bits/hashtable_policy.h
>@@ -1959,11 +1959,18 @@ namespace __detail
> 
>       for (auto __itx = __this->begin(); __itx != __this->end();)
> 	{
>-	  const auto __xrange = __this->equal_range(_ExtractKey()(*__itx));
>+	  std::size_t __x_count = 1;
>+	  auto __itx_end = __itx;
>+	  for (++__itx_end; __itx_end != __this->end()
>+		 && __this->key_eq()(_ExtractKey()(*__itx_end),
>+				     _ExtractKey()(*__itx));
>+	       ++__itx_end)
>+	    ++__x_count;

This is a nice optimisation.

>+	  const auto __xrange = std::make_pair(__itx, __itx_end);
> 	  const auto __yrange = __other.equal_range(_ExtractKey()(*__itx));
> 
>-	  if (std::distance(__xrange.first, __xrange.second)
>-	      != std::distance(__yrange.first, __yrange.second))
>+	  if (__x_count != std::distance(__yrange.first, __yrange.second))
> 	    return false;
> 
> 	  if (!_S_is_permutation(__xrange.first, __xrange.second,
>@@ -1975,6 +1982,242 @@ namespace __detail
>       return true;
>     }
> 
>+  /// Specialization.

With the increased number of specializations I think this comment
would be more useful if it said "specialization for multimap", and
change the others to "specialization for multiset" etc.

>+  template<typename _Key, typename _Tp, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    struct _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
>+		     __detail::_Select1st, std::equal_to<_Key>,
>+		     _H1, _H2, _Hash, _RehashPolicy, _Traits, true>
>+    {
>+      using __hashtable = _Hashtable<_Key, std::pair<const _Key, _Tp>, _Alloc,
>+				     __detail::_Select1st, std::equal_to<_Key>,
>+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
>+
>+      bool
>+      _M_equal(const __hashtable&) const;
>+    };
>+
>+  template<typename _Key, typename _Tp, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    bool
>+    _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc, __detail::_Select1st,
>+	      std::equal_to<_Key>, _H1, _H2,
>+	      _Hash, _RehashPolicy, _Traits, true>::
>+    _M_equal(const __hashtable& __other) const
>+    {
>+      const __hashtable* __this = static_cast<const __hashtable*>(this);
>+
>+      if (__this->size() != __other.size())
>+	return false;
>+
>+      for (auto __itx = __this->begin(); __itx != __this->end(); ++__itx)
>+	{
>+	  const auto __ity = __other.find(__itx->first);
>+	  if (__ity == __other.end() || !bool(__ity->second == __itx->second))
>+	    return false;
>+	}
>+      return true;
>+    }
>+
>+  /// Specialization.
>+  template<typename _Value, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+  struct _Equality<_Value, _Value, _Alloc, __detail::_Identity,

Wrong indentation here.

>+		   std::equal_to<_Value>, _H1, _H2,
>+		   _Hash, _RehashPolicy, _Traits, true>
>+    {
>+      using __hashtable = _Hashtable<_Value, _Value, _Alloc,
>+				     __detail::_Identity, std::equal_to<_Value>,
>+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
>+
>+      bool
>+      _M_equal(const __hashtable&) const;
>+    };
>+
>+  template<typename _Value, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    bool
>+    _Equality<_Value, _Value, _Alloc, __detail::_Identity,
>+	      std::equal_to<_Value>, _H1, _H2,
>+	      _Hash, _RehashPolicy, _Traits, true>::
>+    _M_equal(const __hashtable& __other) const
>+    {
>+      const __hashtable* __this = static_cast<const __hashtable*>(this);
>+
>+      if (__this->size() != __other.size())
>+	return false;
>+
>+      for (auto __itx = __this->begin(); __itx != __this->end(); ++__itx)
>+	if (__other.find(*__itx) == __other.end())
>+	  return false;
>+
>+      return true;
>+    }
>+
>+  /// Specialization.
>+  template<typename _Key, typename _Tp, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    struct _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
>+		     __detail::_Select1st, std::equal_to<_Key>,
>+		     _H1, _H2, _Hash, _RehashPolicy, _Traits, false>
>+    {
>+      using __hashtable = _Hashtable<_Key, std::pair<const _Key, _Tp>, _Alloc,
>+				     __detail::_Select1st, std::equal_to<_Key>,
>+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
>+
>+      bool
>+      _M_equal(const __hashtable&) const;
>+
>+    private:
>+      using __hash_cached = typename _Traits::__hash_cached;
>+      using __constant_iterators = typename _Traits::__constant_iterators;
>+      using const_iterator =
>+	__detail::_Node_const_iterator<std::pair<const _Key, _Tp>,
>+				       __constant_iterators::value,
>+				       __hash_cached::value>;
>+
>+      static bool
>+      _S_is_permutation(const_iterator, const_iterator, const_iterator);
>+    };
>+
>+  template<typename _Key, typename _Tp, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    bool
>+    _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
>+	      __detail::_Select1st, std::equal_to<_Key>,
>+	      _H1, _H2, _Hash, _RehashPolicy, _Traits, false>::
>+    _S_is_permutation(const_iterator __first1, const_iterator __last1,
>+		      const_iterator __first2)
>+    {
>+      for (; __first1 != __last1; ++__first1, ++__first2)
>+	if (!(__first1->second == __first2->second))
>+	  break;
>+
>+      if (__first1 == __last1)
>+	return true;
>+
>+      const_iterator __last2 = __first2;
>+      std::advance(__last2, std::distance(__first1, __last1));
>+
>+      for (const_iterator __it1 = __first1; __it1 != __last1; ++__it1)
>+	{
>+	  const_iterator __tmp =  __first1;
>+	  while (__tmp != __it1 && !bool(__tmp->second == __it1->second))
>+	    ++__tmp;
>+
>+	  // We've seen this one before.
>+	  if (__tmp != __it1)
>+	    continue;
>+
>+	  std::ptrdiff_t __n2 = 0;
>+	  for (__tmp = __first2; __tmp != __last2; ++__tmp)
>+	    if (__tmp->second == __it1->second)
>+	      ++__n2;
>+
>+	  if (!__n2)
>+	    return false;
>+
>+	  std::ptrdiff_t __n1 = 0;
>+	  for (__tmp = __it1; __tmp != __last1; ++__tmp)
>+	    if (__tmp->second == __it1->second)
>+	      ++__n1;
>+
>+	  if (__n1 != __n2)
>+	    return false;
>+	}
>+      return true;
>+    }
>+
>+  template<typename _Key, typename _Tp, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    bool
>+    _Equality<_Key, std::pair<const _Key, _Tp>, _Alloc,
>+	      __detail::_Select1st, std::equal_to<_Key>,
>+	      _H1, _H2, _Hash, _RehashPolicy, _Traits, false>::
>+    _M_equal(const __hashtable& __other) const
>+    {
>+      const __hashtable* __this = static_cast<const __hashtable*>(this);
>+
>+      if (__this->size() != __other.size())
>+	return false;
>+
>+      for (auto __itx = __this->begin(); __itx != __this->end();)
>+	{
>+	  std::size_t __x_count = 1;
>+	  auto __itx_end = __itx;
>+	  for (++__itx_end; __itx_end != __this->end()
>+		 && __this->key_eq()(__itx_end->first, __itx->first);
>+	       ++__itx_end)
>+	    ++__x_count;
>+
>+	  const auto __xrange = make_pair(__itx, __itx_end);
>+	  const auto __yrange = __other.equal_range(__itx->first);
>+
>+	  if (__x_count != std::distance(__yrange.first, __yrange.second))
>+	    return false;
>+
>+	  if (!_S_is_permutation(__xrange.first, __xrange.second,
>+				 __yrange.first))
>+	    return false;
>+
>+	  __itx = __xrange.second;

Why do you use __xrange.second here, rather than __itx_end?

To me it seems clearer to use __itx_end, consistent with the
multiset case below.


>+	}
>+      return true;
>+    }
>+
>+  /// Specialization.
>+  template<typename _Value, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    struct _Equality<_Value, _Value, _Alloc, __detail::_Identity,
>+		     std::equal_to<_Value>, _H1, _H2,
>+		     _Hash, _RehashPolicy, _Traits, false>
>+    {
>+      using __hashtable = _Hashtable<_Value, _Value, _Alloc,
>+				     __detail::_Identity, std::equal_to<_Value>,
>+				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
>+
>+      bool
>+      _M_equal(const __hashtable&) const;
>+    };
>+
>+  template<typename _Value, typename _Alloc,
>+	   typename _H1, typename _H2, typename _Hash,
>+	   typename _RehashPolicy, typename _Traits>
>+    bool
>+    _Equality<_Value, _Value, _Alloc, __detail::_Identity,
>+	      std::equal_to<_Value>,
>+	      _H1, _H2, _Hash, _RehashPolicy, _Traits, false>::
>+    _M_equal(const __hashtable& __other) const
>+    {
>+      const __hashtable* __this = static_cast<const __hashtable*>(this);
>+
>+      if (__this->size() != __other.size())
>+	return false;
>+
>+      for (auto __itx = __this->begin(); __itx != __this->end();)
>+	{
>+	  std::size_t __x_count = 1;
>+	  auto __itx_end = __itx;
>+	  for (++__itx_end; __itx_end != __this->end()
>+		 && __this->key_eq()(*__itx_end, *__itx); ++__itx_end)
>+	    ++__x_count;
>+
>+	  if (__x_count != __other.count(*__itx))
>+	    return false;
>+
>+	  __itx = __itx_end;
>+	}
>+      return true;
>+    }
>+
>   /**
>    * This type deals with all allocation and keeps an allocator instance
>    * through inheritance to benefit from EBO when possible.
>diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
>index de8e2c7cb91..2054791e13a 100644
>--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
>+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
>@@ -30,6 +30,7 @@
> #include <tuple>
> #include <ext/aligned_buffer.h>
> #include <ext/alloc_traits.h>
>+#include <bits/stl_function.h> // equal_to, _Identity, _Select1st
> #include <bits/hashtable_policy.h>
> 
> namespace std _GLIBCXX_VISIBILITY(default)
>diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
>index 4b87f62b74a..7252cad29c2 100644
>--- a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
>+++ b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
>@@ -99,8 +99,64 @@ void test01()
>   VERIFY( !(ums1 != cums2) );
> }
> 
>+void test02()
>+{
>+  std::unordered_multiset<int> us1
>+  { 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9 };
>+  std::unordered_multiset<int> us2
>+  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
>+
>+  VERIFY( us1 == us2 );
>+}
>+
>+struct Hash
>+{
>+  std::size_t
>+  operator()(const std::pair<int, int>& p) const
>+  { return p.first; }
>+};
>+
>+struct Equal
>+{
>+  bool
>+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
>+  { return lhs.first == rhs.first; }
>+};
>+
>+void test03()
>+{
>+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us1
>+  {
>+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 },
>+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
>+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
>+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 }
>+  };
>+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us2
>+  {
>+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
>+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
>+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
>+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
>+  };
>+
>+  VERIFY( us1 == us2 );
>+
>+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us3
>+  {
>+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
>+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
>+    { 5, 0 }, { 6, 0 }, { 7, 1 }, { 8, 0 }, { 9, 0 },
>+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
>+  };
>+
>+  VERIFY( us1 != us3 );
>+}
>+
> int main()
> {
>   test01();
>+  test02();
>+  test03();
>   return 0;
> }
>diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
>index d841246e2c1..36a45dfa099 100644
>--- a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
>+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
>@@ -99,8 +99,56 @@ void test01()
>   VERIFY( !(us1 != cus2) );
> }
> 
>+void test02()
>+{
>+  std::unordered_set<int> us1 { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
>+  std::unordered_set<int> us2 { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };
>+
>+  VERIFY( us1 == us2 );
>+}
>+
>+struct Hash
>+{
>+  std::size_t
>+  operator()(const std::pair<int, int>& p) const
>+  { return p.first; }
>+};
>+
>+struct Equal
>+{
>+  bool
>+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
>+  { return lhs.first == rhs.first; }
>+};
>+
>+void test03()
>+{
>+  std::unordered_set<std::pair<int, int>, Hash, Equal> us1
>+  {
>+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 },
>+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 }
>+  };
>+  std::unordered_set<std::pair<int, int>, Hash, Equal> us2
>+  {
>+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
>+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
>+  };
>+
>+  VERIFY( us1 == us2 );
>+
>+  std::unordered_set<std::pair<int, int>, Hash, Equal> us3
>+  {
>+    { 5, -5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
>+    { 0, 0  }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
>+  };
>+
>+  VERIFY( us1 != us3 );
>+}
>+
> int main()
> {
>   test01();
>+  test02();
>+  test03();
>   return 0;
> }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-10 22:10 ` Jonathan Wakely
@ 2020-01-10 22:31   ` Jonathan Wakely
  2020-01-13 22:01   ` François Dumont
  1 sibling, 0 replies; 12+ messages in thread
From: Jonathan Wakely @ 2020-01-10 22:31 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

On 10/01/20 22:01 +0000, Jonathan Wakely wrote:
>On 10/01/20 18:54 +0100, François Dumont wrote:
>>Hi
>>
>>    Here is my attempt to improve == operator.
>>
>>    There is a small optimization for the std::unordered_mutiXXX 
>>containers but the main enhancement rely on some partial template 
>>specialization of the _Equality type. I limit it to usage of 
>>unordered containers with std::equal_to to be sure that the 
>>container _Equal functor is like the key type ==.
>
>I think we can assume that for any _Equal, not just std::equal_to:
>http://eel.is/c++draft/unord.req#12.sentence-5
>
>However ...
>
>>    Do I need to also consider user partial template specialization 
>>of std::equal_to ? It is a well know bad practice so I hope the 
>>Standard says that such a partial specialization leads to undefined 
>>behavior.
>
>It's certainly not undefined to specialize equal_to, and I'm not sure
>how to make your optimisations valid in that case.
>
>Consider:
>
>struct X
>{
>  int i;
>  int rounded() const { return i - (i % 10); }
>  bool operator==(X x) const { return i == x.i; }
>};
>
>template<> struct std::equal_to<X>
>{
>  bool operator()(X l, X r) const
>  { return l.rounded() == r.rounded(); }
>};
>
>template<> std::hash<X>
>{
>  bool operator()(X x) const { return hash<int>()(x.rounded()); }
>};
>
>std::unordered_multiset<X> u{ X{10}, X{11}, X{12} };
>std::unordered_multiset<X> v{ X{15}, X{16}, X{17} };
>bool b1 = u == v;
>bool b2 = std::is_permutation(u.begin(), u.end(), v.begin());
>assert(b1 == b2);
>
>I think the last new specialization in your patch would be used for
>this case, and because __x_count == v.count(*u.begin()) it will say
>they're equal. But the is_permutation call says they're not.
>
>So I think the assertion would fail, but the standard says it should
>pass. Am I mistaken?

I believe the optimization would still be valid if you do not use
__other.count(*__itx) to check for equivalent keys in the other
container. Your patch does:

+	  if (__x_count != __other.count(*__itx))
+	    return false;

This uses the _Equal predicate to count the equivalent elements in
__other. Instead you need to use operator== to count the **equal**
elements.

I think there's a similar problem in the _Equality specialization for
unordered_map (i.e. key-value pairs, unique keys):

+	  const auto __ity = __other.find(__itx->first);
+	  if (__ity == __other.end() || !bool(__ity->second == __itx->second))
+	    return false;

The call to __other.find(__itx->first) will return an element with
equivalent key, but that's not guaranteed to be equal. I think you
could fix this either by still using == to compare the keys after
__other.find(*__itx) returns an element (which doesn't fix the PR91263
bug) or by replacing find with a similar operation that looks up the
hash code and then uses == to test for equality (instead of using
_Equal pred to test for equivalent keys).

Basically, you can't use functions like find and count that rely on
equivalence of keys, you need to use handwritten lookups using ==.

And if you do that, then it doesn't matter whether _Equal is a
specialization of std::equal_to or not, and it doesn't matter whether
the user has defined their own specialization of std::equal_to. You
can do the optimizations for any _Equal, because you won't actually be
using it to test for equality.

Does that make sense?



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-10 22:10 ` Jonathan Wakely
  2020-01-10 22:31   ` Jonathan Wakely
@ 2020-01-13 22:01   ` François Dumont
  2020-01-13 22:56     ` Jonathan Wakely
  1 sibling, 1 reply; 12+ messages in thread
From: François Dumont @ 2020-01-13 22:01 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2782 bytes --]

On 1/10/20 11:01 PM, Jonathan Wakely wrote:
> On 10/01/20 18:54 +0100, François Dumont wrote:
>> Hi
>>
>>     Here is my attempt to improve == operator.
>>
>>     There is a small optimization for the std::unordered_mutiXXX 
>> containers but the main enhancement rely on some partial template 
>> specialization of the _Equality type. I limit it to usage of 
>> unordered containers with std::equal_to to be sure that the container 
>> _Equal functor is like the key type ==.
>
> I think we can assume that for any _Equal, not just std::equal_to:
> http://eel.is/c++draft/unord.req#12.sentence-5
>
> However ...

Ok but...

>
>>     Do I need to also consider user partial template specialization 
>> of std::equal_to ? It is a well know bad practice so I hope the 
>> Standard says that such a partial specialization leads to undefined 
>> behavior.
>
> It's certainly not undefined to specialize equal_to, and I'm not sure
> how to make your optimisations valid in that case.

This proposal is indeed invalid if you use a std::equal_to partial 
specialization, this is why I asked.

>
> Consider:
>
> struct X
> {
>   int i;
>   int rounded() const { return i - (i % 10); }
>   bool operator==(X x) const { return i == x.i; }
> };
>
> template<> struct std::equal_to<X>
> {
>   bool operator()(X l, X r) const
>   { return l.rounded() == r.rounded(); }
> };
>
> template<> std::hash<X>
> {
>   bool operator()(X x) const { return hash<int>()(x.rounded()); }
> };
>
> std::unordered_multiset<X> u{ X{10}, X{11}, X{12} };
> std::unordered_multiset<X> v{ X{15}, X{16}, X{17} };
> bool b1 = u == v;
> bool b2 = std::is_permutation(u.begin(), u.end(), v.begin());
> assert(b1 == b2);
>
> I think the last new specialization in your patch would be used for
> this case, and because __x_count == v.count(*u.begin()) it will say
> they're equal. But the is_permutation call says they're not.
>
> So I think the assertion would fail, but the standard says it should
> pass. Am I mistaken?

I agree, it would fail and the Standard says it should pass.

So here is a new proposal. For the unique keys case I think we are good, 
I do not see any other optimization.

For the multi-keys we could still avoid redundant comparisons when 
_Equal is just doing == on the key type. On unordered_multiset we could 
just avoids the call to is_permuation and on the unordered_multimap we 
could check the is_permutation only on the associated value rather than 
on the std::pair.

In order to detect that _Equal is the std::equal_to from stl_function.h 
it would be great to have something like a __builtin_is_system returning 
true for types defined in system headers. For now I try to propose 
something similar without compiler help.

François



[-- Attachment #2: pr91263.patch --]
[-- Type: text/x-patch, Size: 9355 bytes --]

diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 8fac385570b..9e721aad8cc 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -337,6 +337,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	       bool _Constant_iteratorsa>
 	friend struct __detail::_Insert;
 
+      template<typename _Keya, typename _Valuea, typename _Alloca,
+	       typename _ExtractKeya, typename _Equala,
+	       typename _H1a, typename _H2a, typename _Hasha,
+	       typename _RehashPolicya, typename _Traitsa,
+	       bool _Unique_keysa>
+	friend struct __detail::_Equality;
+
     public:
       using size_type = typename __hashtable_base::size_type;
       using difference_type = typename __hashtable_base::difference_type;
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 7bbfdfd375b..55c020f93d1 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -34,6 +34,7 @@
 #include <tuple>		// for std::tuple, std::forward_as_tuple
 #include <limits>		// for std::numeric_limits
 #include <bits/stl_algobase.h>	// for std::min.
+#include <bits/stl_algo.h>	// for std::is_permutation.
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -1815,65 +1816,6 @@ namespace __detail
     _M_eq() const { return _EqualEBO::_M_cget(); }
   };
 
-  /**
-   *  struct _Equality_base.
-   *
-   *  Common types and functions for class _Equality.
-   */
-  struct _Equality_base
-  {
-  protected:
-    template<typename _Uiterator>
-      static bool
-      _S_is_permutation(_Uiterator, _Uiterator, _Uiterator);
-  };
-
-  // See std::is_permutation in N3068.
-  template<typename _Uiterator>
-    bool
-    _Equality_base::
-    _S_is_permutation(_Uiterator __first1, _Uiterator __last1,
-		      _Uiterator __first2)
-    {
-      for (; __first1 != __last1; ++__first1, ++__first2)
-	if (!(*__first1 == *__first2))
-	  break;
-
-      if (__first1 == __last1)
-	return true;
-
-      _Uiterator __last2 = __first2;
-      std::advance(__last2, std::distance(__first1, __last1));
-
-      for (_Uiterator __it1 = __first1; __it1 != __last1; ++__it1)
-	{
-	  _Uiterator __tmp =  __first1;
-	  while (__tmp != __it1 && !bool(*__tmp == *__it1))
-	    ++__tmp;
-
-	  // We've seen this one before.
-	  if (__tmp != __it1)
-	    continue;
-
-	  std::ptrdiff_t __n2 = 0;
-	  for (__tmp = __first2; __tmp != __last2; ++__tmp)
-	    if (*__tmp == *__it1)
-	      ++__n2;
-
-	  if (!__n2)
-	    return false;
-
-	  std::ptrdiff_t __n1 = 0;
-	  for (__tmp = __it1; __tmp != __last1; ++__tmp)
-	    if (*__tmp == *__it1)
-	      ++__n1;
-
-	  if (__n1 != __n2)
-	    return false;
-	}
-      return true;
-    }
-
   /**
    *  Primary class template  _Equality.
    *
@@ -1889,7 +1831,7 @@ namespace __detail
 	   bool _Unique_keys = _Traits::__unique_keys::value>
     struct _Equality;
 
-  /// Specialization.
+  /// unordered_map and unordered_set specializations.
   template<typename _Key, typename _Value, typename _Alloc,
 	   typename _ExtractKey, typename _Equal,
 	   typename _H1, typename _H2, typename _Hash,
@@ -1913,17 +1855,31 @@ namespace __detail
 	      _H1, _H2, _Hash, _RehashPolicy, _Traits, true>::
     _M_equal(const __hashtable& __other) const
     {
+      using __node_base = typename __hashtable::__node_base;
+      using __node_type = typename __hashtable::__node_type;
       const __hashtable* __this = static_cast<const __hashtable*>(this);
-
       if (__this->size() != __other.size())
 	return false;
 
       for (auto __itx = __this->begin(); __itx != __this->end(); ++__itx)
 	{
-	  const auto __ity = __other.find(_ExtractKey()(*__itx));
-	  if (__ity == __other.end() || !bool(*__ity == *__itx))
+	  std::size_t __ybkt = __other._M_bucket_index(__itx._M_cur);
+	  __node_base* __prev_n = __other._M_buckets[__ybkt];
+	  if (!__prev_n)
 	    return false;
+
+	  for (__node_type* __n = static_cast<__node_type*>(__prev_n->_M_nxt);;
+	       __n = __n->_M_next())
+	    {
+	      if (__n->_M_v() == *__itx)
+		break;
+
+	      if (!__n->_M_nxt
+		  || __other._M_bucket_index(__n->_M_next()) != __ybkt)
+		return false;
+	    }
 	}
+
       return true;
     }
 
@@ -1934,7 +1890,6 @@ namespace __detail
 	   typename _RehashPolicy, typename _Traits>
     struct _Equality<_Key, _Value, _Alloc, _ExtractKey, _Equal,
 		     _H1, _H2, _Hash, _RehashPolicy, _Traits, false>
-    : public _Equality_base
     {
       using __hashtable = _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
 				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
@@ -1953,24 +1908,27 @@ namespace __detail
     _M_equal(const __hashtable& __other) const
     {
       const __hashtable* __this = static_cast<const __hashtable*>(this);
-
       if (__this->size() != __other.size())
 	return false;
 
       for (auto __itx = __this->begin(); __itx != __this->end();)
 	{
-	  const auto __xrange = __this->equal_range(_ExtractKey()(*__itx));
-	  const auto __yrange = __other.equal_range(_ExtractKey()(*__itx));
+	  std::size_t __x_count = 1;
+	  auto __itx_end = __itx;
+	  for (++__itx_end; __itx_end != __this->end()
+		 && __this->key_eq()(_ExtractKey()(*__itx_end),
+				     _ExtractKey()(*__itx));
+	       ++__itx_end)
+	    ++__x_count;
 
-	  if (std::distance(__xrange.first, __xrange.second)
-	      != std::distance(__yrange.first, __yrange.second))
+	  const auto __yrange = __other.equal_range(_ExtractKey()(*__itx));
+	  if (__x_count != std::distance(__yrange.first, __yrange.second))
 	    return false;
 
-	  if (!_S_is_permutation(__xrange.first, __xrange.second,
-				 __yrange.first))
+	  if (!std::is_permutation(__itx, __itx_end, __yrange.first))
 	    return false;
 
-	  __itx = __xrange.second;
+	  __itx = __itx_end;
 	}
       return true;
     }
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
index 4b87f62b74a..7252cad29c2 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
@@ -99,8 +99,64 @@ void test01()
   VERIFY( !(ums1 != cums2) );
 }
 
+void test02()
+{
+  std::unordered_multiset<int> us1
+  { 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9 };
+  std::unordered_multiset<int> us2
+  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+
+  VERIFY( us1 == us2 );
+}
+
+struct Hash
+{
+  std::size_t
+  operator()(const std::pair<int, int>& p) const
+  { return p.first; }
+};
+
+struct Equal
+{
+  bool
+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
+  { return lhs.first == rhs.first; }
+};
+
+void test03()
+{
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us1
+  {
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 }
+  };
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us2
+  {
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
+  };
+
+  VERIFY( us1 == us2 );
+
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us3
+  {
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 1 }, { 8, 0 }, { 9, 0 },
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
+  };
+
+  VERIFY( us1 != us3 );
+}
+
 int main()
 {
   test01();
+  test02();
+  test03();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
index d841246e2c1..36a45dfa099 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
@@ -99,8 +99,56 @@ void test01()
   VERIFY( !(us1 != cus2) );
 }
 
+void test02()
+{
+  std::unordered_set<int> us1 { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+  std::unordered_set<int> us2 { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };
+
+  VERIFY( us1 == us2 );
+}
+
+struct Hash
+{
+  std::size_t
+  operator()(const std::pair<int, int>& p) const
+  { return p.first; }
+};
+
+struct Equal
+{
+  bool
+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
+  { return lhs.first == rhs.first; }
+};
+
+void test03()
+{
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us1
+  {
+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 },
+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 }
+  };
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us2
+  {
+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
+  };
+
+  VERIFY( us1 == us2 );
+
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us3
+  {
+    { 5, -5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
+    { 0, 0  }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
+  };
+
+  VERIFY( us1 != us3 );
+}
+
 int main()
 {
   test01();
+  test02();
+  test03();
   return 0;
 }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-13 22:01   ` François Dumont
@ 2020-01-13 22:56     ` Jonathan Wakely
  2020-01-14 21:52       ` François Dumont
  0 siblings, 1 reply; 12+ messages in thread
From: Jonathan Wakely @ 2020-01-13 22:56 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

On 13/01/20 22:41 +0100, François Dumont wrote:
>On 1/10/20 11:01 PM, Jonathan Wakely wrote:
>>On 10/01/20 18:54 +0100, François Dumont wrote:
>>>Hi
>>>
>>>    Here is my attempt to improve == operator.
>>>
>>>    There is a small optimization for the std::unordered_mutiXXX 
>>>containers but the main enhancement rely on some partial template 
>>>specialization of the _Equality type. I limit it to usage of 
>>>unordered containers with std::equal_to to be sure that the 
>>>container _Equal functor is like the key type ==.
>>
>>I think we can assume that for any _Equal, not just std::equal_to:
>>http://eel.is/c++draft/unord.req#12.sentence-5
>>
>>However ...
>
>Ok but...
>
>>
>>>    Do I need to also consider user partial template 
>>>specialization of std::equal_to ? It is a well know bad practice 
>>>so I hope the Standard says that such a partial specialization 
>>>leads to undefined behavior.
>>
>>It's certainly not undefined to specialize equal_to, and I'm not sure
>>how to make your optimisations valid in that case.
>
>This proposal is indeed invalid if you use a std::equal_to partial 
>specialization, this is why I asked.
>
>>
>>Consider:
>>
>>struct X
>>{
>>  int i;
>>  int rounded() const { return i - (i % 10); }
>>  bool operator==(X x) const { return i == x.i; }
>>};
>>
>>template<> struct std::equal_to<X>
>>{
>>  bool operator()(X l, X r) const
>>  { return l.rounded() == r.rounded(); }
>>};
>>
>>template<> std::hash<X>
>>{
>>  bool operator()(X x) const { return hash<int>()(x.rounded()); }
>>};
>>
>>std::unordered_multiset<X> u{ X{10}, X{11}, X{12} };
>>std::unordered_multiset<X> v{ X{15}, X{16}, X{17} };
>>bool b1 = u == v;
>>bool b2 = std::is_permutation(u.begin(), u.end(), v.begin());
>>assert(b1 == b2);
>>
>>I think the last new specialization in your patch would be used for
>>this case, and because __x_count == v.count(*u.begin()) it will say
>>they're equal. But the is_permutation call says they're not.
>>
>>So I think the assertion would fail, but the standard says it should
>>pass. Am I mistaken?
>
>I agree, it would fail and the Standard says it should pass.
>
>So here is a new proposal. For the unique keys case I think we are 
>good, I do not see any other optimization.
>
>For the multi-keys we could still avoid redundant comparisons when 
>_Equal is just doing == on the key type. On unordered_multiset we 
>could just avoids the call to is_permuation and on the 
>unordered_multimap we could check the is_permutation only on the 
>associated value rather than on the std::pair.
>
>In order to detect that _Equal is the std::equal_to from 
>stl_function.h it would be great to have something like a 
>__builtin_is_system returning true for types defined in system 
>headers. For now I try to propose something similar without compiler 
>help.

I don't think that's necessary, or helpful.

The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
that you shouldn't be using _Equal at all, and therefore it doesn't
matter whether it's std::equal_to or not.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-13 22:56     ` Jonathan Wakely
@ 2020-01-14 21:52       ` François Dumont
  2020-01-15 22:27         ` Jonathan Wakely
  0 siblings, 1 reply; 12+ messages in thread
From: François Dumont @ 2020-01-14 21:52 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1309 bytes --]

On 1/13/20 10:53 PM, Jonathan Wakely wrote:
> On 13/01/20 22:41 +0100, François Dumont wrote:
>>
>> For the multi-keys we could still avoid redundant comparisons when 
>> _Equal is just doing == on the key type. On unordered_multiset we 
>> could just avoids the call to is_permuation and on the 
>> unordered_multimap we could check the is_permutation only on the 
>> associated value rather than on the std::pair.
>>
> I don't think that's necessary, or helpful.
>
> The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
> that you shouldn't be using _Equal at all, and therefore it doesn't
> matter whether it's std::equal_to or not.
>
>
And it was indeed possible.

     PR libstdc++/91223
     * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
     * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
     (_Equality_base): Remove.
     (_Equality<>::_M_equal): Review implementation. Use 
std::is_permutation.
     * testsuite/23_containers/unordered_multiset/operators/1.cc
     (Hash, Equal, test02, test03): New.
     * testsuite/23_containers/unordered_set/operators/1.cc
     (Hash, Equal, test02, test03): New.

Tested under Linux x86_64.

Ok to commit ?

François


[-- Attachment #2: pr91263.patch --]
[-- Type: text/x-patch, Size: 10290 bytes --]

diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 8fac385570b..9e721aad8cc 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -337,6 +337,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	       bool _Constant_iteratorsa>
 	friend struct __detail::_Insert;
 
+      template<typename _Keya, typename _Valuea, typename _Alloca,
+	       typename _ExtractKeya, typename _Equala,
+	       typename _H1a, typename _H2a, typename _Hasha,
+	       typename _RehashPolicya, typename _Traitsa,
+	       bool _Unique_keysa>
+	friend struct __detail::_Equality;
+
     public:
       using size_type = typename __hashtable_base::size_type;
       using difference_type = typename __hashtable_base::difference_type;
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 7bbfdfd375b..4024e6c37fa 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -34,6 +34,7 @@
 #include <tuple>		// for std::tuple, std::forward_as_tuple
 #include <limits>		// for std::numeric_limits
 #include <bits/stl_algobase.h>	// for std::min.
+#include <bits/stl_algo.h>	// for std::is_permutation.
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -1815,65 +1816,6 @@ namespace __detail
     _M_eq() const { return _EqualEBO::_M_cget(); }
   };
 
-  /**
-   *  struct _Equality_base.
-   *
-   *  Common types and functions for class _Equality.
-   */
-  struct _Equality_base
-  {
-  protected:
-    template<typename _Uiterator>
-      static bool
-      _S_is_permutation(_Uiterator, _Uiterator, _Uiterator);
-  };
-
-  // See std::is_permutation in N3068.
-  template<typename _Uiterator>
-    bool
-    _Equality_base::
-    _S_is_permutation(_Uiterator __first1, _Uiterator __last1,
-		      _Uiterator __first2)
-    {
-      for (; __first1 != __last1; ++__first1, ++__first2)
-	if (!(*__first1 == *__first2))
-	  break;
-
-      if (__first1 == __last1)
-	return true;
-
-      _Uiterator __last2 = __first2;
-      std::advance(__last2, std::distance(__first1, __last1));
-
-      for (_Uiterator __it1 = __first1; __it1 != __last1; ++__it1)
-	{
-	  _Uiterator __tmp =  __first1;
-	  while (__tmp != __it1 && !bool(*__tmp == *__it1))
-	    ++__tmp;
-
-	  // We've seen this one before.
-	  if (__tmp != __it1)
-	    continue;
-
-	  std::ptrdiff_t __n2 = 0;
-	  for (__tmp = __first2; __tmp != __last2; ++__tmp)
-	    if (*__tmp == *__it1)
-	      ++__n2;
-
-	  if (!__n2)
-	    return false;
-
-	  std::ptrdiff_t __n1 = 0;
-	  for (__tmp = __it1; __tmp != __last1; ++__tmp)
-	    if (*__tmp == *__it1)
-	      ++__n1;
-
-	  if (__n1 != __n2)
-	    return false;
-	}
-      return true;
-    }
-
   /**
    *  Primary class template  _Equality.
    *
@@ -1889,7 +1831,7 @@ namespace __detail
 	   bool _Unique_keys = _Traits::__unique_keys::value>
     struct _Equality;
 
-  /// Specialization.
+  /// unordered_map and unordered_set specializations.
   template<typename _Key, typename _Value, typename _Alloc,
 	   typename _ExtractKey, typename _Equal,
 	   typename _H1, typename _H2, typename _Hash,
@@ -1913,28 +1855,41 @@ namespace __detail
 	      _H1, _H2, _Hash, _RehashPolicy, _Traits, true>::
     _M_equal(const __hashtable& __other) const
     {
+      using __node_base = typename __hashtable::__node_base;
+      using __node_type = typename __hashtable::__node_type;
       const __hashtable* __this = static_cast<const __hashtable*>(this);
-
       if (__this->size() != __other.size())
 	return false;
 
       for (auto __itx = __this->begin(); __itx != __this->end(); ++__itx)
 	{
-	  const auto __ity = __other.find(_ExtractKey()(*__itx));
-	  if (__ity == __other.end() || !bool(*__ity == *__itx))
+	  std::size_t __ybkt = __other._M_bucket_index(__itx._M_cur);
+	  __node_base* __prev_n = __other._M_buckets[__ybkt];
+	  if (!__prev_n)
 	    return false;
+
+	  for (__node_type* __n = static_cast<__node_type*>(__prev_n->_M_nxt);;
+	       __n = __n->_M_next())
+	    {
+	      if (__n->_M_v() == *__itx)
+		break;
+
+	      if (!__n->_M_nxt
+		  || __other._M_bucket_index(__n->_M_next()) != __ybkt)
+		return false;
+	    }
 	}
+
       return true;
     }
 
-  /// Specialization.
+  /// unordered_multiset and unordered_multimap specializations.
   template<typename _Key, typename _Value, typename _Alloc,
 	   typename _ExtractKey, typename _Equal,
 	   typename _H1, typename _H2, typename _Hash,
 	   typename _RehashPolicy, typename _Traits>
     struct _Equality<_Key, _Value, _Alloc, _ExtractKey, _Equal,
 		     _H1, _H2, _Hash, _RehashPolicy, _Traits, false>
-    : public _Equality_base
     {
       using __hashtable = _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
 				     _H1, _H2, _Hash, _RehashPolicy, _Traits>;
@@ -1952,25 +1907,51 @@ namespace __detail
 	      _H1, _H2, _Hash, _RehashPolicy, _Traits, false>::
     _M_equal(const __hashtable& __other) const
     {
+      using __node_base = typename __hashtable::__node_base;
+      using __node_type = typename __hashtable::__node_type;
       const __hashtable* __this = static_cast<const __hashtable*>(this);
-
       if (__this->size() != __other.size())
 	return false;
 
       for (auto __itx = __this->begin(); __itx != __this->end();)
 	{
-	  const auto __xrange = __this->equal_range(_ExtractKey()(*__itx));
-	  const auto __yrange = __other.equal_range(_ExtractKey()(*__itx));
+	  std::size_t __x_count = 1;
+	  auto __itx_end = __itx;
+	  for (++__itx_end; __itx_end != __this->end()
+		 && __this->key_eq()(_ExtractKey()(*__itx),
+				     _ExtractKey()(*__itx_end));
+	       ++__itx_end)
+	    ++__x_count;
+
+	  std::size_t __ybkt = __other._M_bucket_index(__itx._M_cur);
+	  __node_base* __y_prev_n = __other._M_buckets[__ybkt];
+	  if (!__y_prev_n)
+	    return false;
+
+	  __node_type* __y_n = static_cast<__node_type*>(__y_prev_n->_M_nxt);
+	  for (;; __y_n = __y_n->_M_next())
+	    {
+	      if (__this->key_eq()(_ExtractKey()(__y_n->_M_v()),
+				   _ExtractKey()(*__itx)))
+		break;
+
+	      if (!__y_n->_M_nxt
+		  || __other._M_bucket_index(__y_n->_M_next()) != __ybkt)
+		return false;
+	    }
+
+	  typename __hashtable::const_iterator __ity(__y_n);
+	  for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
+	    if (--__x_count == 0)
+	      break;
 
-	  if (std::distance(__xrange.first, __xrange.second)
-	      != std::distance(__yrange.first, __yrange.second))
+	  if (__x_count != 0)
 	    return false;
 
-	  if (!_S_is_permutation(__xrange.first, __xrange.second,
-				 __yrange.first))
+	  if (!std::is_permutation(__itx, __itx_end, __ity))
 	    return false;
 
-	  __itx = __xrange.second;
+	  __itx = __itx_end;
 	}
       return true;
     }
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
index 4b87f62b74a..7252cad29c2 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_multiset/operators/1.cc
@@ -99,8 +99,64 @@ void test01()
   VERIFY( !(ums1 != cums2) );
 }
 
+void test02()
+{
+  std::unordered_multiset<int> us1
+  { 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9 };
+  std::unordered_multiset<int> us2
+  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+
+  VERIFY( us1 == us2 );
+}
+
+struct Hash
+{
+  std::size_t
+  operator()(const std::pair<int, int>& p) const
+  { return p.first; }
+};
+
+struct Equal
+{
+  bool
+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
+  { return lhs.first == rhs.first; }
+};
+
+void test03()
+{
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us1
+  {
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 }
+  };
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us2
+  {
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 0 }, { 8, 0 }, { 9, 0 },
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
+  };
+
+  VERIFY( us1 == us2 );
+
+  std::unordered_multiset<std::pair<int, int>, Hash, Equal> us3
+  {
+    { 5, 1 }, { 6, 1 }, { 7, 1 }, { 8, 1 }, { 9, 1 },
+    { 0, 1 }, { 1, 1 }, { 2, 1 }, { 3, 1 }, { 4, 1 },
+    { 5, 0 }, { 6, 0 }, { 7, 1 }, { 8, 0 }, { 9, 0 },
+    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }, { 4, 0 }
+  };
+
+  VERIFY( us1 != us3 );
+}
+
 int main()
 {
   test01();
+  test02();
+  test03();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
index d841246e2c1..36a45dfa099 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/operators/1.cc
@@ -99,8 +99,56 @@ void test01()
   VERIFY( !(us1 != cus2) );
 }
 
+void test02()
+{
+  std::unordered_set<int> us1 { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+  std::unordered_set<int> us2 { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 };
+
+  VERIFY( us1 == us2 );
+}
+
+struct Hash
+{
+  std::size_t
+  operator()(const std::pair<int, int>& p) const
+  { return p.first; }
+};
+
+struct Equal
+{
+  bool
+  operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const
+  { return lhs.first == rhs.first; }
+};
+
+void test03()
+{
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us1
+  {
+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 },
+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 }
+  };
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us2
+  {
+    { 5, 5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
+    { 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
+  };
+
+  VERIFY( us1 == us2 );
+
+  std::unordered_set<std::pair<int, int>, Hash, Equal> us3
+  {
+    { 5, -5 }, { 6, 6 }, { 7, 7 }, { 8, 8 }, { 9, 9 },
+    { 0, 0  }, { 1, 1 }, { 2, 2 }, { 3, 3 }, { 4, 4 }
+  };
+
+  VERIFY( us1 != us3 );
+}
+
 int main()
 {
   test01();
+  test02();
+  test03();
   return 0;
 }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-14 21:52       ` François Dumont
@ 2020-01-15 22:27         ` Jonathan Wakely
  2020-01-15 22:32           ` Jonathan Wakely
  0 siblings, 1 reply; 12+ messages in thread
From: Jonathan Wakely @ 2020-01-15 22:27 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

On 14/01/20 22:25 +0100, François Dumont wrote:
>On 1/13/20 10:53 PM, Jonathan Wakely wrote:
>>On 13/01/20 22:41 +0100, François Dumont wrote:
>>>
>>>For the multi-keys we could still avoid redundant comparisons when 
>>>_Equal is just doing == on the key type. On unordered_multiset we 
>>>could just avoids the call to is_permuation and on the 
>>>unordered_multimap we could check the is_permutation only on the 
>>>associated value rather than on the std::pair.
>>>
>>I don't think that's necessary, or helpful.
>>
>>The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
>>that you shouldn't be using _Equal at all, and therefore it doesn't
>>matter whether it's std::equal_to or not.
>>
>>
>And it was indeed possible.

Nice!

>    PR libstdc++/91223
>    * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
>    * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
>    (_Equality_base): Remove.
>    (_Equality<>::_M_equal): Review implementation. Use 
>std::is_permutation.
>    * testsuite/23_containers/unordered_multiset/operators/1.cc
>    (Hash, Equal, test02, test03): New.
>    * testsuite/23_containers/unordered_set/operators/1.cc
>    (Hash, Equal, test02, test03): New.
>
>Tested under Linux x86_64.
>
>Ok to commit ?

Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
and fixes a pretty nasty performance bug, so is OK now).

N.B. GCC has moved to Git instead of Subversion. If you don't have Git
access set up let me know and I can commit the patch for you.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-15 22:27         ` Jonathan Wakely
@ 2020-01-15 22:32           ` Jonathan Wakely
  2020-01-16  7:38             ` François Dumont
  0 siblings, 1 reply; 12+ messages in thread
From: Jonathan Wakely @ 2020-01-15 22:32 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

On 15/01/20 21:48 +0000, Jonathan Wakely wrote:
>On 14/01/20 22:25 +0100, François Dumont wrote:
>>On 1/13/20 10:53 PM, Jonathan Wakely wrote:
>>>On 13/01/20 22:41 +0100, François Dumont wrote:
>>>>
>>>>For the multi-keys we could still avoid redundant comparisons 
>>>>when _Equal is just doing == on the key type. On 
>>>>unordered_multiset we could just avoids the call to 
>>>>is_permuation and on the unordered_multimap we could check the 
>>>>is_permutation only on the associated value rather than on the 
>>>>std::pair.
>>>>
>>>I don't think that's necessary, or helpful.
>>>
>>>The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
>>>that you shouldn't be using _Equal at all, and therefore it doesn't
>>>matter whether it's std::equal_to or not.
>>>
>>>
>>And it was indeed possible.
>
>Nice!
>
>>    PR libstdc++/91223
>>    * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
>>    * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
>>    (_Equality_base): Remove.
>>    (_Equality<>::_M_equal): Review implementation. Use 
>>std::is_permutation.
>>    * testsuite/23_containers/unordered_multiset/operators/1.cc
>>    (Hash, Equal, test02, test03): New.
>>    * testsuite/23_containers/unordered_set/operators/1.cc
>>    (Hash, Equal, test02, test03): New.
>>
>>Tested under Linux x86_64.
>>
>>Ok to commit ?
>
>Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
>and fixes a pretty nasty performance bug, so is OK now).
>
>N.B. GCC has moved to Git instead of Subversion. If you don't have Git
>access set up let me know and I can commit the patch for you.

P.S. some performance numbers using the code in the bug report
(calling Nested(n+1) and Nested(n) where n is the number of levels
shown) ...

Before:

10 levels of nesting, 0.000090 seconds
20 levels of nesting, 0.082400 seconds
22 levels of nesting, 0.285758 seconds
24 levels of nesting, 1.146782 seconds
26 levels of nesting, 4.659524 seconds
28 levels of nesting, 17.739022 seconds
30 levels of nesting, 76.288977 seconds

real    1m40.204s
user    1m40.039s
sys     0m0.005s

After:

10 levels of nesting, 0.000001 seconds
20 levels of nesting, 0.000001 seconds
22 levels of nesting, 0.000001 seconds
24 levels of nesting, 0.000001 seconds
26 levels of nesting, 0.000001 seconds
28 levels of nesting, 0.000001 seconds
30 levels of nesting, 0.000001 seconds
20000 levels of nesting, 0.002905 seconds

real    0m0.002s
user    0m0.001s
sys     0m0.001s

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-15 22:32           ` Jonathan Wakely
@ 2020-01-16  7:38             ` François Dumont
  2020-01-16 13:53               ` Jonathan Wakely
  0 siblings, 1 reply; 12+ messages in thread
From: François Dumont @ 2020-01-16  7:38 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: libstdc++, gcc-patches

On 1/15/20 10:52 PM, Jonathan Wakely wrote:
> On 15/01/20 21:48 +0000, Jonathan Wakely wrote:
>> On 14/01/20 22:25 +0100, François Dumont wrote:
>>> On 1/13/20 10:53 PM, Jonathan Wakely wrote:
>>>> On 13/01/20 22:41 +0100, François Dumont wrote:
>>>>>
>>>>> For the multi-keys we could still avoid redundant comparisons when 
>>>>> _Equal is just doing == on the key type. On unordered_multiset we 
>>>>> could just avoids the call to is_permuation and on the 
>>>>> unordered_multimap we could check the is_permutation only on the 
>>>>> associated value rather than on the std::pair.
>>>>>
>>>> I don't think that's necessary, or helpful.
>>>>
>>>> The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
>>>> that you shouldn't be using _Equal at all, and therefore it doesn't
>>>> matter whether it's std::equal_to or not.
>>>>
>>>>
>>> And it was indeed possible.
>>
>> Nice!
>>
>>>     PR libstdc++/91223
>>>     * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
>>>     * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
>>>     (_Equality_base): Remove.
>>>     (_Equality<>::_M_equal): Review implementation. Use 
>>> std::is_permutation.
>>>     * testsuite/23_containers/unordered_multiset/operators/1.cc
>>>     (Hash, Equal, test02, test03): New.
>>>     * testsuite/23_containers/unordered_set/operators/1.cc
>>>     (Hash, Equal, test02, test03): New.
>>>
>>> Tested under Linux x86_64.
>>>
>>> Ok to commit ?
>>
>> Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
>> and fixes a pretty nasty performance bug, so is OK now).
>>
>> N.B. GCC has moved to Git instead of Subversion. If you don't have Git
>> access set up let me know and I can commit the patch for you.

I haven't done the move yet and won't be able to do it before the 
week-end. So please proceed to the commit for me, thanks.

>
> P.S. some performance numbers using the code in the bug report
> (calling Nested(n+1) and Nested(n) where n is the number of levels
> shown) ...
>
> Before:
>
> 10 levels of nesting, 0.000090 seconds
> 20 levels of nesting, 0.082400 seconds
> 22 levels of nesting, 0.285758 seconds
> 24 levels of nesting, 1.146782 seconds
> 26 levels of nesting, 4.659524 seconds
> 28 levels of nesting, 17.739022 seconds
> 30 levels of nesting, 76.288977 seconds
>
> real    1m40.204s
> user    1m40.039s
> sys     0m0.005s
>
> After:
>
> 10 levels of nesting, 0.000001 seconds
> 20 levels of nesting, 0.000001 seconds
> 22 levels of nesting, 0.000001 seconds
> 24 levels of nesting, 0.000001 seconds
> 26 levels of nesting, 0.000001 seconds
> 28 levels of nesting, 0.000001 seconds
> 30 levels of nesting, 0.000001 seconds
> 20000 levels of nesting, 0.002905 seconds
>
> real    0m0.002s
> user    0m0.001s
> sys     0m0.001s
>
Very nice indeed !

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-16  7:38             ` François Dumont
@ 2020-01-16 13:53               ` Jonathan Wakely
  2020-01-16 16:05                 ` Jonathan Wakely
  0 siblings, 1 reply; 12+ messages in thread
From: Jonathan Wakely @ 2020-01-16 13:53 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

On 16/01/20 07:42 +0100, François Dumont wrote:
>On 1/15/20 10:52 PM, Jonathan Wakely wrote:
>>On 15/01/20 21:48 +0000, Jonathan Wakely wrote:
>>>On 14/01/20 22:25 +0100, François Dumont wrote:
>>>>On 1/13/20 10:53 PM, Jonathan Wakely wrote:
>>>>>On 13/01/20 22:41 +0100, François Dumont wrote:
>>>>>>
>>>>>>For the multi-keys we could still avoid redundant 
>>>>>>comparisons when _Equal is just doing == on the key type. On 
>>>>>>unordered_multiset we could just avoids the call to 
>>>>>>is_permuation and on the unordered_multimap we could check 
>>>>>>the is_permutation only on the associated value rather than 
>>>>>>on the std::pair.
>>>>>>
>>>>>I don't think that's necessary, or helpful.
>>>>>
>>>>>The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
>>>>>that you shouldn't be using _Equal at all, and therefore it doesn't
>>>>>matter whether it's std::equal_to or not.
>>>>>
>>>>>
>>>>And it was indeed possible.
>>>
>>>Nice!
>>>
>>>>    PR libstdc++/91223
>>>>    * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
>>>>    * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
>>>>    (_Equality_base): Remove.
>>>>    (_Equality<>::_M_equal): Review implementation. Use 
>>>>std::is_permutation.
>>>>    * testsuite/23_containers/unordered_multiset/operators/1.cc
>>>>    (Hash, Equal, test02, test03): New.
>>>>    * testsuite/23_containers/unordered_set/operators/1.cc
>>>>    (Hash, Equal, test02, test03): New.
>>>>
>>>>Tested under Linux x86_64.
>>>>
>>>>Ok to commit ?
>>>
>>>Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
>>>and fixes a pretty nasty performance bug, so is OK now).
>>>
>>>N.B. GCC has moved to Git instead of Subversion. If you don't have Git
>>>access set up let me know and I can commit the patch for you.
>
>I haven't done the move yet and won't be able to do it before the 
>week-end. So please proceed to the commit for me, thanks.

No problem, I can do that.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-16 13:53               ` Jonathan Wakely
@ 2020-01-16 16:05                 ` Jonathan Wakely
  2020-01-16 22:39                   ` François Dumont
  0 siblings, 1 reply; 12+ messages in thread
From: Jonathan Wakely @ 2020-01-16 16:05 UTC (permalink / raw)
  To: François Dumont; +Cc: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3748 bytes --]

On 16/01/20 13:25 +0000, Jonathan Wakely wrote:
>On 16/01/20 07:42 +0100, François Dumont wrote:
>>On 1/15/20 10:52 PM, Jonathan Wakely wrote:
>>>On 15/01/20 21:48 +0000, Jonathan Wakely wrote:
>>>>On 14/01/20 22:25 +0100, François Dumont wrote:
>>>>>On 1/13/20 10:53 PM, Jonathan Wakely wrote:
>>>>>>On 13/01/20 22:41 +0100, François Dumont wrote:
>>>>>>>
>>>>>>>For the multi-keys we could still avoid redundant 
>>>>>>>comparisons when _Equal is just doing == on the key type. 
>>>>>>>On unordered_multiset we could just avoids the call to 
>>>>>>>is_permuation and on the unordered_multimap we could check 
>>>>>>>the is_permutation only on the associated value rather 
>>>>>>>than on the std::pair.
>>>>>>>
>>>>>>I don't think that's necessary, or helpful.
>>>>>>
>>>>>>The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
>>>>>>that you shouldn't be using _Equal at all, and therefore it doesn't
>>>>>>matter whether it's std::equal_to or not.
>>>>>>
>>>>>>
>>>>>And it was indeed possible.
>>>>
>>>>Nice!
>>>>
>>>>>    PR libstdc++/91223
>>>>>    * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
>>>>>    * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
>>>>>    (_Equality_base): Remove.
>>>>>    (_Equality<>::_M_equal): Review implementation. Use 
>>>>>std::is_permutation.
>>>>>    * testsuite/23_containers/unordered_multiset/operators/1.cc
>>>>>    (Hash, Equal, test02, test03): New.
>>>>>    * testsuite/23_containers/unordered_set/operators/1.cc
>>>>>    (Hash, Equal, test02, test03): New.
>>>>>
>>>>>Tested under Linux x86_64.
>>>>>
>>>>>Ok to commit ?
>>>>
>>>>Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
>>>>and fixes a pretty nasty performance bug, so is OK now).
>>>>
>>>>N.B. GCC has moved to Git instead of Subversion. If you don't have Git
>>>>access set up let me know and I can commit the patch for you.
>>
>>I haven't done the move yet and won't be able to do it before the 
>>week-end. So please proceed to the commit for me, thanks.
>
>No problem, I can do that.

Your patch is now committed to trunk. Thanks for the major
improvement.

I had a look at std::is_permutation and I think we can make some
simplifications to the 4-argument overload, and we can share most of
the code between the 3-arg and 4-arg overloads (once they've confirmed
the lengths are the same they do exactly the same thing). See the
attached patch. This should probably wait for stage1 though.

I also wanted to add some comments to the _Equality::_M_equal
specialiation for unordered_multimap/multiset to explain what the code
was doing, and had some more ideas. See patch again.

It looks like this loop can potentially visit every element of
__other, instead of stopping at the end of the bucket:

   typename __hashtable::const_iterator __ity(__y_n);
   for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
     if (--__x_count == 0)
       break;

Consider a case like this:

unordered_multiset<int> a{1, 2, 3, 4};
for (int i = 0; i <10000; ++i)
   a.insert(1);
unordered_multiset<int> b{1, 2, 3, 4};
for (int i = 0; i <10000; ++i)
   b.insert(2);

When doing a == b we'll find 10000 elements in a with key '1',
and find one element in b with that key. But then we iterate through
every element in b after that one, even though they have different
keys and are probably in different buckets.

Instead of just iterating from __ity to __other.end(), can we use a
local iterator so we stop at the end of the bucket?

This seems to make the PR91263 example *very* slightly slower, but
makes the example above significantly faster.

What do you think?



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 6060 bytes --]

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 6503d1518d3..6d5d3b2fced 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -3588,6 +3588,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       return std::make_pair(*__p.first, *__p.second);
     }
 
+  // precondition: distance(__first1, __last1) == distance(__first2, __last2).
+  template<typename _FwdIter1, typename _FwdIter2,
+	   typename _BinaryPredicate = __gnu_cxx::__ops::_Iter_equal_to_iter>
+    _GLIBCXX20_CONSTEXPR
+    bool
+    __is_permutation_impl(_FwdIter1 __first1, _FwdIter1 __last1,
+			  _FwdIter2 __first2, _FwdIter2 __last2,
+			  _BinaryPredicate __pred = {})
+    {
+      for (auto __scan = __first1; __scan != __last1; ++__scan)
+	{
+	  if (__scan != std::__find_if(__first1, __scan,
+			  __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan)))
+	    continue; // We've seen this one before.
+
+	  auto __matches
+	    = std::__count_if(__first2, __last2,
+			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan));
+	  if (0 == __matches ||
+	      std::__count_if(__scan, __last1,
+			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan))
+	      != __matches)
+	    return false;
+	}
+      return true;
+    }
+
   template<typename _ForwardIterator1, typename _ForwardIterator2,
 	   typename _BinaryPredicate>
     _GLIBCXX20_CONSTEXPR
@@ -3604,26 +3631,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       if (__first1 == __last1)
 	return true;
 
-      // Establish __last2 assuming equal ranges by iterating over the
-      // rest of the list.
-      _ForwardIterator2 __last2 = __first2;
-      std::advance(__last2, std::distance(__first1, __last1));
-      for (_ForwardIterator1 __scan = __first1; __scan != __last1; ++__scan)
-	{
-	  if (__scan != std::__find_if(__first1, __scan,
-			  __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan)))
-	    continue; // We've seen this one before.
-	  
-	  auto __matches
-	    = std::__count_if(__first2, __last2,
-			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan));
-	  if (0 == __matches ||
-	      std::__count_if(__scan, __last1,
-			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan))
-	      != __matches)
-	    return false;
-	}
-      return true;
+      // Establish __last2 by iterating over the rest of the list.
+      auto __last2 = std::next(__first2, std::distance(__first1, __last1));
+      return std::__is_permutation_impl(__first1, __last1, __first2, __last2,
+					__pred);
     }
 
   /**
@@ -3720,36 +3731,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	if (!__pred(__first1, __first2))
 	  break;
 
-      if (__ra_iters)
+      if (__first1 == __last1)
 	{
-	  if (__first1 == __last1)
+	  if (__ra_iters)
 	    return true;
-	}
-      else
-	{
-	  auto __d1 = std::distance(__first1, __last1);
-	  auto __d2 = std::distance(__first2, __last2);
-	  if (__d1 == 0 && __d2 == 0)
-	    return true;
-	  if (__d1 != __d2)
-	    return false;
+	  return __first2 == __last2;
 	}
 
-      for (_ForwardIterator1 __scan = __first1; __scan != __last1; ++__scan)
+      auto __d1 = std::distance(__first1, __last1);
+      while (__first2 != __last2 && __d1 > 0)
 	{
-	  if (__scan != std::__find_if(__first1, __scan,
-			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan)))
-	    continue; // We've seen this one before.
-
-	  auto __matches = std::__count_if(__first2, __last2,
-		__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan));
-	  if (0 == __matches
-	      || std::__count_if(__scan, __last1,
-			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan))
-	      != __matches)
-	    return false;
+	  ++__first2;
+	  --__d1;
 	}
-      return true;
+      if (__d1 > 0 || __first2 != __last2)
+	return false;
+
+      return std::__is_permutation_impl(__first1, __last1, __first2, __last2,
+					__pred);
     }
 
   /**
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 4024e6c37fa..c9188920cd8 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -1915,6 +1915,7 @@ namespace __detail
 
       for (auto __itx = __this->begin(); __itx != __this->end();)
 	{
+	  // Find all the elements in *this with keys equivalent to *itx.
 	  std::size_t __x_count = 1;
 	  auto __itx_end = __itx;
 	  for (++__itx_end; __itx_end != __this->end()
@@ -1922,32 +1923,21 @@ namespace __detail
 				     _ExtractKey()(*__itx_end));
 	       ++__itx_end)
 	    ++__x_count;
+	  // Now [itx, itx_end) is the result of equal_range(key-of(*itx))
+	  // and xcount is the result of std::distance(itx, itx_end).
 
-	  std::size_t __ybkt = __other._M_bucket_index(__itx._M_cur);
-	  __node_base* __y_prev_n = __other._M_buckets[__ybkt];
-	  if (!__y_prev_n)
-	    return false;
-
-	  __node_type* __y_n = static_cast<__node_type*>(__y_prev_n->_M_nxt);
-	  for (;; __y_n = __y_n->_M_next())
-	    {
-	      if (__this->key_eq()(_ExtractKey()(__y_n->_M_v()),
-				   _ExtractKey()(*__itx)))
-		break;
-
-	      if (!__y_n->_M_nxt
-		  || __other._M_bucket_index(__y_n->_M_next()) != __ybkt)
-		return false;
-	    }
-
-	  typename __hashtable::const_iterator __ity(__y_n);
-	  for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
-	    if (--__x_count == 0)
-	      break;
-
-	  if (__x_count != 0)
+	  // Find the first element in __other with key equivalent to *itx.
+	  size_t __ybkt = __other._M_bucket_index(__itx._M_cur);
+	  auto __ity = __other.begin(__ybkt), __ity_end = __other.end(__ybkt);
+	  while (__ity != __ity_end
+		  && !__this->key_eq()(_ExtractKey()(*__ity),
+				       _ExtractKey()(*__itx)))
+	    ++__ity;
+
+	  if (std::distance(__ity, __ity_end) < __x_count)
 	    return false;
 
+	  using const_iterator = typename __hashtable::const_iterator;
 	  if (!std::is_permutation(__itx, __itx_end, __ity))
 	    return false;
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] libstdc++/91223 Improve unordered containers == operator
  2020-01-16 16:05                 ` Jonathan Wakely
@ 2020-01-16 22:39                   ` François Dumont
  0 siblings, 0 replies; 12+ messages in thread
From: François Dumont @ 2020-01-16 22:39 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: libstdc++, gcc-patches

On 1/16/20 5:01 PM, Jonathan Wakely wrote:
> On 16/01/20 13:25 +0000, Jonathan Wakely wrote:
>> On 16/01/20 07:42 +0100, François Dumont wrote:
>>> On 1/15/20 10:52 PM, Jonathan Wakely wrote:
>>>> On 15/01/20 21:48 +0000, Jonathan Wakely wrote:
>>>>> On 14/01/20 22:25 +0100, François Dumont wrote:
>>>>>> On 1/13/20 10:53 PM, Jonathan Wakely wrote:
>>>>>>> On 13/01/20 22:41 +0100, François Dumont wrote:
>>>>>>>>
>>>>>>>> For the multi-keys we could still avoid redundant comparisons 
>>>>>>>> when _Equal is just doing == on the key type. On 
>>>>>>>> unordered_multiset we could just avoids the call to 
>>>>>>>> is_permuation and on the unordered_multimap we could check the 
>>>>>>>> is_permutation only on the associated value rather than on the 
>>>>>>>> std::pair.
>>>>>>>>
>>>>>>> I don't think that's necessary, or helpful.
>>>>>>>
>>>>>>> The idea of 
>>>>>>> https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
>>>>>>> that you shouldn't be using _Equal at all, and therefore it doesn't
>>>>>>> matter whether it's std::equal_to or not.
>>>>>>>
>>>>>>>
>>>>>> And it was indeed possible.
>>>>>
>>>>> Nice!
>>>>>
>>>>>>     PR libstdc++/91223
>>>>>>     * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> 
>>>>>> friend.
>>>>>>     * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
>>>>>>     (_Equality_base): Remove.
>>>>>>     (_Equality<>::_M_equal): Review implementation. Use 
>>>>>> std::is_permutation.
>>>>>>     * testsuite/23_containers/unordered_multiset/operators/1.cc
>>>>>>     (Hash, Equal, test02, test03): New.
>>>>>>     * testsuite/23_containers/unordered_set/operators/1.cc
>>>>>>     (Hash, Equal, test02, test03): New.
>>>>>>
>>>>>> Tested under Linux x86_64.
>>>>>>
>>>>>> Ok to commit ?
>>>>>
>>>>> Yes, OK for trunk (we're in stage4 but your patch was posted in 
>>>>> stage3
>>>>> and fixes a pretty nasty performance bug, so is OK now).
>>>>>
>>>>> N.B. GCC has moved to Git instead of Subversion. If you don't have 
>>>>> Git
>>>>> access set up let me know and I can commit the patch for you.
>>>
>>> I haven't done the move yet and won't be able to do it before the 
>>> week-end. So please proceed to the commit for me, thanks.
>>
>> No problem, I can do that.
>
> Your patch is now committed to trunk. Thanks for the major
> improvement.
>
> I had a look at std::is_permutation and I think we can make some
> simplifications to the 4-argument overload, and we can share most of
> the code between the 3-arg and 4-arg overloads (once they've confirmed
> the lengths are the same they do exactly the same thing). See the
> attached patch. This should probably wait for stage1 though.
>
> I also wanted to add some comments to the _Equality::_M_equal
> specialiation for unordered_multimap/multiset to explain what the code
> was doing, and had some more ideas. See patch again.
>
> It looks like this loop can potentially visit every element of
> __other, instead of stopping at the end of the bucket:
>
>   typename __hashtable::const_iterator __ity(__y_n);
>   for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
>     if (--__x_count == 0)
>       break;
>
> Consider a case like this:
>
> unordered_multiset<int> a{1, 2, 3, 4};
> for (int i = 0; i <10000; ++i)
>   a.insert(1);
> unordered_multiset<int> b{1, 2, 3, 4};
> for (int i = 0; i <10000; ++i)
>   b.insert(2);
>
> When doing a == b we'll find 10000 elements in a with key '1',
> and find one element in b with that key. But then we iterate through
> every element in b after that one, even though they have different
> keys and are probably in different buckets.
>
> Instead of just iterating from __ity to __other.end(), can we use a
> local iterator so we stop at the end of the bucket?
>
> This seems to make the PR91263 example *very* slightly slower, but
> makes the example above significantly faster.
>
> What do you think?
>
>
The hashtable implementation is doing its best to provide good 
performances as long as the user does its part of the job. Mainly 
provide a good hash to distrubute elements smoothly throughout the 
buckets. But also avoid this kind of unordered_multiset.

If you check if you don't move out of bucket you'll have to pay for the 
bucket computation (subject of PR 68303) or perform a redundant _Equal 
to check when we left the range of equivalent elements like it used to 
be done. Current implementation leave it to the std::is_permutation to 
do that which in normal situation will be better I think.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-01-16 21:38 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-10 17:58 [PATCH] libstdc++/91223 Improve unordered containers == operator François Dumont
2020-01-10 22:10 ` Jonathan Wakely
2020-01-10 22:31   ` Jonathan Wakely
2020-01-13 22:01   ` François Dumont
2020-01-13 22:56     ` Jonathan Wakely
2020-01-14 21:52       ` François Dumont
2020-01-15 22:27         ` Jonathan Wakely
2020-01-15 22:32           ` Jonathan Wakely
2020-01-16  7:38             ` François Dumont
2020-01-16 13:53               ` Jonathan Wakely
2020-01-16 16:05                 ` Jonathan Wakely
2020-01-16 22:39                   ` François Dumont

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).