public inbox for libstdc++-cvs@sourceware.org
* [gcc r12-10] libstdc++: Refactor/cleanup of C++20 atomic wait implementation
@ 2021-04-20 14:24 Jonathan Wakely
From: Jonathan Wakely @ 2021-04-20 14:24 UTC
  To: gcc-cvs, libstdc++-cvs

https://gcc.gnu.org/g:b52aef3a8cbcc817c18c474806a29ad7f3453f6d

commit r12-10-gb52aef3a8cbcc817c18c474806a29ad7f3453f6d
Author: Thomas Rodgers <trodgers@redhat.com>
Date:   Tue Apr 20 11:54:27 2021 +0100

    libstdc++: Refactor/cleanup of C++20 atomic wait implementation
    
    This is a substantial rewrite of the atomic wait/notify (and timed wait
    counterparts) implementation.
    
    The previous __platform_wait looped on EINTR; however, this behavior
    is not required by the standard. A new _GLIBCXX_HAVE_PLATFORM_WAIT
    macro now controls whether wait/notify are implemented using a
    platform-specific primitive or with a platform-agnostic
    mutex/condvar. This patch only supplies a definition for Linux
    futexes. A future update could add support for __ulock_wait/wake on
    Darwin, for instance.
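
    As a rough illustration of the macro-controlled split described
    above, the sketch below selects between a single futex wait and a
    mutex/condvar fallback. The names in namespace sketch and the
    SKETCH_HAVE_PLATFORM_WAIT macro are hypothetical stand-ins, not the
    library's own code, and the __atomic_load_n builtin assumes GCC or
    Clang.

```cpp
#include <cassert>
#include <climits>

#if defined(__linux__)
# include <linux/futex.h>
# include <sys/syscall.h>
# include <unistd.h>
// Hypothetical stand-in for _GLIBCXX_HAVE_PLATFORM_WAIT.
# define SKETCH_HAVE_PLATFORM_WAIT 1
#else
# include <condition_variable>
# include <mutex>
#endif

namespace sketch
{
#ifdef SKETCH_HAVE_PLATFORM_WAIT
  // A single futex wait. It may return on a wake-up, on a value
  // mismatch (EAGAIN) or on a signal (EINTR); it does not loop.
  inline void
  platform_wait(const int* addr, int old) noexcept
  { syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, old, nullptr); }

  inline void
  platform_notify(const int* addr, bool all) noexcept
  { syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, all ? INT_MAX : 1); }
#else
  // Platform-agnostic fallback: one mutex/condvar pair.
  inline std::mutex mtx;
  inline std::condition_variable cv;

  inline void
  platform_wait(const int* addr, int old) noexcept
  {
    std::unique_lock<std::mutex> lock(mtx);
    if (__atomic_load_n(addr, __ATOMIC_ACQUIRE) == old)
      cv.wait(lock);
  }

  inline void
  platform_notify(const int*, bool all) noexcept
  {
    { std::lock_guard<std::mutex> lock(mtx); } // order against waiters
    if (all) cv.notify_all(); else cv.notify_one();
  }
#endif

  // Callers re-check the value themselves, so an early return from
  // platform_wait (for example on EINTR) is harmless.
  inline void
  wait_while_equal(const int* addr, int old) noexcept
  {
    while (__atomic_load_n(addr, __ATOMIC_ACQUIRE) == old)
      platform_wait(addr, old);
  }

  // Usage: returns at once because the value already differs from 0.
  inline void
  demo()
  {
    int v = 1;
    wait_while_equal(&v, 0);
    assert(v == 1);
  }
}
```

    Because the re-check lives in the caller, the primitive itself has
    no need to retry on EINTR.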
    
    The members of __waiters were lifted to a new base class,
    __waiter_pool_base. The members are now arranged so that
    sizeof(__waiter_pool_base) fits in two cache lines (on platforms
    with at least 64-byte cache lines). The definition also uses
    hardware_destructive_interference_size for the alignment if it is
    available.
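
    The layout idea can be sketched as follows. The member names and
    the two-counter struct are illustrative assumptions only; the real
    class also holds a mutex and condvar on platforms without a native
    wait primitive.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace sketch
{
#ifdef __cpp_lib_hardware_interference_size
  constexpr std::size_t cache_align
    = std::hardware_destructive_interference_size;
#else
  constexpr std::size_t cache_align = 64; // assumed cache line size
#endif

  // Each counter starts its own cache line, so threads updating one
  // do not invalidate the line holding the other.
  struct waiter_pool_base
  {
    alignas(cache_align) int wait_count = 0; // hypothetical member
    alignas(cache_align) int version = 0;    // hypothetical member
  };

  static_assert(sizeof(waiter_pool_base) == 2 * cache_align,
		"two cache lines in total");
}
```

    Keeping the two counters on separate lines avoids false sharing
    between the threads that touch each one.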
    
    The __waiters type is now specific to untimed waits, and is renamed to
    __waiter_pool. Timed waits have a corresponding __timed_waiter_pool
    type. Much of the code has been moved from the previous __atomic_wait()
    free function to the __waiter_base template, and a __waiter derived
    type is provided to implement the untimed wait operations. A similar
    change has been made to the timed wait implementation.
    
    The __atomic_spin code has been extended to take a spin policy which is
    invoked after the initial busy wait loop. The default policy is to
    return from the spin. The timed wait code adds a timed backoff spinning
    policy. The code from <thread> which implements this_thread::sleep_for
    and this_thread::sleep_until has been moved to a new
    <bits/this_thread_sleep.h> header, which allows the thread sleep code
    to be consumed without pulling in the whole of <thread>.
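
    A minimal sketch of a spin loop that defers to a policy once the
    initial busy wait is exhausted. The spin counts (12 and 4) and the
    signature follow the patch; the loop bodies are an assumption
    about how such a function might be arranged, and all names are
    hypothetical.

```cpp
#include <cassert>
#include <thread>

namespace sketch
{
  constexpr int spin_count_1 = 12;
  constexpr int spin_count_2 = 4;

  // Default policy: give up as soon as the busy wait is exhausted.
  struct default_spin_policy
  {
    bool
    operator()() const noexcept
    { return false; }
  };

  // Returns true if pred() became true while spinning.
  template<typename Pred, typename Spin = default_spin_policy>
    bool
    atomic_spin(Pred& pred, Spin spin = Spin{ }) noexcept
    {
      for (int i = 0; i < spin_count_1; ++i)
	if (pred())
	  return true;

      for (int i = 0; i < spin_count_2; ++i)
	{
	  if (pred())
	    return true;
	  std::this_thread::yield();
	}

      // The policy decides whether to keep going, for example by
      // sleeping with a backoff until a deadline passes.
      while (spin())
	if (pred())
	  return true;

      return false;
    }
}
```

    With the default policy the predicate is tested 16 times before
    the caller falls back to a blocking wait.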
    
    The entry points into the wait/notify code have been restructured to
    support either:
       * Testing the current value of the atomic stored at the given address
         and waiting on a notification.
       * Applying a predicate to determine if the wait was satisfied.
    The entry points were renamed to make it clear that the wait and wake
    operations operate on addresses. The first variant takes the expected
    value and a function which returns the current value that should be used
    in comparison operations; these operations are named with a _v suffix
    (i.e. 'value'). All atomic<_Tp> wait/notify operations use the first
    variant. Barriers, latches and semaphores use the predicate variant.
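
    The two variants can be sketched over a single mutex/condvar pair.
    This is a simplification under stated assumptions: the address
    argument is ignored, the names are hypothetical, and the bytewise
    comparison in same_value is only one plausible way to keep the
    comparison in one place.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstring>
#include <mutex>
#include <type_traits>

namespace sketch
{
  // One global pair; a real implementation selects one by address.
  inline std::mutex mtx;
  inline std::condition_variable cv;

  // The single place that decides whether two values match for the
  // purposes of waiting (assumption: bytewise comparison).
  template<typename T>
    bool
    same_value(const T& a, const T& b) noexcept
    {
      static_assert(std::is_trivially_copyable_v<T>);
      return std::memcmp(&a, &b, sizeof(T)) == 0;
    }

  // Value variant: block while vfn() still yields the old value.
  template<typename T, typename ValFn>
    void
    wait_address_v(const void* /* addr */, T old, ValFn vfn)
    {
      std::unique_lock<std::mutex> lock(mtx);
      cv.wait(lock,
	      [&] { return !same_value(old, static_cast<T>(vfn())); });
    }

  // Predicate variant: block until pred() returns true.
  template<typename Pred>
    void
    wait_address(const void* /* addr */, Pred pred)
    {
      std::unique_lock<std::mutex> lock(mtx);
      cv.wait(lock, pred);
    }

  inline void
  notify_address(const void* /* addr */, bool all)
  {
    { std::lock_guard<std::mutex> lock(mtx); } // order against waiters
    if (all) cv.notify_all(); else cv.notify_one();
  }
}
```

    An atomic-style caller passes the expected value and a lambda that
    loads the current one; a semaphore-style caller passes only a
    predicate such as "count is greater than zero".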
    
    This change also centralizes what it means to compare values for the
    purposes of atomic<T>::wait, rather than scattering that logic through
    individual predicates.
    
    This change also centralizes the repetitive code which adjusts for
    different user supplied clocks (this should be moved elsewhere
    and all such adjustments should use a common implementation).
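
    The clock adjustment can be illustrated with standard <chrono>
    only. The sketch mirrors the shape of the __to_wait_clock helper
    added by this patch, with hypothetical names in place of the
    reserved ones.

```cpp
#include <cassert>
#include <chrono>

namespace sketch
{
  using wait_clock = std::chrono::steady_clock;

  // Generic clocks: measure the remaining time on the caller's clock
  // and re-base it onto the wait clock, rounding up so that the wait
  // never ends early.
  template<typename Clock, typename Dur>
    wait_clock::time_point
    to_wait_clock(const std::chrono::time_point<Clock, Dur>& atime) noexcept
    {
      const typename Clock::time_point c_entry = Clock::now();
      const wait_clock::time_point w_entry = wait_clock::now();
      const auto delta = atime - c_entry;
      return w_entry + std::chrono::ceil<wait_clock::duration>(delta);
    }

  // Already on the wait clock: only the duration type is adjusted.
  template<typename Dur>
    wait_clock::time_point
    to_wait_clock(const std::chrono::time_point<wait_clock, Dur>& atime) noexcept
    { return std::chrono::ceil<wait_clock::duration>(atime); }
}
```

    Callers that wake up on the converted deadline still need to
    re-check the original clock, since the two clocks may drift apart.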
    
    This change also removes the hashing of the pointer and uses
    the pointer value directly for indexing into the waiters table.
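
    Indexing a fixed-size table directly by pointer value might look
    like the sketch below. The table size of 16 and the shift by two
    bits are assumptions chosen for illustration, as are all names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch
{
  struct waiter_pool
  { int waiters = 0; };

  constexpr std::size_t table_size = 16; // assumed table size

  // Map an address to a table slot without hashing: drop the low bits
  // that are always zero for 4-byte aligned objects, then wrap.
  inline waiter_pool&
  pool_for(const void* addr) noexcept
  {
    static waiter_pool table[table_size];
    const auto key
      = (reinterpret_cast<std::uintptr_t>(addr) >> 2) % table_size;
    return table[key];
  }
}
```

    Neighbouring 4-byte objects land in different slots, while
    addresses 64 bytes apart share one, so unrelated waiters can
    collide and must tolerate spurious wake-ups.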
    
    libstdc++-v3/ChangeLog:
    
            * include/Makefile.am: Add new <bits/this_thread_sleep.h> header.
            * include/Makefile.in: Regenerate.
            * include/bits/this_thread_sleep.h: New file.
            * include/bits/atomic_base.h: Adjust all calls
            to __atomic_wait/__atomic_notify for new call signatures.
            * include/bits/atomic_timed_wait.h: Extensive rewrite.
            * include/bits/atomic_wait.h: Likewise.
            * include/bits/semaphore_base.h: Adjust all calls
            to __atomic_wait/__atomic_notify for new call signatures.
            * include/std/atomic: Likewise.
            * include/std/barrier: Likewise.
            * include/std/latch: Likewise.
            * include/std/semaphore: Likewise.
            * include/std/thread (this_thread::sleep_for)
            (this_thread::sleep_until): Move to new header.
            * testsuite/29_atomics/atomic/wait_notify/bool.cc: Simplify
            test.
            * testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
            * testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
            * testsuite/29_atomics/atomic_flag/wait_notify/1.cc: Likewise.
            * testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
            * testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
            * testsuite/29_atomics/atomic_ref/wait_notify.cc: Likewise.

Diff:
---
 libstdc++-v3/include/Makefile.am                   |   1 +
 libstdc++-v3/include/Makefile.in                   |   1 +
 libstdc++-v3/include/bits/atomic_base.h            |  39 +-
 libstdc++-v3/include/bits/atomic_timed_wait.h      | 465 ++++++++++++++-------
 libstdc++-v3/include/bits/atomic_wait.h            | 457 +++++++++++++-------
 libstdc++-v3/include/bits/semaphore_base.h         | 191 ++++-----
 libstdc++-v3/include/bits/this_thread_sleep.h      | 119 ++++++
 libstdc++-v3/include/std/atomic                    |  15 +-
 libstdc++-v3/include/std/barrier                   |  13 +-
 libstdc++-v3/include/std/latch                     |   8 +-
 libstdc++-v3/include/std/semaphore                 |   9 +-
 libstdc++-v3/include/std/thread                    |  68 +--
 .../29_atomics/atomic/wait_notify/bool.cc          |  37 +-
 .../29_atomics/atomic/wait_notify/generic.cc       |  19 +-
 .../29_atomics/atomic/wait_notify/pointers.cc      |  36 +-
 .../29_atomics/atomic_flag/wait_notify/1.cc        |  37 +-
 .../29_atomics/atomic_float/wait_notify.cc         |  26 +-
 .../29_atomics/atomic_integral/wait_notify.cc      |  73 ++--
 .../testsuite/29_atomics/atomic_ref/wait_notify.cc |  74 +---
 19 files changed, 978 insertions(+), 710 deletions(-)

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index f24a5489e8e..40a41ef2a1c 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -225,6 +225,7 @@ bits_headers = \
 	${bits_srcdir}/streambuf.tcc \
 	${bits_srcdir}/stringfwd.h \
 	${bits_srcdir}/string_view.tcc \
+	${bits_srcdir}/this_thread_sleep.h \
 	${bits_srcdir}/uniform_int_dist.h \
 	${bits_srcdir}/unique_lock.h \
 	${bits_srcdir}/unique_ptr.h \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 12c63400706..fcd2b5b2d40 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -575,6 +575,7 @@ bits_headers = \
 	${bits_srcdir}/streambuf.tcc \
 	${bits_srcdir}/stringfwd.h \
 	${bits_srcdir}/string_view.tcc \
+	${bits_srcdir}/this_thread_sleep.h \
 	${bits_srcdir}/uniform_int_dist.h \
 	${bits_srcdir}/unique_lock.h \
 	${bits_srcdir}/unique_ptr.h \
diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index b75f61138a7..029b8ad65a9 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -235,22 +235,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     wait(bool __old,
 	memory_order __m = memory_order_seq_cst) const noexcept
     {
-      std::__atomic_wait(&_M_i, static_cast<__atomic_flag_data_type>(__old),
-			 [__m, this, __old]()
-			 { return this->test(__m) != __old; });
+      const __atomic_flag_data_type __v
+	= __old ? __GCC_ATOMIC_TEST_AND_SET_TRUEVAL : 0;
+
+      std::__atomic_wait_address_v(&_M_i, __v,
+	  [__m, this] { return __atomic_load_n(&_M_i, int(__m)); });
     }
 
     // TODO add const volatile overload
 
     _GLIBCXX_ALWAYS_INLINE void
     notify_one() const noexcept
-    { std::__atomic_notify(&_M_i, false); }
+    { std::__atomic_notify_address(&_M_i, false); }
 
     // TODO add const volatile overload
 
     _GLIBCXX_ALWAYS_INLINE void
     notify_all() const noexcept
-    { std::__atomic_notify(&_M_i, true); }
+    { std::__atomic_notify_address(&_M_i, true); }
 
     // TODO add const volatile overload
 #endif // __cpp_lib_atomic_wait
@@ -609,22 +611,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       wait(__int_type __old,
 	  memory_order __m = memory_order_seq_cst) const noexcept
       {
-	std::__atomic_wait(&_M_i, __old,
-			   [__m, this, __old]
-			   { return this->load(__m) != __old; });
+	std::__atomic_wait_address_v(&_M_i, __old,
+			   [__m, this] { return this->load(__m); });
       }
 
       // TODO add const volatile overload
 
       _GLIBCXX_ALWAYS_INLINE void
       notify_one() const noexcept
-      { std::__atomic_notify(&_M_i, false); }
+      { std::__atomic_notify_address(&_M_i, false); }
 
       // TODO add const volatile overload
 
       _GLIBCXX_ALWAYS_INLINE void
       notify_all() const noexcept
-      { std::__atomic_notify(&_M_i, true); }
+      { std::__atomic_notify_address(&_M_i, true); }
 
       // TODO add const volatile overload
 #endif // __cpp_lib_atomic_wait
@@ -903,22 +904,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       wait(__pointer_type __old,
 	   memory_order __m = memory_order_seq_cst) noexcept
       {
-	std::__atomic_wait(&_M_p, __old,
-		      [__m, this, __old]()
-		      { return this->load(__m) != __old; });
+	std::__atomic_wait_address_v(&_M_p, __old,
+				     [__m, this]
+				     { return this->load(__m); });
       }
 
       // TODO add const volatile overload
 
       _GLIBCXX_ALWAYS_INLINE void
       notify_one() const noexcept
-      { std::__atomic_notify(&_M_p, false); }
+      { std::__atomic_notify_address(&_M_p, false); }
 
       // TODO add const volatile overload
 
       _GLIBCXX_ALWAYS_INLINE void
       notify_all() const noexcept
-      { std::__atomic_notify(&_M_p, true); }
+      { std::__atomic_notify_address(&_M_p, true); }
 
       // TODO add const volatile overload
 #endif // __cpp_lib_atomic_wait
@@ -1017,8 +1018,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       wait(const _Tp* __ptr, _Val<_Tp> __old,
 	   memory_order __m = memory_order_seq_cst) noexcept
       {
-	std::__atomic_wait(__ptr, __old,
-	    [=]() { return load(__ptr, __m) == __old; });
+	std::__atomic_wait_address_v(__ptr, __old,
+	    [__ptr, __m]() { return __atomic_impl::load(__ptr, __m); });
       }
 
       // TODO add const volatile overload
@@ -1026,14 +1027,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     template<typename _Tp>
       _GLIBCXX_ALWAYS_INLINE void
       notify_one(const _Tp* __ptr) noexcept
-      { std::__atomic_notify(__ptr, false); }
+      { std::__atomic_notify_address(__ptr, false); }
 
       // TODO add const volatile overload
 
     template<typename _Tp>
       _GLIBCXX_ALWAYS_INLINE void
       notify_all(const _Tp* __ptr) noexcept
-      { std::__atomic_notify(__ptr, true); }
+      { std::__atomic_notify_address(__ptr, true); }
 
       // TODO add const volatile overload
 #endif // __cpp_lib_atomic_wait
diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h b/libstdc++-v3/include/bits/atomic_timed_wait.h
index a0c5ef4374e..70e5335cfd7 100644
--- a/libstdc++-v3/include/bits/atomic_timed_wait.h
+++ b/libstdc++-v3/include/bits/atomic_timed_wait.h
@@ -36,6 +36,7 @@
 
 #if __cpp_lib_atomic_wait
 #include <bits/functional_hash.h>
+#include <bits/this_thread_sleep.h>
 
 #include <chrono>
 
@@ -48,19 +49,38 @@ namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
-  enum class __atomic_wait_status { no_timeout, timeout };
-
   namespace __detail
   {
+    using __wait_clock_t = chrono::steady_clock;
+
+    template<typename _Clock, typename _Dur>
+      __wait_clock_t::time_point
+      __to_wait_clock(const chrono::time_point<_Clock, _Dur>& __atime) noexcept
+      {
+	const typename _Clock::time_point __c_entry = _Clock::now();
+	const __wait_clock_t::time_point __w_entry = __wait_clock_t::now();
+	const auto __delta = __atime - __c_entry;
+	using __w_dur = typename __wait_clock_t::duration;
+	return __w_entry + chrono::ceil<__w_dur>(__delta);
+      }
+
+    template<typename _Dur>
+      __wait_clock_t::time_point
+      __to_wait_clock(const chrono::time_point<__wait_clock_t,
+					       _Dur>& __atime) noexcept
+      {
+	using __w_dur = typename __wait_clock_t::duration;
+	return chrono::ceil<__w_dur>(__atime);
+      }
+
 #ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-    using __platform_wait_clock_t = chrono::steady_clock;
-
-    template<typename _Duration>
-      __atomic_wait_status
-      __platform_wait_until_impl(__platform_wait_t* __addr,
-				 __platform_wait_t __val,
-				 const chrono::time_point<
-					  __platform_wait_clock_t, _Duration>&
+#define _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
+    // returns true if wait ended before timeout
+    template<typename _Dur>
+      bool
+      __platform_wait_until_impl(const __platform_wait_t* __addr,
+				 __platform_wait_t __old,
+				 const chrono::time_point<__wait_clock_t, _Dur>&
 				      __atime) noexcept
       {
 	auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
@@ -75,52 +95,55 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	auto __e = syscall (SYS_futex, __addr,
 			    static_cast<int>(__futex_wait_flags::
 						__wait_bitset_private),
-			    __val, &__rt, nullptr,
+			    __old, &__rt, nullptr,
 			    static_cast<int>(__futex_wait_flags::
 						__bitset_match_any));
-	if (__e && !(errno == EINTR || errno == EAGAIN || errno == ETIMEDOUT))
-	    std::terminate();
-	return (__platform_wait_clock_t::now() < __atime)
-	       ? __atomic_wait_status::no_timeout
-	       : __atomic_wait_status::timeout;
+
+	if (__e)
+	  {
+	    if ((errno != ETIMEDOUT) && (errno != EINTR)
+		&& (errno != EAGAIN))
+	      __throw_system_error(errno);
+	    return true;
+	  }
+	return false;
       }
 
-    template<typename _Clock, typename _Duration>
-      __atomic_wait_status
-      __platform_wait_until(__platform_wait_t* __addr, __platform_wait_t __val,
-			    const chrono::time_point<_Clock, _Duration>&
-				__atime)
+    // returns true if wait ended before timeout
+    template<typename _Clock, typename _Dur>
+      bool
+      __platform_wait_until(const __platform_wait_t* __addr, __platform_wait_t __old,
+			    const chrono::time_point<_Clock, _Dur>& __atime)
       {
-	if constexpr (is_same_v<__platform_wait_clock_t, _Clock>)
+	if constexpr (is_same_v<__wait_clock_t, _Clock>)
 	  {
-	    return __detail::__platform_wait_until_impl(__addr, __val, __atime);
+	    return __platform_wait_until_impl(__addr, __old, __atime);
 	  }
 	else
 	  {
-	    const typename _Clock::time_point __c_entry = _Clock::now();
-	    const __platform_wait_clock_t::time_point __s_entry =
-		    __platform_wait_clock_t::now();
-	    const auto __delta = __atime - __c_entry;
-	    const auto __s_atime = __s_entry + __delta;
-	    if (__detail::__platform_wait_until_impl(__addr, __val, __s_atime)
-		  == __atomic_wait_status::no_timeout)
-	      return __atomic_wait_status::no_timeout;
-
-	    // We got a timeout when measured against __clock_t but
-	    // we need to check against the caller-supplied clock
-	    // to tell whether we should return a timeout.
-	    if (_Clock::now() < __atime)
-	      return __atomic_wait_status::no_timeout;
-	    return __atomic_wait_status::timeout;
+	    if (!__platform_wait_until_impl(__addr, __old,
+					    __to_wait_clock(__atime)))
+	      {
+		// We got a timeout when measured against __clock_t but
+		// we need to check against the caller-supplied clock
+		// to tell whether we should return a timeout.
+		if (_Clock::now() < __atime)
+		  return true;
+	      }
+	    return false;
 	  }
       }
-#else // ! FUTEX
-
-#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
-    template<typename _Duration>
-      __atomic_wait_status
+#else
+// define _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT and implement __platform_wait_until()
+// if there is a more efficient primitive supported by the platform
+// (e.g. __ulock_wait())which is better than pthread_cond_clockwait
+#endif // ! PLATFORM_TIMED_WAIT
+
+    // returns true if wait ended before timeout
+    template<typename _Dur>
+      bool
       __cond_wait_until_impl(__condvar& __cv, mutex& __mx,
-	  const chrono::time_point<chrono::steady_clock, _Duration>& __atime)
+	  const chrono::time_point<chrono::steady_clock, _Dur>& __atime)
       {
 	auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
 	auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
@@ -131,45 +154,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	    static_cast<long>(__ns.count())
 	  };
 
+#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
 	__cv.wait_until(__mx, CLOCK_MONOTONIC, __ts);
-
-	return (chrono::steady_clock::now() < __atime)
-	       ? __atomic_wait_status::no_timeout
-	       : __atomic_wait_status::timeout;
-      }
-#endif
-
-    template<typename _Duration>
-      __atomic_wait_status
-      __cond_wait_until_impl(__condvar& __cv, mutex& __mx,
-	  const chrono::time_point<chrono::system_clock, _Duration>& __atime)
-      {
-	auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
-	auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
-
-	__gthread_time_t __ts =
-	{
-	  static_cast<std::time_t>(__s.time_since_epoch().count()),
-	  static_cast<long>(__ns.count())
-	};
-
+	return chrono::steady_clock::now() < __atime;
+#else
 	__cv.wait_until(__mx, __ts);
-
-	return (chrono::system_clock::now() < __atime)
-	       ? __atomic_wait_status::no_timeout
-	       : __atomic_wait_status::timeout;
+	return chrono::system_clock::now() < __atime;
+#endif // ! _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
       }
 
-    // return true if timeout
-    template<typename _Clock, typename _Duration>
-      __atomic_wait_status
+    // returns true if wait ended before timeout
+    template<typename _Clock, typename _Dur>
+      bool
       __cond_wait_until(__condvar& __cv, mutex& __mx,
-	  const chrono::time_point<_Clock, _Duration>& __atime)
+	  const chrono::time_point<_Clock, _Dur>& __atime)
       {
-#ifndef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
-	using __clock_t = chrono::system_clock;
-#else
-	using __clock_t = chrono::steady_clock;
+#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
 	if constexpr (is_same_v<_Clock, chrono::steady_clock>)
 	  return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
 	else
@@ -178,118 +178,265 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
 	else
 	  {
-	    const typename _Clock::time_point __c_entry = _Clock::now();
-	    const __clock_t::time_point __s_entry = __clock_t::now();
-	    const auto __delta = __atime - __c_entry;
-	    const auto __s_atime = __s_entry + __delta;
-	    if (__detail::__cond_wait_until_impl(__cv, __mx, __s_atime)
-		== __atomic_wait_status::no_timeout)
-	      return __atomic_wait_status::no_timeout;
-	    // We got a timeout when measured against __clock_t but
-	    // we need to check against the caller-supplied clock
-	    // to tell whether we should return a timeout.
-	    if (_Clock::now() < __atime)
-	      return __atomic_wait_status::no_timeout;
-	    return __atomic_wait_status::timeout;
+	    if (__cond_wait_until_impl(__cv, __mx,
+				       __to_wait_clock(__atime)))
+	      {
+		// We got a timeout when measured against __clock_t but
+		// we need to check against the caller-supplied clock
+		// to tell whether we should return a timeout.
+		if (_Clock::now() < __atime)
+		  return true;
+	      }
+	    return false;
 	  }
       }
-#endif // FUTEX
 
-    struct __timed_waiters : __waiters
+    struct __timed_waiter_pool : __waiter_pool_base
     {
-      template<typename _Clock, typename _Duration>
-	__atomic_wait_status
-	_M_do_wait_until(__platform_wait_t __version,
-			 const chrono::time_point<_Clock, _Duration>& __atime)
+      // returns true if wait ended before timeout
+      template<typename _Clock, typename _Dur>
+	bool
+	_M_do_wait_until(__platform_wait_t* __addr, __platform_wait_t __old,
+			 const chrono::time_point<_Clock, _Dur>& __atime)
 	{
-#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-	  return __detail::__platform_wait_until(&_M_ver, __version, __atime);
+#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
+	  return __platform_wait_until(__addr, __old, __atime);
 #else
-	  __platform_wait_t __cur = 0;
-	  __waiters::__lock_t __l(_M_mtx);
-	  while (__cur <= __version)
+	  __platform_wait_t __val;
+	  __atomic_load(__addr, &__val, __ATOMIC_RELAXED);
+	  if (__val == __old)
 	    {
-	      if (__detail::__cond_wait_until(_M_cv, _M_mtx, __atime)
-		    == __atomic_wait_status::timeout)
-		return __atomic_wait_status::timeout;
-
-	      __platform_wait_t __last = __cur;
-	      __atomic_load(&_M_ver, &__cur, __ATOMIC_ACQUIRE);
-	      if (__cur < __last)
-		break; // break the loop if version overflows
+	      lock_guard<mutex> __l(_M_mtx);
+	      return __cond_wait_until(_M_cv, _M_mtx, __atime);
 	    }
-	  return __atomic_wait_status::no_timeout;
-#endif
+#endif // _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
 	}
+    };
 
-      static __timed_waiters&
-      _S_timed_for(void* __t)
+    struct __timed_backoff_spin_policy
+    {
+      __wait_clock_t::time_point _M_deadline;
+      __wait_clock_t::time_point _M_t0;
+
+      template<typename _Clock, typename _Dur>
+	__timed_backoff_spin_policy(chrono::time_point<_Clock, _Dur>
+				      __deadline = _Clock::time_point::max(),
+				    chrono::time_point<_Clock, _Dur>
+				      __t0 = _Clock::now()) noexcept
+	  : _M_deadline(__to_wait_clock(__deadline))
+	  , _M_t0(__to_wait_clock(__t0))
+	{ }
+
+      bool
+      operator()() const noexcept
       {
-	static_assert(sizeof(__timed_waiters) == sizeof(__waiters));
-	return static_cast<__timed_waiters&>(__waiters::_S_for(__t));
+	using namespace literals::chrono_literals;
+	auto __now = __wait_clock_t::now();
+	if (_M_deadline <= __now)
+	  return false;
+
+	auto __elapsed = __now - _M_t0;
+	if (__elapsed > 128ms)
+	  {
+	    this_thread::sleep_for(64ms);
+	  }
+	else if (__elapsed > 64us)
+	  {
+	    this_thread::sleep_for(__elapsed / 2);
+	  }
+	else if (__elapsed > 4us)
+	  {
+	    __thread_yield();
+	  }
+	else
+	  return false;
+	return true;
       }
     };
+
+    template<typename _EntersWait>
+      struct __timed_waiter : __waiter_base<__timed_waiter_pool>
+      {
+	using __base_type = __waiter_base<__timed_waiter_pool>;
+
+	template<typename _Tp>
+	  __timed_waiter(const _Tp* __addr) noexcept
+	  : __base_type(__addr)
+	{
+	  if constexpr (_EntersWait::value)
+	    _M_w._M_enter_wait();
+	}
+
+	~__timed_waiter()
+	{
+	  if constexpr (_EntersWait::value)
+	    _M_w._M_leave_wait();
+	}
+
+	// returns true if wait ended before timeout
+	template<typename _Tp, typename _ValFn,
+		 typename _Clock, typename _Dur>
+	  bool
+	  _M_do_wait_until_v(_Tp __old, _ValFn __vfn,
+			     const chrono::time_point<_Clock, _Dur>&
+								__atime) noexcept
+	  {
+	    __platform_wait_t __val;
+	    if (_M_do_spin(__old, std::move(__vfn), __val,
+			   __timed_backoff_spin_policy(__atime)))
+	      return true;
+	    return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime);
+	  }
+
+	// returns true if wait ended before timeout
+	template<typename _Pred,
+		 typename _Clock, typename _Dur>
+	  bool
+	  _M_do_wait_until(_Pred __pred, __platform_wait_t __val,
+			  const chrono::time_point<_Clock, _Dur>&
+							      __atime) noexcept
+	  {
+	    for (auto __now = _Clock::now(); __now < __atime;
+		  __now = _Clock::now())
+	      {
+		if (__base_type::_M_w._M_do_wait_until(
+		      __base_type::_M_addr, __val, __atime)
+		    && __pred())
+		  return true;
+
+		if (__base_type::_M_do_spin(__pred, __val,
+			       __timed_backoff_spin_policy(__atime, __now)))
+		  return true;
+	      }
+	    return false;
+	  }
+
+	// returns true if wait ended before timeout
+	template<typename _Pred,
+		 typename _Clock, typename _Dur>
+	  bool
+	  _M_do_wait_until(_Pred __pred,
+			   const chrono::time_point<_Clock, _Dur>&
+								__atime) noexcept
+	  {
+	    __platform_wait_t __val;
+	    if (__base_type::_M_do_spin(__pred, __val,
+					__timed_backoff_spin_policy(__atime)))
+	      return true;
+	    return _M_do_wait_until(__pred, __val, __atime);
+	  }
+
+	template<typename _Tp, typename _ValFn,
+		 typename _Rep, typename _Period>
+	  bool
+	  _M_do_wait_for_v(_Tp __old, _ValFn __vfn,
+			   const chrono::duration<_Rep, _Period>&
+								__rtime) noexcept
+	  {
+	    __platform_wait_t __val;
+	    if (_M_do_spin_v(__old, std::move(__vfn), __val))
+	      return true;
+
+	    if (!__rtime.count())
+	      return false; // no rtime supplied, and spin did not acquire
+
+	    auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
+
+	    return __base_type::_M_w._M_do_wait_until(
+					  __base_type::_M_addr,
+					  __val,
+					  chrono::steady_clock::now() + __reltime);
+	  }
+
+	template<typename _Pred,
+		 typename _Rep, typename _Period>
+	  bool
+	  _M_do_wait_for(_Pred __pred,
+			 const chrono::duration<_Rep, _Period>& __rtime) noexcept
+	  {
+	    __platform_wait_t __val;
+	    if (__base_type::_M_do_spin(__pred, __val))
+	      return true;
+
+	    if (!__rtime.count())
+	      return false; // no rtime supplied, and spin did not acquire
+
+	    auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
+
+	    return _M_do_wait_until(__pred, __val,
+				    chrono::steady_clock::now() + __reltime);
+	  }
+      };
+
+    using __enters_timed_wait = __timed_waiter<std::true_type>;
+    using __bare_timed_wait = __timed_waiter<std::false_type>;
   } // namespace __detail
 
-  template<typename _Tp, typename _Pred,
-	   typename _Clock, typename _Duration>
+  // returns true if wait ended before timeout
+  template<typename _Tp, typename _ValFn,
+	   typename _Clock, typename _Dur>
     bool
-    __atomic_wait_until(const _Tp* __addr, _Tp __old, _Pred __pred,
-			const chrono::time_point<_Clock, _Duration>&
+    __atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
+			const chrono::time_point<_Clock, _Dur>&
 			    __atime) noexcept
     {
-      using namespace __detail;
+      __detail::__enters_timed_wait __w{__addr};
+      return __w._M_do_wait_until_v(__old, __vfn, __atime);
+    }
 
-      if (std::__atomic_spin(__pred))
-	return true;
+  template<typename _Tp, typename _Pred,
+	   typename _Clock, typename _Dur>
+    bool
+    __atomic_wait_address_until(const _Tp* __addr, _Pred __pred,
+				const chrono::time_point<_Clock, _Dur>&
+							      __atime) noexcept
+    {
+      __detail::__enters_timed_wait __w{__addr};
+      return __w._M_do_wait_until(__pred, __atime);
+    }
 
-      auto& __w = __timed_waiters::_S_timed_for((void*)__addr);
-      auto __version = __w._M_enter_wait();
-      do
-	{
-	  __atomic_wait_status __res;
-#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-	  if constexpr (__platform_wait_uses_type<_Tp>)
-	    {
-	      __res = __detail::__platform_wait_until((__platform_wait_t*)(void*) __addr,
-						      __old, __atime);
-	    }
-	  else
-#endif
-	    {
-	      __res = __w._M_do_wait_until(__version, __atime);
-	    }
-	  if (__res == __atomic_wait_status::timeout)
-	    return false;
-	}
-      while (!__pred() && __atime < _Clock::now());
-      __w._M_leave_wait();
+  template<typename _Pred,
+	   typename _Clock, typename _Dur>
+    bool
+    __atomic_wait_address_until_bare(const __detail::__platform_wait_t* __addr,
+				_Pred __pred,
+				const chrono::time_point<_Clock, _Dur>&
+							      __atime) noexcept
+    {
+      __detail::__bare_timed_wait __w{__addr};
+      return __w._M_do_wait_until(__pred, __atime);
+    }
 
-      // if timed out, return false
-      return (_Clock::now() < __atime);
+  template<typename _Tp, typename _ValFn,
+	   typename _Rep, typename _Period>
+    bool
+    __atomic_wait_address_for_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
+		      const chrono::duration<_Rep, _Period>& __rtime) noexcept
+    {
+      __detail::__enters_timed_wait __w{__addr};
+      return __w._M_do_wait_for_v(__old, __vfn, __rtime);
     }
 
   template<typename _Tp, typename _Pred,
 	   typename _Rep, typename _Period>
     bool
-    __atomic_wait_for(const _Tp* __addr, _Tp __old, _Pred __pred,
+    __atomic_wait_address_for(const _Tp* __addr, _Pred __pred,
 		      const chrono::duration<_Rep, _Period>& __rtime) noexcept
     {
-      using namespace __detail;
 
-      if (std::__atomic_spin(__pred))
-	return true;
-
-      if (!__rtime.count())
-	return false; // no rtime supplied, and spin did not acquire
-
-      using __dur = chrono::steady_clock::duration;
-      auto __reltime = chrono::duration_cast<__dur>(__rtime);
-      if (__reltime < __rtime)
-	++__reltime;
+      __detail::__enters_timed_wait __w{__addr};
+      return __w._M_do_wait_for(__pred, __rtime);
+    }
 
-      return __atomic_wait_until(__addr, __old, std::move(__pred),
-				 chrono::steady_clock::now() + __reltime);
+  template<typename _Pred,
+	   typename _Rep, typename _Period>
+    bool
+    __atomic_wait_address_for_bare(const __detail::__platform_wait_t* __addr,
+			_Pred __pred,
+			const chrono::duration<_Rep, _Period>& __rtime) noexcept
+    {
+      __detail::__bare_timed_wait __w{__addr};
+      return __w._M_do_wait_for(__pred, __rtime);
     }
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
index 424fccbe4c5..0ac5575190c 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -44,12 +44,10 @@
 # include <unistd.h>
 # include <syscall.h>
 # include <bits/functexcept.h>
-// TODO get this from Autoconf
-# define _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE 1
-#else
-# include <bits/std_mutex.h>  // std::mutex, std::__condvar
 #endif
 
+# include <bits/std_mutex.h>  // std::mutex, std::__condvar
+
 #define __cpp_lib_atomic_wait 201907L
 
 namespace std _GLIBCXX_VISIBILITY(default)
@@ -57,20 +55,30 @@ namespace std _GLIBCXX_VISIBILITY(default)
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
   namespace __detail
   {
+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
     using __platform_wait_t = int;
+    static constexpr size_t __platform_wait_alignment = 4;
+#else
+    using __platform_wait_t = uint64_t;
+    static constexpr size_t __platform_wait_alignment
+      = __alignof__(__platform_wait_t);
+#endif
+  } // namespace __detail
 
-    constexpr auto __atomic_spin_count_1 = 16;
-    constexpr auto __atomic_spin_count_2 = 12;
-
-    template<typename _Tp>
-      inline constexpr bool __platform_wait_uses_type
-#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-	= is_same_v<remove_cv_t<_Tp>, __platform_wait_t>;
+  template<typename _Tp>
+    inline constexpr bool __platform_wait_uses_type
+#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
+      = is_scalar_v<_Tp>
+	&& ((sizeof(_Tp) == sizeof(__detail::__platform_wait_t))
+	&& (alignof(_Tp*) >= __platform_wait_alignment));
 #else
-	= false;
+      = false;
 #endif
 
+  namespace __detail
+  {
 #ifdef _GLIBCXX_HAVE_LINUX_FUTEX
+#define _GLIBCXX_HAVE_PLATFORM_WAIT 1
     enum class __futex_wait_flags : int
     {
 #ifdef _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE
@@ -93,16 +101,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       void
       __platform_wait(const _Tp* __addr, __platform_wait_t __val) noexcept
       {
-	for(;;)
-	  {
-	    auto __e = syscall (SYS_futex, static_cast<const void*>(__addr),
-				  static_cast<int>(__futex_wait_flags::__wait_private),
-				    __val, nullptr);
-	    if (!__e || errno == EAGAIN)
-	      break;
-	    else if (errno != EINTR)
-	      __throw_system_error(__e);
-	  }
+	auto __e = syscall (SYS_futex, static_cast<const void*>(__addr),
+			    static_cast<int>(__futex_wait_flags::__wait_private),
+			    __val, nullptr);
+	if (!__e || errno == EAGAIN)
+	  return;
+	if (errno != EINTR)
+	  __throw_system_error(errno);
       }
 
     template<typename _Tp>
@@ -110,72 +115,124 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       __platform_notify(const _Tp* __addr, bool __all) noexcept
       {
 	syscall (SYS_futex, static_cast<const void*>(__addr),
-		  static_cast<int>(__futex_wait_flags::__wake_private),
-		    __all ? INT_MAX : 1);
+		 static_cast<int>(__futex_wait_flags::__wake_private),
+		 __all ? INT_MAX : 1);
       }
+#else
+// define _GLIBCXX_HAVE_PLATFORM_WAIT and implement __platform_wait()
+// and __platform_notify() if there is a more efficient primitive supported
+// by the platform (e.g. __ulock_wait()/__ulock_wake()) which is better than
+// a mutex/condvar based wait
 #endif
 
-    struct __waiters
+    inline void
+    __thread_yield() noexcept
     {
-      alignas(64) __platform_wait_t _M_ver = 0;
-      alignas(64) __platform_wait_t _M_wait = 0;
-
-#ifndef _GLIBCXX_HAVE_LINUX_FUTEX
-      using __lock_t = lock_guard<mutex>;
-      mutex _M_mtx;
-      __condvar _M_cv;
+#if defined _GLIBCXX_HAS_GTHREADS && defined _GLIBCXX_USE_SCHED_YIELD
+      __gthread_yield();
+#endif
+    }
 
-      __waiters() noexcept = default;
+    inline void
+    __thread_relax() noexcept
+    {
+#if defined __i386__ || defined __x86_64__
+      __builtin_ia32_pause();
+#else
+      __thread_yield();
 #endif
+    }
 
-      __platform_wait_t
-      _M_enter_wait() noexcept
+    constexpr auto __atomic_spin_count_1 = 12;
+    constexpr auto __atomic_spin_count_2 = 4;
+
+    struct __default_spin_policy
+    {
+      bool
+      operator()() const noexcept
+      { return false; }
+    };
+
+    template<typename _Pred,
+	     typename _Spin = __default_spin_policy>
+      bool
+      __atomic_spin(_Pred& __pred, _Spin __spin = _Spin{ }) noexcept
       {
-	__platform_wait_t __res;
-	__atomic_load(&_M_ver, &__res, __ATOMIC_ACQUIRE);
-	__atomic_fetch_add(&_M_wait, 1, __ATOMIC_ACQ_REL);
-	return __res;
+	for (auto __i = 0; __i < __atomic_spin_count_1; ++__i)
+	  {
+	    if (__pred())
+	      return true;
+	    __detail::__thread_relax();
+	  }
+
+	for (auto __i = 0; __i < __atomic_spin_count_2; ++__i)
+	  {
+	    if (__pred())
+	      return true;
+	    __detail::__thread_yield();
+	  }
+
+	while (__spin())
+	  {
+	    if (__pred())
+	      return true;
+	  }
+
+	return false;
       }
 
-      void
-      _M_leave_wait() noexcept
+    template<typename _Tp>
+      bool __atomic_compare(const _Tp& __a, const _Tp& __b)
       {
-	__atomic_fetch_sub(&_M_wait, 1, __ATOMIC_ACQ_REL);
+	// TODO: make this comparison ignore padding bits
+	return __builtin_memcmp(&__a, &__b, sizeof(_Tp)) != 0;
       }
 
-      void
-      _M_do_wait(__platform_wait_t __version) noexcept
-      {
-#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-	__platform_wait(&_M_ver, __version);
+    struct __waiter_pool_base
+    {
+#ifdef __cpp_lib_hardware_interference_size
+    static constexpr auto _S_align = hardware_destructive_interference_size;
 #else
-	__platform_wait_t __cur = 0;
-	while (__cur <= __version)
-	  {
-	    __waiters::__lock_t __l(_M_mtx);
-	    _M_cv.wait(_M_mtx);
-	    __platform_wait_t __last = __cur;
-	    __atomic_load(&_M_ver, &__cur, __ATOMIC_ACQUIRE);
-	    if (__cur < __last)
-	      break; // break the loop if version overflows
-	  }
+    static constexpr auto _S_align = 64;
 #endif
-      }
+
+      alignas(_S_align) __platform_wait_t _M_wait = 0;
+
+#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
+      mutex _M_mtx;
+#endif
+
+      alignas(_S_align) __platform_wait_t _M_ver = 0;
+
+#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
+      __condvar _M_cv;
+#endif
+      __waiter_pool_base() = default;
+
+      void
+      _M_enter_wait() noexcept
+      { __atomic_fetch_add(&_M_wait, 1, __ATOMIC_ACQ_REL); }
+
+      void
+      _M_leave_wait() noexcept
+      { __atomic_fetch_sub(&_M_wait, 1, __ATOMIC_ACQ_REL); }
 
       bool
       _M_waiting() const noexcept
       {
 	__platform_wait_t __res;
 	__atomic_load(&_M_wait, &__res, __ATOMIC_ACQUIRE);
-	return __res;
+	return __res > 0;
       }
 
       void
-      _M_notify(bool __all) noexcept
+      _M_notify(const __platform_wait_t* __addr, bool __all) noexcept
       {
-	__atomic_fetch_add(&_M_ver, 1, __ATOMIC_ACQ_REL);
-#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-	__platform_notify(&_M_ver, __all);
+	if (!_M_waiting())
+	  return;
+
+#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
+	__platform_notify(__addr, __all);
 #else
 	if (__all)
 	  _M_cv.notify_all();
@@ -184,114 +241,232 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
       }
 
-      static __waiters&
-      _S_for(const void* __t)
+      static __waiter_pool_base&
+      _S_for(const void* __addr) noexcept
       {
-	const unsigned char __mask = 0xf;
-	static __waiters __w[__mask + 1];
-
-	auto __key = _Hash_impl::hash(__t) & __mask;
+	constexpr uintptr_t __ct = 16;
+	static __waiter_pool_base __w[__ct];
+	auto __key = (uintptr_t(__addr) >> 2) % __ct;
 	return __w[__key];
       }
     };
 
-    struct __waiter
+    struct __waiter_pool : __waiter_pool_base
     {
-      __waiters& _M_w;
-      __platform_wait_t _M_version;
+      void
+      _M_do_wait(const __platform_wait_t* __addr, __platform_wait_t __old) noexcept
+      {
+#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
+	__platform_wait(__addr, __old);
+#else
+	__platform_wait_t __val;
+	__atomic_load(__addr, &__val, __ATOMIC_RELAXED);
+	if (__val == __old)
+	  {
+	    lock_guard<mutex> __l(_M_mtx);
+	    _M_cv.wait(_M_mtx);
+	  }
+#endif // _GLIBCXX_HAVE_PLATFORM_WAIT
+      }
+    };
 
-      template<typename _Tp>
-	__waiter(const _Tp* __addr) noexcept
-	  : _M_w(__waiters::_S_for(static_cast<const void*>(__addr)))
-	  , _M_version(_M_w._M_enter_wait())
-	{ }
+    template<typename _Tp>
+      struct __waiter_base
+      {
+	using __waiter_type = _Tp;
 
-      ~__waiter()
-      { _M_w._M_leave_wait(); }
+	__waiter_type& _M_w;
+	__platform_wait_t* _M_addr;
 
-      void _M_do_wait() noexcept
-      { _M_w._M_do_wait(_M_version); }
-    };
+	template<typename _Up>
+	  static __platform_wait_t*
+	  _S_wait_addr(const _Up* __a, __platform_wait_t* __b)
+	  {
+	    if constexpr (__platform_wait_uses_type<_Up>)
+	      return reinterpret_cast<__platform_wait_t*>(const_cast<_Up*>(__a));
+	    else
+	      return __b;
+	  }
 
-    inline void
-    __thread_yield() noexcept
-    {
-#if defined _GLIBCXX_HAS_GTHREADS && defined _GLIBCXX_USE_SCHED_YIELD
-      __gthread_yield();
-#endif
-    }
+	static __waiter_type&
+	_S_for(const void* __addr) noexcept
+	{
+	  static_assert(sizeof(__waiter_type) == sizeof(__waiter_pool_base));
+	  auto& __res = __waiter_pool_base::_S_for(__addr);
+	  return reinterpret_cast<__waiter_type&>(__res);
+	}
 
-    inline void
-    __thread_relax() noexcept
-    {
-#if defined __i386__ || defined __x86_64__
-      __builtin_ia32_pause();
-#else
-      __thread_yield();
-#endif
-    }
-  } // namespace __detail
+	template<typename _Up>
+	  explicit __waiter_base(const _Up* __addr) noexcept
+	    : _M_w(_S_for(__addr))
+	    , _M_addr(_S_wait_addr(__addr, &_M_w._M_ver))
+	  {
+	  }
 
-  template<typename _Pred>
-    bool
-    __atomic_spin(_Pred& __pred) noexcept
-    {
-      for (auto __i = 0; __i < __detail::__atomic_spin_count_1; ++__i)
+	void
+	_M_notify(bool __all)
 	{
-	  if (__pred())
-	    return true;
+	  if (_M_addr == &_M_w._M_ver)
+	    __atomic_fetch_add(_M_addr, 1, __ATOMIC_ACQ_REL);
+	  _M_w._M_notify(_M_addr, __all);
+	}
 
-	  if (__i < __detail::__atomic_spin_count_2)
-	    __detail::__thread_relax();
-	  else
-	    __detail::__thread_yield();
+	template<typename _Up, typename _ValFn,
+		 typename _Spin = __default_spin_policy>
+	  static bool
+	  _S_do_spin_v(__platform_wait_t* __addr,
+		       const _Up& __old, _ValFn __vfn,
+		       __platform_wait_t& __val,
+		       _Spin __spin = _Spin{ })
+	  {
+	    auto const __pred = [=]
+	      { return __detail::__atomic_compare(__old, __vfn()); };
+
+	    if constexpr (__platform_wait_uses_type<_Up>)
+	      {
+		__builtin_memcpy(&__val, &__old, sizeof(__val));
+	      }
+	    else
+	      {
+		__atomic_load(__addr, &__val, __ATOMIC_RELAXED);
+	      }
+	    return __atomic_spin(__pred, __spin);
+	  }
+
+	template<typename _Up, typename _ValFn,
+		 typename _Spin = __default_spin_policy>
+	  bool
+	  _M_do_spin_v(const _Up& __old, _ValFn __vfn,
+		       __platform_wait_t& __val,
+		       _Spin __spin = _Spin{ })
+	  { return _S_do_spin_v(_M_addr, __old, __vfn, __val, __spin); }
+
+	template<typename _Pred,
+		 typename _Spin = __default_spin_policy>
+	  static bool
+	  _S_do_spin(const __platform_wait_t* __addr,
+		     _Pred __pred,
+		     __platform_wait_t& __val,
+		     _Spin __spin = _Spin{ })
+	  {
+	    __atomic_load(__addr, &__val, __ATOMIC_RELAXED);
+	    return __atomic_spin(__pred, __spin);
+	  }
+
+	template<typename _Pred,
+		 typename _Spin = __default_spin_policy>
+	  bool
+	  _M_do_spin(_Pred __pred, __platform_wait_t& __val,
+		     _Spin __spin = _Spin{ })
+	  { return _S_do_spin(_M_addr, __pred, __val, __spin); }
+      };
+
+    template<typename _EntersWait>
+      struct __waiter : __waiter_base<__waiter_pool>
+      {
+	using __base_type = __waiter_base<__waiter_pool>;
+
+	template<typename _Tp>
+	  explicit __waiter(const _Tp* __addr) noexcept
+	    : __base_type(__addr)
+	  {
+	    if constexpr (_EntersWait::value)
+	      _M_w._M_enter_wait();
+	  }
+
+	~__waiter()
+	{
+	  if constexpr (_EntersWait::value)
+	    _M_w._M_leave_wait();
 	}
-      return false;
+
+	template<typename _Tp, typename _ValFn>
+	  void
+	  _M_do_wait_v(_Tp __old, _ValFn __vfn)
+	  {
+	    __platform_wait_t __val;
+	    if (__base_type::_M_do_spin_v(__old, __vfn, __val))
+	      return;
+	    __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
+	  }
+
+	template<typename _Pred>
+	  void
+	  _M_do_wait(_Pred __pred) noexcept
+	  {
+	    do
+	      {
+		__platform_wait_t __val;
+		if (__base_type::_M_do_spin(__pred, __val))
+		  return;
+		__base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
+	      }
+	    while (!__pred());
+	  }
+      };
+
+    using __enters_wait = __waiter<std::true_type>;
+    using __bare_wait = __waiter<std::false_type>;
+  } // namespace __detail
+
+  template<typename _Tp, typename _ValFn>
+    void
+    __atomic_wait_address_v(const _Tp* __addr, _Tp __old,
+			    _ValFn __vfn) noexcept
+    {
+      __detail::__enters_wait __w(__addr);
+      __w._M_do_wait_v(__old, __vfn);
     }
 
   template<typename _Tp, typename _Pred>
     void
-    __atomic_wait(const _Tp* __addr, _Tp __old, _Pred __pred) noexcept
+    __atomic_wait_address(const _Tp* __addr, _Pred __pred) noexcept
     {
-      using namespace __detail;
-      if (std::__atomic_spin(__pred))
-	return;
+      __detail::__enters_wait __w(__addr);
+      __w._M_do_wait(__pred);
+    }
 
-      __waiter __w(__addr);
-      while (!__pred())
+  // This call is to be used by atomic types which track contention externally
+  template<typename _Pred>
+    void
+    __atomic_wait_address_bare(const __detail::__platform_wait_t* __addr,
+			       _Pred __pred) noexcept
+    {
+#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
+      do
 	{
-	  if constexpr (__platform_wait_uses_type<_Tp>)
-	    {
-	      __platform_wait(__addr, __old);
-	    }
-	  else
-	    {
-	      // TODO support timed backoff when this can be moved into the lib
-	      __w._M_do_wait();
-	    }
+	  __detail::__platform_wait_t __val;
+	  if (__detail::__bare_wait::_S_do_spin(__addr, __pred, __val))
+	    return;
+	  __detail::__platform_wait(__addr, __val);
 	}
+      while (!__pred());
+#else // !_GLIBCXX_HAVE_PLATFORM_WAIT
+      __detail::__bare_wait __w(__addr);
+      __w._M_do_wait(__pred);
+#endif
     }
 
   template<typename _Tp>
     void
-    __atomic_notify(const _Tp* __addr, bool __all) noexcept
+    __atomic_notify_address(const _Tp* __addr, bool __all) noexcept
     {
-      using namespace __detail;
-      auto& __w = __waiters::_S_for((void*)__addr);
-      if (!__w._M_waiting())
-	return;
+      __detail::__bare_wait __w(__addr);
+      __w._M_notify(__all);
+    }
 
-#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-      if constexpr (__platform_wait_uses_type<_Tp>)
-	{
-	  __platform_notify((__platform_wait_t*)(void*) __addr, __all);
-	}
-      else
+  // This call is to be used by atomic types which track contention externally
+  inline void
+  __atomic_notify_address_bare(const __detail::__platform_wait_t* __addr,
+			       bool __all) noexcept
+  {
+#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
+    __detail::__platform_notify(__addr, __all);
+#else
+    __detail::__bare_wait __w(__addr);
+    __w._M_notify(__all);
 #endif
-	{
-	  __w._M_notify(__all);
-	}
-    }
+  }
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 #endif // GTHREADS || LINUX_FUTEX
diff --git a/libstdc++-v3/include/bits/semaphore_base.h b/libstdc++-v3/include/bits/semaphore_base.h
index b65717e64d7..7e3235d182e 100644
--- a/libstdc++-v3/include/bits/semaphore_base.h
+++ b/libstdc++-v3/include/bits/semaphore_base.h
@@ -35,8 +35,8 @@
 #include <bits/atomic_base.h>
 #if __cpp_lib_atomic_wait
 #include <bits/atomic_timed_wait.h>
-
 #include <ext/numeric_traits.h>
+#endif // __cpp_lib_atomic_wait
 
 #ifdef _GLIBCXX_HAVE_POSIX_SEMAPHORE
 # include <limits.h>
@@ -164,138 +164,101 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   };
 #endif // _GLIBCXX_HAVE_POSIX_SEMAPHORE
 
-  template<typename _Tp>
-    struct __atomic_semaphore
+#if __cpp_lib_atomic_wait
+  struct __atomic_semaphore
+  {
+    static constexpr ptrdiff_t _S_max = __gnu_cxx::__int_traits<int>::__max;
+    explicit __atomic_semaphore(__detail::__platform_wait_t __count) noexcept
+      : _M_counter(__count)
     {
-      static_assert(std::is_integral_v<_Tp>);
-      static_assert(__gnu_cxx::__int_traits<_Tp>::__max
-		      <= __gnu_cxx::__int_traits<ptrdiff_t>::__max);
-      static constexpr ptrdiff_t _S_max = __gnu_cxx::__int_traits<_Tp>::__max;
+      __glibcxx_assert(__count >= 0 && __count <= _S_max);
+    }
 
-      explicit __atomic_semaphore(_Tp __count) noexcept
-	: _M_counter(__count)
-      {
-	__glibcxx_assert(__count >= 0 && __count <= _S_max);
-      }
+    __atomic_semaphore(const __atomic_semaphore&) = delete;
+    __atomic_semaphore& operator=(const __atomic_semaphore&) = delete;
+
+    static _GLIBCXX_ALWAYS_INLINE bool
+    _S_do_try_acquire(__detail::__platform_wait_t* __counter,
+		      __detail::__platform_wait_t& __old) noexcept
+    {
+      if (__old == 0)
+	return false;
+
+      return __atomic_impl::compare_exchange_strong(__counter,
+						    __old, __old - 1,
+						    memory_order::acquire,
+						    memory_order::relaxed);
+    }
+
+    _GLIBCXX_ALWAYS_INLINE void
+    _M_acquire() noexcept
+    {
+      auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
+      auto const __pred =
+	[this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+      std::__atomic_wait_address_bare(&_M_counter, __pred);
+    }
 
-      __atomic_semaphore(const __atomic_semaphore&) = delete;
-      __atomic_semaphore& operator=(const __atomic_semaphore&) = delete;
+    bool
+    _M_try_acquire() noexcept
+    {
+      auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
+      auto const __pred =
+	[this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+      return std::__detail::__atomic_spin(__pred);
+    }
 
-      _GLIBCXX_ALWAYS_INLINE void
-      _M_acquire() noexcept
+    template<typename _Clock, typename _Duration>
+      _GLIBCXX_ALWAYS_INLINE bool
+      _M_try_acquire_until(const chrono::time_point<_Clock,
+			   _Duration>& __atime) noexcept
       {
-	auto const __pred = [this]
-	  {
-	    auto __old = __atomic_impl::load(&this->_M_counter,
-			    memory_order::acquire);
-	    if (__old == 0)
-	      return false;
-	    return __atomic_impl::compare_exchange_strong(&this->_M_counter,
-		      __old, __old - 1,
-		      memory_order::acquire,
-		      memory_order::release);
-	  };
 	auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
-	std::__atomic_wait(&_M_counter, __old, __pred);
-      }
+	auto const __pred =
+	  [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
 
-      bool
-      _M_try_acquire() noexcept
-      {
-	auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
-	auto const __pred = [this, __old]
-	  {
-	    if (__old == 0)
-	      return false;
-
-	    auto __prev = __old;
-	    return __atomic_impl::compare_exchange_weak(&this->_M_counter,
-		      __prev, __prev - 1,
-		      memory_order::acquire,
-		      memory_order::release);
-	  };
-	return std::__atomic_spin(__pred);
+	return __atomic_wait_address_until_bare(&_M_counter, __pred, __atime);
       }
 
-      template<typename _Clock, typename _Duration>
-	_GLIBCXX_ALWAYS_INLINE bool
-	_M_try_acquire_until(const chrono::time_point<_Clock,
-			     _Duration>& __atime) noexcept
-	{
-	  auto const __pred = [this]
-	    {
-	      auto __old = __atomic_impl::load(&this->_M_counter,
-			      memory_order::acquire);
-	      if (__old == 0)
-		return false;
-	      return __atomic_impl::compare_exchange_strong(&this->_M_counter,
-			      __old, __old - 1,
-			      memory_order::acquire,
-			      memory_order::release);
-	    };
-
-	  auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
-	  return __atomic_wait_until(&_M_counter, __old, __pred, __atime);
-	}
-
-      template<typename _Rep, typename _Period>
-	_GLIBCXX_ALWAYS_INLINE bool
-	_M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime)
-	  noexcept
-	{
-	  auto const __pred = [this]
-	    {
-	      auto __old = __atomic_impl::load(&this->_M_counter,
-			      memory_order::acquire);
-	      if (__old == 0)
-		return false;
-	      return  __atomic_impl::compare_exchange_strong(&this->_M_counter,
-			      __old, __old - 1,
-			      memory_order::acquire,
-			      memory_order::release);
-	    };
-
-	  auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
-	  return __atomic_wait_for(&_M_counter, __old, __pred, __rtime);
-	}
-
-      _GLIBCXX_ALWAYS_INLINE void
-      _M_release(ptrdiff_t __update) noexcept
+    template<typename _Rep, typename _Period>
+      _GLIBCXX_ALWAYS_INLINE bool
+      _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime)
+	noexcept
       {
-	if (0 < __atomic_impl::fetch_add(&_M_counter, __update, memory_order_release))
-	  return;
-	if (__update > 1)
-	  __atomic_impl::notify_all(&_M_counter);
-	else
-	  __atomic_impl::notify_one(&_M_counter);
+	auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
+	auto const __pred =
+	  [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+
+	return __atomic_wait_address_for_bare(&_M_counter, __pred, __rtime);
       }
 
-    private:
-      alignas(__alignof__(_Tp)) _Tp _M_counter;
-    };
+    _GLIBCXX_ALWAYS_INLINE void
+    _M_release(ptrdiff_t __update) noexcept
+    {
+      if (0 < __atomic_impl::fetch_add(&_M_counter, __update, memory_order_release))
+	return;
+      if (__update > 1)
+	__atomic_notify_address_bare(&_M_counter, true);
+      else
+	__atomic_notify_address_bare(&_M_counter, false);
+    }
+
+  private:
+    alignas(__detail::__platform_wait_alignment)
+    __detail::__platform_wait_t _M_counter;
+  };
+#endif // __cpp_lib_atomic_wait
 
 // Note: the _GLIBCXX_REQUIRE_POSIX_SEMAPHORE macro can be used to force the
 // use of Posix semaphores (sem_t). Doing so however, alters the ABI.
-#if defined _GLIBCXX_HAVE_LINUX_FUTEX && !_GLIBCXX_REQUIRE_POSIX_SEMAPHORE
-  // Use futex if available and didn't force use of POSIX
-  using __fast_semaphore = __atomic_semaphore<__detail::__platform_wait_t>;
+#if defined __cpp_lib_atomic_wait && !_GLIBCXX_REQUIRE_POSIX_SEMAPHORE
+  using __semaphore_impl = __atomic_semaphore;
 #elif _GLIBCXX_HAVE_POSIX_SEMAPHORE
-  using __fast_semaphore = __platform_semaphore;
+  using __semaphore_impl = __platform_semaphore;
 #else
-  using __fast_semaphore = __atomic_semaphore<ptrdiff_t>;
+#  error "No suitable semaphore implementation available"
 #endif
 
-template<ptrdiff_t __least_max_value>
-  using __semaphore_impl = conditional_t<
-		(__least_max_value > 1),
-		conditional_t<
-		    (__least_max_value <= __fast_semaphore::_S_max),
-		    __fast_semaphore,
-		    __atomic_semaphore<ptrdiff_t>>,
-		__fast_semaphore>;
-
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
-
-#endif // __cpp_lib_atomic_wait
 #endif // _GLIBCXX_SEMAPHORE_BASE_H
diff --git a/libstdc++-v3/include/bits/this_thread_sleep.h b/libstdc++-v3/include/bits/this_thread_sleep.h
new file mode 100644
index 00000000000..a87da388ec5
--- /dev/null
+++ b/libstdc++-v3/include/bits/this_thread_sleep.h
@@ -0,0 +1,119 @@
+// std::this_thread::sleep_for/until declarations -*- C++ -*-
+
+// Copyright (C) 2008-2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file bits/this_thread_sleep.h
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{thread}
+ */
+
+#ifndef _GLIBCXX_THIS_THREAD_SLEEP_H
+#define _GLIBCXX_THIS_THREAD_SLEEP_H 1
+
+#pragma GCC system_header
+
+#if __cplusplus >= 201103L
+#include <bits/c++config.h>
+
+#include <chrono> // std::chrono::*
+
+#ifdef _GLIBCXX_USE_NANOSLEEP
+# include <cerrno>  // errno, EINTR
+# include <time.h>  // nanosleep
+#endif
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  /** @addtogroup threads
+   *  @{
+   */
+
+  /** @namespace std::this_thread
+   *  @brief ISO C++ 2011 namespace for interacting with the current thread
+   *
+   *  C++11 30.3.2 [thread.thread.this] Namespace this_thread.
+   */
+  namespace this_thread
+  {
+#ifndef _GLIBCXX_NO_SLEEP
+
+#ifndef _GLIBCXX_USE_NANOSLEEP
+    void
+    __sleep_for(chrono::seconds, chrono::nanoseconds);
+#endif
+
+    /// this_thread::sleep_for
+    template<typename _Rep, typename _Period>
+      inline void
+      sleep_for(const chrono::duration<_Rep, _Period>& __rtime)
+      {
+	if (__rtime <= __rtime.zero())
+	  return;
+	auto __s = chrono::duration_cast<chrono::seconds>(__rtime);
+	auto __ns = chrono::duration_cast<chrono::nanoseconds>(__rtime - __s);
+#ifdef _GLIBCXX_USE_NANOSLEEP
+	struct ::timespec __ts =
+	  {
+	    static_cast<std::time_t>(__s.count()),
+	    static_cast<long>(__ns.count())
+	  };
+	while (::nanosleep(&__ts, &__ts) == -1 && errno == EINTR)
+	  { }
+#else
+	__sleep_for(__s, __ns);
+#endif
+      }
+
+    /// this_thread::sleep_until
+    template<typename _Clock, typename _Duration>
+      inline void
+      sleep_until(const chrono::time_point<_Clock, _Duration>& __atime)
+      {
+#if __cplusplus > 201703L
+	static_assert(chrono::is_clock_v<_Clock>);
+#endif
+	auto __now = _Clock::now();
+	if (_Clock::is_steady)
+	  {
+	    if (__now < __atime)
+	      sleep_for(__atime - __now);
+	    return;
+	  }
+	while (__now < __atime)
+	  {
+	    sleep_for(__atime - __now);
+	    __now = _Clock::now();
+	  }
+      }
+  } // namespace this_thread
+#endif // ! NO_SLEEP
+
+  /// @}
+
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace
+#endif // C++11
+
+#endif // _GLIBCXX_THIS_THREAD_SLEEP_H
diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index a77edcb3bff..9b1fb15ac41 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -384,26 +384,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     void
     wait(_Tp __old, memory_order __m = memory_order_seq_cst) const noexcept
     {
-      std::__atomic_wait(&_M_i, __old,
-			 [__m, this, __old]
-			 {
-			   const auto __v = this->load(__m);
-			   // TODO make this ignore padding bits when we
-			   // can do that
-			   return __builtin_memcmp(&__old, &__v,
-						    sizeof(_Tp)) != 0;
-			 });
+      std::__atomic_wait_address_v(&_M_i, __old,
+			 [__m, this] { return this->load(__m); });
     }
 
     // TODO add const volatile overload
 
     void
     notify_one() const noexcept
-    { std::__atomic_notify(&_M_i, false); }
+    { std::__atomic_notify_address(&_M_i, false); }
 
     void
     notify_all() const noexcept
-    { std::__atomic_notify(&_M_i, true); }
+    { std::__atomic_notify_address(&_M_i, true); }
 #endif // __cpp_lib_atomic_wait 
 
     };
diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index 6f2b9873500..fd61fb4f9da 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -94,7 +94,7 @@ It looks different from literature pseudocode for two main reasons:
       alignas(__phase_alignment) __barrier_phase_t  _M_phase;
 
       bool
-      _M_arrive(__barrier_phase_t __old_phase)
+      _M_arrive(__barrier_phase_t __old_phase, size_t __current)
       {
 	const auto __old_phase_val = static_cast<unsigned char>(__old_phase);
 	const auto __half_step =
@@ -104,8 +104,7 @@ It looks different from literature pseudocode for two main reasons:
 
 	size_t __current_expected = _M_expected;
 	std::hash<std::thread::id> __hasher;
-	size_t __current = __hasher(std::this_thread::get_id())
-					  % ((_M_expected + 1) >> 1);
+	__current %= ((_M_expected + 1) >> 1);
 
 	for (int __round = 0; ; ++__round)
 	  {
@@ -163,12 +162,14 @@ It looks different from literature pseudocode for two main reasons:
       [[nodiscard]] arrival_token
       arrive(ptrdiff_t __update)
       {
+	std::hash<std::thread::id> __hasher;
+	size_t __current = __hasher(std::this_thread::get_id());
 	__atomic_phase_ref_t __phase(_M_phase);
 	const auto __old_phase = __phase.load(memory_order_relaxed);
 	const auto __cur = static_cast<unsigned char>(__old_phase);
 	for(; __update; --__update)
 	  {
-	    if(_M_arrive(__old_phase))
+	    if(_M_arrive(__old_phase, __current))
 	      {
 		_M_completion();
 		_M_expected += _M_expected_adjustment.load(memory_order_relaxed);
@@ -185,11 +186,11 @@ It looks different from literature pseudocode for two main reasons:
       wait(arrival_token&& __old_phase) const
       {
 	__atomic_phase_const_ref_t __phase(_M_phase);
-	auto const __test_fn = [=, this]
+	auto const __test_fn = [=]
 	  {
 	    return __phase.load(memory_order_acquire) != __old_phase;
 	  };
-	std::__atomic_wait(&_M_phase, __old_phase, __test_fn);
+	std::__atomic_wait_address(&_M_phase, __test_fn);
       }
 
       void
diff --git a/libstdc++-v3/include/std/latch b/libstdc++-v3/include/std/latch
index ef8c301e5e9..20b75f8181a 100644
--- a/libstdc++-v3/include/std/latch
+++ b/libstdc++-v3/include/std/latch
@@ -48,7 +48,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   public:
     static constexpr ptrdiff_t
     max() noexcept
-    { return __gnu_cxx::__int_traits<ptrdiff_t>::__max; }
+    { return __gnu_cxx::__int_traits<__detail::__platform_wait_t>::__max; }
 
     constexpr explicit latch(ptrdiff_t __expected) noexcept
       : _M_a(__expected) { }
@@ -73,8 +73,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     _GLIBCXX_ALWAYS_INLINE void
     wait() const noexcept
     {
-      auto const __old = __atomic_impl::load(&_M_a, memory_order::acquire);
-      std::__atomic_wait(&_M_a, __old, [this] { return this->try_wait(); });
+      auto const __pred = [this] { return this->try_wait(); };
+      std::__atomic_wait_address(&_M_a, __pred);
     }
 
     _GLIBCXX_ALWAYS_INLINE void
@@ -85,7 +85,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     }
 
   private:
-    alignas(__alignof__(ptrdiff_t)) ptrdiff_t _M_a;
+    alignas(__alignof__(__detail::__platform_wait_t)) __detail::__platform_wait_t _M_a;
   };
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
diff --git a/libstdc++-v3/include/std/semaphore b/libstdc++-v3/include/std/semaphore
index 40af41b44d9..02a8214e569 100644
--- a/libstdc++-v3/include/std/semaphore
+++ b/libstdc++-v3/include/std/semaphore
@@ -33,8 +33,6 @@
 
 #if __cplusplus > 201703L
 #include <bits/semaphore_base.h>
-#if __cpp_lib_atomic_wait
-#include <ext/numeric_traits.h>
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -42,13 +40,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #define __cpp_lib_semaphore 201907L
 
-  template<ptrdiff_t __least_max_value =
-			__gnu_cxx::__int_traits<ptrdiff_t>::__max>
+  template<ptrdiff_t __least_max_value = __semaphore_impl::_S_max>
     class counting_semaphore
     {
       static_assert(__least_max_value >= 0);
+      static_assert(__least_max_value <= __semaphore_impl::_S_max);
 
-      __semaphore_impl<__least_max_value> _M_sem;
+      __semaphore_impl _M_sem;
 
     public:
       explicit counting_semaphore(ptrdiff_t __desired) noexcept
@@ -91,6 +89,5 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
-#endif // __cpp_lib_atomic_wait
 #endif // C++20
 #endif // _GLIBCXX_SEMAPHORE
diff --git a/libstdc++-v3/include/std/thread b/libstdc++-v3/include/std/thread
index 66738e1f68e..886994c1320 100644
--- a/libstdc++-v3/include/std/thread
+++ b/libstdc++-v3/include/std/thread
@@ -35,19 +35,13 @@
 # include <bits/c++0x_warning.h>
 #else
 
-#include <chrono> // std::chrono::*
-
 #if __cplusplus > 201703L
 # include <compare>	// std::strong_ordering
 # include <stop_token>	// std::stop_source, std::stop_token, std::nostopstate
 #endif
 
 #include <bits/std_thread.h> // std::thread, get_id, yield
-
-#ifdef _GLIBCXX_USE_NANOSLEEP
-# include <cerrno>  // errno, EINTR
-# include <time.h>  // nanosleep
-#endif
+#include <bits/this_thread_sleep.h> // std::this_thread::sleep_for, sleep_until
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -103,66 +97,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	return __out << __id._M_thread;
     }
 
-  /** @namespace std::this_thread
-   *  @brief ISO C++ 2011 namespace for interacting with the current thread
-   *
-   *  C++11 30.3.2 [thread.thread.this] Namespace this_thread.
-   */
-  namespace this_thread
-  {
-#ifndef _GLIBCXX_NO_SLEEP
-
-#ifndef _GLIBCXX_USE_NANOSLEEP
-    void
-    __sleep_for(chrono::seconds, chrono::nanoseconds);
-#endif
-
-    /// this_thread::sleep_for
-    template<typename _Rep, typename _Period>
-      inline void
-      sleep_for(const chrono::duration<_Rep, _Period>& __rtime)
-      {
-	if (__rtime <= __rtime.zero())
-	  return;
-	auto __s = chrono::duration_cast<chrono::seconds>(__rtime);
-	auto __ns = chrono::duration_cast<chrono::nanoseconds>(__rtime - __s);
-#ifdef _GLIBCXX_USE_NANOSLEEP
-	struct ::timespec __ts =
-	  {
-	    static_cast<std::time_t>(__s.count()),
-	    static_cast<long>(__ns.count())
-	  };
-	while (::nanosleep(&__ts, &__ts) == -1 && errno == EINTR)
-	  { }
-#else
-	__sleep_for(__s, __ns);
-#endif
-      }
-
-    /// this_thread::sleep_until
-    template<typename _Clock, typename _Duration>
-      inline void
-      sleep_until(const chrono::time_point<_Clock, _Duration>& __atime)
-      {
-#if __cplusplus > 201703L
-	static_assert(chrono::is_clock_v<_Clock>);
-#endif
-	auto __now = _Clock::now();
-	if (_Clock::is_steady)
-	  {
-	    if (__now < __atime)
-	      sleep_for(__atime - __now);
-	    return;
-	  }
-	while (__now < __atime)
-	  {
-	    sleep_for(__atime - __now);
-	    __now = _Clock::now();
-	  }
-      }
-  } // namespace this_thread
-#endif // ! NO_SLEEP
-
 #ifdef __cpp_lib_jthread
 
   /// A thread that can be requested to stop and automatically joined.
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/bool.cc b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/bool.cc
index b26ffb5749c..da25cc75c23 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/bool.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/bool.cc
@@ -23,42 +23,21 @@
 
 #include <atomic>
 #include <thread>
-#include <mutex>
-#include <condition_variable>
-#include <type_traits>
-#include <chrono>
 
 #include <testsuite_hooks.h>
 
 int
 main ()
 {
-  using namespace std::literals::chrono_literals;
-
-  std::mutex m;
-  std::condition_variable cv;
-  std::unique_lock<std::mutex> l(m);
-
-  std::atomic<bool> a(false);
-  std::atomic<bool> b(false);
+  std::atomic<bool> a{ true };
+  VERIFY( a.load() );
+  a.wait(false);
   std::thread t([&]
-		{
-		  {
-		    // This ensures we block until cv.wait(l) starts.
-		    std::lock_guard<std::mutex> ll(m);
-		  }
-		  cv.notify_one();
-		  a.wait(false);
-		  if (a.load())
-		    {
-		      b.store(true);
-		    }
-		});
-  cv.wait(l);
-  std::this_thread::sleep_for(100ms);
-  a.store(true);
-  a.notify_one();
+    {
+      a.store(false);
+      a.notify_one();
+    });
+  a.wait(true);
   t.join();
-  VERIFY( b.load() );
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/generic.cc b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/generic.cc
index e67ab776e71..fb68b425368 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/generic.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/generic.cc
@@ -21,12 +21,27 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-#include "atomic/wait_notify_util.h"
+#include <atomic>
+#include <thread>
+
+#include <testsuite_hooks.h>
 
 int
 main ()
 {
   struct S{ int i; };
-  check<S> check_s{S{0},S{42}};
+  S aa{ 0 };
+  S bb{ 42 };
+
+  std::atomic<S> a{ aa };
+  VERIFY( a.load().i == aa.i );
+  a.wait(bb);
+  std::thread t([&]
+    {
+      a.store(bb);
+      a.notify_one();
+    });
+  a.wait(aa);
+  t.join();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/pointers.cc b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/pointers.cc
index 023354366b3..53080bbaef0 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/pointers.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/pointers.cc
@@ -23,42 +23,24 @@
 
 #include <atomic>
 #include <thread>
-#include <mutex>
-#include <condition_variable>
-#include <type_traits>
-#include <chrono>
 
 #include <testsuite_hooks.h>
 
 int
 main ()
 {
-  using namespace std::literals::chrono_literals;
-
-  std::mutex m;
-  std::condition_variable cv;
-  std::unique_lock<std::mutex> l(m);
-
   long aa;
   long bb;
-
-  std::atomic<long*> a(nullptr);
+  std::atomic<long*> a(&aa);
+  VERIFY( a.load() == &aa );
+  a.wait(&bb);
   std::thread t([&]
-		{
-		  {
-		    // This ensures we block until cv.wait(l) starts.
-		    std::lock_guard<std::mutex> ll(m);
-		  }
-		  cv.notify_one();
-		  a.wait(nullptr);
-		  if (a.load() == &aa)
-		    a.store(&bb);
-		});
-  cv.wait(l);
-  std::this_thread::sleep_for(100ms);
-  a.store(&aa);
-  a.notify_one();
+    {
+      a.store(&bb);
+      a.notify_one();
+    });
+  a.wait(&aa);
   t.join();
-  VERIFY( a.load() == &bb);
+
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc
index 241251fc72f..9872a56a20e 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc
@@ -22,10 +22,6 @@
 // <http://www.gnu.org/licenses/>.
 
 #include <atomic>
-#include <chrono>
-#include <condition_variable>
-#include <concepts>
-#include <mutex>
 #include <thread>
 
 #include <testsuite_hooks.h>
@@ -33,34 +29,15 @@
 int
 main()
 {
-  using namespace std::literals::chrono_literals;
-
-  std::mutex m;
-  std::condition_variable cv;
-  std::unique_lock<std::mutex> l(m);
-
   std::atomic_flag a;
-  std::atomic_flag b;
+  VERIFY( !a.test() );
+  a.wait(true);
   std::thread t([&]
-		{
-		  {
-		    // This ensures we block until cv.wait(l) starts.
-		    std::lock_guard<std::mutex> ll(m);
-		  }
-		  cv.notify_one();
-		  a.wait(false);
-		  b.test_and_set();
-		  b.notify_one();
-		});
-
-  cv.wait(l);
-  std::this_thread::sleep_for(100ms);
-  a.test_and_set();
-  a.notify_one();
-  b.wait(false);
+    {
+      a.test_and_set();
+      a.notify_one();
+    });
+  a.wait(false);
   t.join();
-
-  VERIFY( a.test() );
-  VERIFY( b.test() );
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/wait_notify.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/wait_notify.cc
index d8ec5fbe24e..01768da290b 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_float/wait_notify.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/wait_notify.cc
@@ -21,12 +21,32 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-#include "atomic/wait_notify_util.h"
+
+#include <atomic>
+#include <thread>
+
+#include <testsuite_hooks.h>
+
+template<typename Tp>
+  void
+  check()
+  {
+    std::atomic<Tp> a{ 1.0 };
+    VERIFY( a.load() != 0.0 );
+    a.wait( 0.0 );
+    std::thread t([&]
+      {
+        a.store(0.0);
+        a.notify_one();
+      });
+    a.wait(1.0);
+    t.join();
+  }
 
 int
 main ()
 {
-  check<float> f;
-  check<double> d;
+  check<float>();
+  check<double>();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_integral/wait_notify.cc b/libstdc++-v3/testsuite/29_atomics/atomic_integral/wait_notify.cc
index 19c1ec4bc12..d1bf0811602 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_integral/wait_notify.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_integral/wait_notify.cc
@@ -21,46 +21,57 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-#include "atomic/wait_notify_util.h"
 
-void
-test01()
-{
-  struct S{ int i; };
-  std::atomic<S> s;
+#include <atomic>
+#include <thread>
 
-  s.wait(S{42});
-}
+#include <testsuite_hooks.h>
+
+template<typename Tp>
+  void
+  check()
+  {
+    std::atomic<Tp> a{ Tp(1) };
+    VERIFY( a.load() == Tp(1) );
+    a.wait( Tp(0) );
+    std::thread t([&]
+      {
+        a.store(Tp(0));
+        a.notify_one();
+      });
+    a.wait(Tp(1));
+    t.join();
+  }
 
 int
 main ()
 {
   // check<bool> bb;
-  check<char> ch;
-  check<signed char> sch;
-  check<unsigned char> uch;
-  check<short> s;
-  check<unsigned short> us;
-  check<int> i;
-  check<unsigned int> ui;
-  check<long> l;
-  check<unsigned long> ul;
-  check<long long> ll;
-  check<unsigned long long> ull;
+  check<char>();
+  check<signed char>();
+  check<unsigned char>();
+  check<short>();
+  check<unsigned short>();
+  check<int>();
+  check<unsigned int>();
+  check<long>();
+  check<unsigned long>();
+  check<long long>();
+  check<unsigned long long>();
 
-  check<wchar_t> wch;
-  check<char8_t> ch8;
-  check<char16_t> ch16;
-  check<char32_t> ch32;
+  check<wchar_t>();
+  check<char8_t>();
+  check<char16_t>();
+  check<char32_t>();
 
-  check<int8_t> i8;
-  check<int16_t> i16;
-  check<int32_t> i32;
-  check<int64_t> i64;
+  check<int8_t>();
+  check<int16_t>();
+  check<int32_t>();
+  check<int64_t>();
 
-  check<uint8_t> u8;
-  check<uint16_t> u16;
-  check<uint32_t> u32;
-  check<uint64_t> u64;
+  check<uint8_t>();
+  check<uint16_t>();
+  check<uint32_t>();
+  check<uint64_t>();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc b/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc
index a6740857172..2fd31304222 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_ref/wait_notify.cc
@@ -23,73 +23,25 @@
 
 #include <atomic>
 #include <thread>
-#include <mutex>
-#include <condition_variable>
-#include <chrono>
-#include <type_traits>
 
 #include <testsuite_hooks.h>
 
-template<typename Tp>
-Tp check_wait_notify(Tp val1, Tp val2)
+int
+main ()
 {
-  using namespace std::literals::chrono_literals;
+  struct S{ int i; };
+  S aa{ 0 };
+  S bb{ 42 };
 
-  std::mutex m;
-  std::condition_variable cv;
-  std::unique_lock<std::mutex> l(m);
-
-  Tp aa = val1;
-  std::atomic_ref<Tp> a(aa);
+  std::atomic_ref<S> a{ aa };
+  VERIFY( a.load().i == aa.i );
+  a.wait(bb);
   std::thread t([&]
-		{
-		  {
-		    // This ensures we block until cv.wait(l) starts.
-		    std::lock_guard<std::mutex> ll(m);
-		  }
-		  cv.notify_one();
-		  a.wait(val1);
-		  if (a.load() != val2)
-		    a = val1;
-		});
-  cv.wait(l);
-  std::this_thread::sleep_for(100ms);
-  a.store(val2);
-  a.notify_one();
+    {
+      a.store(bb);
+      a.notify_one();
+    });
+  a.wait(aa);
   t.join();
-  return a.load();
-}
-
-template<typename Tp,
-	 bool = std::is_integral_v<Tp>
-	 || std::is_floating_point_v<Tp>>
-struct check;
-
-template<typename Tp>
-struct check<Tp, true>
-{
-  check()
-  {
-    Tp a = 0;
-    Tp b = 42;
-    VERIFY(check_wait_notify(a, b) == b);
-  }
-};
-
-template<typename Tp>
-struct check<Tp, false>
-{
-  check(Tp b)
-  {
-    Tp a;
-    VERIFY(check_wait_notify(a, b) == b);
-  }
-};
-
-int
-main ()
-{
-  check<long>();
-  check<double>();
   return 0;
 }

