From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 8 Jul 2020 17:43:37 +0100
From: Jonathan Wakely <jwakely@redhat.com>
To: Thomas Rodgers <rodgert@appliantology.com>
Cc: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org, trodgers@redhat.com
Subject: Re: [PATCH] Add C++2a synchronization support
Message-ID: <20200708164337.GO4137376@redhat.com>
References: <6F58268B-FA3E-48EC-8108-E16E4A8324ED@appliantology.com>
 <20200606002956.1512343-1-rodgert@appliantology.com>
MIME-Version: 1.0
In-Reply-To: <20200606002956.1512343-1-rodgert@appliantology.com>

On 05/06/20 17:29 -0700, Thomas Rodgers wrote:
>Add support for -
>        atomic wait/notify_one/notify_all
>        counting_semaphore
>        binary_semaphore
>        latch
>
>        * include/Makefile.am (bits_headers): Add new header.
>	* include/Makefile.in: Regenerate.
>	* include/bits/atomic_base.h (__atomic_base<_Itp>::wait): Define.
>	(__atomic_base<_Itp>::notify_one): Likewise.
>	(__atomic_base<_Itp>::notify_all): Likewise.
>	(__atomic_base<_Ptp*>::wait): Likewise.
>	(__atomic_base<_Ptp*>::notify_one): Likewise.
>	(__atomic_base<_Ptp*>::notify_all): Likewise.
>	(__atomic_impl::wait): Likewise.
>	(__atomic_impl::notify_one): Likewise.
>	(__atomic_impl::notify_all): Likewise.
>	(__atomic_float<_Fp>::wait): Likewise.
>	(__atomic_float<_Fp>::notify_one): Likewise.
>	(__atomic_float<_Fp>::notify_all): Likewise.
>	(__atomic_ref<_Tp>::wait): Likewise.
>	(__atomic_ref<_Tp>::notify_one): Likewise.
>	(__atomic_ref<_Tp>::notify_all): Likewise.
>	(atomic_wait<_Tp>): Likewise.
>	(atomic_wait_explicit<_Tp>): Likewise.
>	(atomic_notify_one<_Tp>): Likewise.
>	(atomic_notify_all<_Tp>): Likewise.
>	* include/bits/atomic_wait.h: New file.
>        * include/bits/atomic_timed_wait.h: New file.
>        * include/bits/semaphore_base.h: New file.
>	* include/std/atomic (atomic<bool>::wait): Define.
>	(atomic<bool>::wait_one): Likewise.
>	(atomic<bool>::wait_all): Likewise.
>	(atomic<_Tp>::wait): Likewise.
>	(atomic<_Tp>::wait_one): Likewise.
>	(atomic<_Tp>::wait_all): Likewise.
>	(atomic<_Tp*>::wait): Likewise.
>	(atomic<_Tp*>::wait_one): Likewise.
>	(atomic<_Tp*>::wait_all): Likewise.
>        * include/std/latch: New file.
>        * include/std/semaphore: New file.
>        * include/std/version: Add __cpp_lib_semaphore and
>        __cpp_lib_latch defines.
>	* testsuite/29_atomic/atomic/wait_notify/atomic_refs.cc: New test.
>	* testsuite/29_atomic/atomic/wait_notify/bool.cc: Likewise.
>	* testsuite/29_atomic/atomic/wait_notify/integrals.cc: Likewise.
>	* testsuite/29_atomic/atomic/wait_notify/floats.cc: Likewise.
>	* testsuite/29_atomic/atomic/wait_notify/pointers.cc: Likewise.
>	* testsuite/29_atomic/atomic/wait_notify/generic.h: New File.
>        * testsuite/30_thread/semaphore/1.cc: New test.
>        * testsuite/30_thread/semaphore/2.cc: Likewise.
>        * testsuite/30_thread/semaphore/least_max_value_neg.cc: Likewise.
>        * testsuite/30_thread/semaphore/try_acquire.cc: Likewise.
>        * testsuite/30_thread/semaphore/try_acquire_for.cc: Likewise.
>        * testsuite/30_thread/semaphore/try_acquire_futex.cc: Likewise.
>        * testsuite/30_thread/semaphore/try_acquire_posix.cc: Likewise.
>        * testsuite/30_thread/semaphore/try_acquire_until.cc: Likewise.
>        * testsuite/30_thread/latch/1.cc: New test.
>        * testsuite/30_thread/latch/2.cc: New test.
>        * testsuite/30_thread/latch/3.cc: New test.


>diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
>index 80aeb3f8959..b3ac1a3365f 100644
>--- a/libstdc++-v3/include/Makefile.am
>+++ b/libstdc++-v3/include/Makefile.am
>@@ -52,6 +52,7 @@ std_headers = \
> 	${std_srcdir}/iostream \
> 	${std_srcdir}/istream \
> 	${std_srcdir}/iterator \
>+	${std_srcdir}/latch\

Missing space before the backslash here.

> 	${std_srcdir}/limits \
> 	${std_srcdir}/list \
> 	${std_srcdir}/locale \

>--- a/libstdc++-v3/include/bits/atomic_base.h
>+++ b/libstdc++-v3/include/bits/atomic_base.h
>@@ -823,6 +851,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 					   int(__m1), int(__m2));
>       }
>
>+#if __cplusplus > 201703L
>+      _GLIBCXX_ALWAYS_INLINE void
>+      wait(__pointer_type __old, memory_order __m = memory_order_seq_cst) noexcept

This line should be < 80 cols.

>+      {
>@@ -911,6 +963,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 					 int(__success), int(__failure));
>       }
>
>+#if __cplusplus > 201703L
>+    template<typename _Tp>
>+      _GLIBCXX_ALWAYS_INLINE void
>+      wait(const _Tp* __ptr, _Val<_Tp> __old, memory_order __m = memory_order_seq_cst) noexcept

And this one.


>@@ -1164,6 +1242,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 				       __cmpexch_failure_order(__order));
>       }
>
>+      _GLIBCXX_ALWAYS_INLINE void
>+      wait(_Fp __old, memory_order __m = memory_order_seq_cst) const noexcept
>+      { __atomic_impl::wait(&_M_fp, __old, __m); }
>+
>+      // TODO add const volatile overload
>+
>+      _GLIBCXX_ALWAYS_INLINE void
>+      notify_one() const noexcept
>+      { __atomic_impl::notify_one(&_M_fp); }
>+
>+      // TODO add const volatile overload
>+
>+      _GLIBCXX_ALWAYS_INLINE void
>+      notify_all() const noexcept
>+      { __atomic_impl::notify_all(&_M_fp); }
>+
>+      // TODO add const volatile overload

Please add a newline after this comment.

>       value_type
>       fetch_add(value_type __i,
> 		memory_order __m = memory_order_seq_cst) noexcept
>@@ -1301,6 +1396,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 				       __cmpexch_failure_order(__order));
>       }
>
>+      _GLIBCXX_ALWAYS_INLINE void
>+      wait(_Tp __old, memory_order __m = memory_order_seq_cst) const noexcept
>+      { __atomic_impl::wait(_M_ptr, __old, __m); }
>+
>+      // TODO add const volatile overload
>+
>+      _GLIBCXX_ALWAYS_INLINE void
>+      notify_one() const noexcept
>+      { __atomic_impl::notify_one(_M_ptr); }
>+
>+      // TODO add const volatile overload
>+
>+      _GLIBCXX_ALWAYS_INLINE void
>+      notify_all() const noexcept
>+      { __atomic_impl::notify_all(_M_ptr); }
>+

The TODO comment seems to be missing here, and after some notify_all
cases below. Please either add one after every non-volatile function
that's missing a volatile overload, or just put one "TODO volatile
overloads of wait and notify_{one,all}" comment in each class.

>diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h b/libstdc++-v3/include/bits/atomic_timed_wait.h
>new file mode 100644
>index 00000000000..adef80aca61
>--- /dev/null
>+++ b/libstdc++-v3/include/bits/atomic_timed_wait.h
>@@ -0,0 +1,282 @@
>+// -*- C++ -*- header.
>+
>+// Copyright (C) 2020 Free Software Foundation, Inc.
>+//
>+// This file is part of the GNU ISO C++ Library.  This library is free
>+// software; you can redistribute it and/or modify it under the
>+// terms of the GNU General Public License as published by the
>+// Free Software Foundation; either version 3, or (at your option)
>+// any later version.
>+
>+// This library is distributed in the hope that it will be useful,
>+// but WITHOUT ANY WARRANTY; without even the implied warranty of
>+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+// GNU General Public License for more details.
>+
>+// Under Section 7 of GPL version 3, you are granted additional
>+// permissions described in the GCC Runtime Library Exception, version
>+// 3.1, as published by the Free Software Foundation.
>+
>+// You should have received a copy of the GNU General Public License and
>+// a copy of the GCC Runtime Library Exception along with this program;
>+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>+// <http://www.gnu.org/licenses/>.
>+
>+/** @file bits/atomic_timed_wait.h
>+ *  This is an internal header file, included by other library headers.
>+ *  Do not attempt to use it directly. @headername{atomic}
>+ */
>+
>+#ifndef _GLIBCXX_ATOMIC_TIMED_WAIT_H
>+#define _GLIBCXX_ATOMIC_TIMED_WAIT_H 1
>+
>+#pragma GCC system_header
>+
>+#include <bits/c++config.h>
>+#include <bits/functional_hash.h>
>+#include <bits/atomic_wait.h>
>+
>+#include <chrono>
>+
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+#include <sys/time.h>
>+#endif
>+
>+namespace std _GLIBCXX_VISIBILITY(default)
>+{
>+  _GLIBCXX_BEGIN_NAMESPACE_VERSION

Please don't indent these BEGIN/END macros.

>+
>+  enum class __atomic_wait_status { __no_timeout, __timeout };

It seems a shame to have yet another status enum when we already have
cv_status and future_status which define enumerators with the same
names, but I suppose it's a bit confusing to reuse them. Maybe you
could just do:

using __atomic_wait_status = cv_status;

and then refer to __atomic_wait_status::no_timeout and
__atomic_wait_status::timeout, what do you think?

Even if it's a new distinct enum type, you could use no_timeout and
timeout as the enumerator names, because those are reserved names
anyway.
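
To make the alias idea concrete, here is a minimal sketch. It assumes
hypothetical, non-reserved names (wait_status, status_for) in place of
the patch's identifiers, and is only meant to show that the existing
cv_status enumerators can be reached through an alias:

```cpp
#include <condition_variable>  // std::cv_status

// Hypothetical stand-in for the patch's __atomic_wait_status.
using wait_status = std::cv_status;

// Hypothetical helper: maps "has the deadline passed?" to a status value.
inline wait_status
status_for(bool deadline_passed) noexcept
{
  return deadline_passed ? wait_status::timeout : wait_status::no_timeout;
}
```

Because the alias names the same type, values compare equal to the
std::cv_status enumerators directly, with no conversion in either direction.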

>+  namespace __detail
>+  {
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+    enum

I think unnamed enum types in headers cause problems for modules.

>+    {
>+      __futex_wait_bitset_private = __futex_wait_bitset | __futex_private_flag,
>+      __futex_wake_bitset_private = __futex_wake_bitset | __futex_private_flag,
>+      __futex_bitset_match_any = 0xffffffff
>+    };
>+
>+    using __platform_wait_clock_t = chrono::steady_clock;
>+
>+    template<typename _Duration>
>+      __atomic_wait_status
>+      __platform_wait_until_impl(__platform_wait_t* __addr, __platform_wait_t __val,
>+				 const chrono::time_point<__platform_wait_clock_t, _Duration>& __atime) noexcept
>+      {
>+	auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
>+	auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
>+
>+	struct timespec __rt =
>+	{
>+	  static_cast<std::time_t>(__s.time_since_epoch().count()),
>+	  static_cast<long>(__ns.count())
>+	};
>+
>+	auto __e = syscall (SYS_futex, __addr, __futex_wait_bitset_private, __val, &__rt,
>+			    nullptr, __futex_bitset_match_any);
>+	if (__e && !(errno == EINTR || errno == EAGAIN || errno == ETIMEDOUT))
>+	    std::terminate();
>+	return (__platform_wait_clock_t::now() < __atime)
>+	       ? __atomic_wait_status::__no_timeout : __atomic_wait_status::__timeout;
>+      }
>+
>+    template<typename _Clock, typename _Duration>
>+      __atomic_wait_status
>+      __platform_wait_until(__platform_wait_t* __addr, __platform_wait_t __val,
>+			    const chrono::time_point<_Clock, _Duration>& __atime)
>+      {
>+	if constexpr (std::is_same_v<__platform_wait_clock_t, _Clock>)
>+	  {
>+	    return __platform_wait_until_impl(__addr, __val, __atime);

Since this is calling a free function with arguments of
program-defined types (the clock and duration), it needs to be
qualified (whereas the is_same_v above doesn't need to be qualified,
although that's harmless).
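
A small sketch of why the qualification matters. The namespaces and
names here (lib, user, helper, clock_like) are hypothetical and stand in
for std::__detail and a program-defined clock; the point is only that an
unqualified call lets ADL pick up an overload from the user's namespace:

```cpp
namespace lib
{
  // Generic helper, standing in for __platform_wait_until_impl.
  template<typename T>
    int helper(const T&) { return 1; }

  // Unqualified call: ADL also searches the namespaces associated with T.
  template<typename T>
    int call_unqualified(const T& t) { return helper(t); }

  // Qualified call: only lib::helper is considered.
  template<typename T>
    int call_qualified(const T& t) { return lib::helper(t); }
}

namespace user
{
  struct clock_like { };

  // A program-defined function that happens to share the helper's name.
  // As a non-template exact match it beats lib::helper in overload resolution.
  inline int helper(const clock_like&) { return 2; }
}
```

With a user::clock_like argument, call_unqualified ends up in
user::helper, while call_qualified always reaches lib::helper.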

>+	  }
>+	else
>+	  {
>+	    const typename _Clock::time_point __c_entry = _Clock::now();
>+	    const __platform_wait_clock_t::time_point __s_entry =
>+		    __platform_wait_clock_t::now();
>+	    const auto __delta = __atime - __c_entry;
>+	    const auto __s_atime = __s_entry + __delta;
>+	    if (__platform_wait_until_impl(__addr, __val, __s_atime) == __atomic_wait_status::__no_timeout)
>+	      return __atomic_wait_status::__no_timeout;
>+
>+	    // We got a timeout when measured against __clock_t but
>+	    // we need to check against the caller-supplied clock
>+	    // to tell whether we should return a timeout.
>+	    if (_Clock::now() < __atime)
>+	      return __atomic_wait_status::__no_timeout;
>+	    return __atomic_wait_status::__timeout;
>+	  }
>+      }
>+#endif
>+
>+#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
>+    template<typename _Duration>
>+      __atomic_wait_status
>+      __cond_wait_until_impl(__gthread_cond_t* __cv,
>+	  unique_lock<mutex>& __lock,
>+	  const chrono::time_point<chrono::steady_clock, _Duration>& __atime)
>+      {
>+	auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
>+	auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
>+
>+	__gthread_time_t __ts =
>+	  {
>+	    static_cast<std::time_t>(__s.time_since_epoch().count()),
>+	    static_cast<long>(__ns.count())
>+	  };
>+
>+	pthread_cond_clockwait(__cv, __lock.mutex()->native_handle(),
>+			       CLOCK_MONOTONIC,
>+			       &__ts);
>+	return (chrono::steady_clock::now() < __atime)
>+	       ? __atomic_wait_status::__no_timeout : __atomic_wait_status::__timeout;
>+      }
>+#endif
>+
>+      template<typename _Duration>
>+	__atomic_wait_status
>+	__cond_wait_until_impl(__gthread_cond_t* __cv,
>+	    unique_lock<std::mutex>& __lock,
>+	    const chrono::time_point<chrono::system_clock, _Duration>& __atime)
>+	{
>+	  auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
>+	  auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
>+
>+	  __gthread_time_t __ts =
>+	  {
>+	    static_cast<std::time_t>(__s.time_since_epoch().count()),
>+	    static_cast<long>(__ns.count())
>+	  };
>+
>+	  __gthread_cond_timedwait(__cv, __lock.mutex()->native_handle(),
>+				   &__ts);
>+	  return (chrono::system_clock::now() < __atime)
>+		 ? __atomic_wait_status::__no_timeout
>+		 : __atomic_wait_status::__timeout;
>+	}
>+
>+      // return true if timeout
>+      template<typename _Clock, typename _Duration>
>+	__atomic_wait_status
>+	__cond_wait_until(__gthread_cond_t* __cv,
>+	    unique_lock<std::mutex>& __lock,
>+	    const chrono::time_point<_Clock, _Duration>& __atime)
>+	{
>+#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
>+	  using __clock_t = chrono::steady_clock;
>+#else
>+	  using __clock_t = chrono::system_clock;
>+#endif
>+	  const typename _Clock::time_point __c_entry = _Clock::now();
>+	  const __clock_t::time_point __s_entry = __clock_t::now();
>+	  const auto __delta = __atime - __c_entry;
>+	  const auto __s_atime = __s_entry + __delta;
>+	  if (__cond_wait_until_impl(__cv, __lock, __s_atime))

__cond_wait_until_impl should be qualified.

>+	    return __atomic_wait_status::__no_timeout;
>+	  // We got a timeout when measured against __clock_t but
>+	  // we need to check against the caller-supplied clock
>+	  // to tell whether we should return a timeout.
>+	  if (_Clock::now() < __atime)
>+	    return __atomic_wait_status::__no_timeout;
>+	  return __atomic_wait_status::__timeout;
>+	}
>+
>+    struct __timed_waiters : __waiters
>+    {
>+      template<typename _Clock, typename _Duration>
>+	__atomic_wait_status
>+	_M_do_wait_until(__platform_wait_t __version,
>+			 const chrono::time_point<_Clock, _Duration>& __atime)
>+	{
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+	  return __platform_wait_until(&_M_ver, __version, __atime);
>+#else
>+	  __platform_wait_t __cur = 0;
>+	  __waiters::__lock_t __l(_M_mtx);
>+	  while (__cur <= __version)
>+	    {
>+	      if (__cond_wait_until(&_M_cv, __l, __atime) == __atomic_wait_status::__timeout)

Qualify this call too.

>+		return __atomic_wait_status::__timeout;
>+
>+	      __platform_wait_t __last = __cur;
>+	      __atomic_load(&_M_ver, &__cur, __ATOMIC_ACQUIRE);
>+	      if (__cur < __last)
>+		break; // break the loop if version overflows
>+	    }
>+	  return __atomic_wait_status::__no_timeout;
>+#endif
>+	}
>+
>+      static __timed_waiters&
>+      _S_timed_for(void* __t)
>+      {
>+	static_assert(sizeof(__timed_waiters) == sizeof(__waiters));
>+	return (__timed_waiters&) __waiters::_S_for(__t);

I'd be more comfortable with a static_cast here.
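
Roughly what I have in mind, as a sketch with hypothetical types
(waiters_base, timed_waiters) rather than the patch's classes. It
assumes the referenced object really is of the derived type, which is
what makes the downcast well-defined in this example:

```cpp
struct waiters_base
{
  int version = 0;
};

struct timed_waiters : waiters_base
{
  int bump() noexcept { return ++version; }
};

inline timed_waiters&
as_timed(waiters_base& b) noexcept
{
  // static_cast only compiles because timed_waiters derives from
  // waiters_base.  For unrelated types it is rejected at compile time,
  // whereas a C-style cast would silently act as a reinterpret_cast.
  return static_cast<timed_waiters&>(b);
}
```

The benefit is purely diagnostic: if the class hierarchy changes later,
the static_cast stops compiling instead of quietly changing meaning.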

>+      }
>+    };
>+  } // namespace __detail
>+
>+  template<typename _Tp, typename _Pred,
>+	   typename _Clock, typename _Duration>
>+    bool
>+    __atomic_wait_until(const _Tp* __addr, _Tp __old, _Pred __pred,
>+			const chrono::time_point<_Clock, _Duration>& __atime) noexcept
>+    {
>+      using namespace __detail;
>+
>+      if (std::__atomic_spin(__pred))
>+	return true;
>+
>+      auto& __w = __timed_waiters::_S_timed_for((void*)__addr);
>+      auto __version = __w._M_enter_wait();
>+      do
>+	{
>+	  __atomic_wait_status __res;
>+	  if constexpr (__platform_wait_uses_type<_Tp>)
>+	    {
>+	      __res = __platform_wait_until((__platform_wait_t*)(void*) __addr,
>+					    __old,
>+					    __atime);
>+	    }
>+	  else
>+	    {
>+	      __res = __w._M_do_wait_until(__version, __atime);
>+	    }
>+	  if (__res == __atomic_wait_status::__timeout)
>+	    return false;
>+	}
>+      while (!__pred() && __atime < _Clock::now());
>+      __w._M_leave_wait();
>+
>+      // if timed out, return false
>+      return (_Clock::now() < __atime);
>+    }
>+
>+  template<typename _Tp, typename _Pred,
>+	   typename _Rep, typename _Period>
>+    bool
>+    __atomic_wait_for(const _Tp* __addr, _Tp __old, _Pred __pred,
>+		      const chrono::duration<_Rep, _Period>& __rtime) noexcept
>+    {
>+      using namespace __detail;
>+
>+      if (std::__atomic_spin(__pred))
>+	return true;
>+
>+      if (!__rtime.count())
>+	return false; // no rtime supplied, and spin did not acquire
>+
>+      using __dur = chrono::steady_clock::duration;
>+      auto __reltime = chrono::duration_cast<__dur>(__rtime);
>+      if (__reltime < __rtime)
>+	++__reltime;
>+
>+
>+      return __atomic_wait_until(__addr, __old, std::move(__pred),
>+				 chrono::steady_clock::now() + __reltime);
>+    }
>+_GLIBCXX_END_NAMESPACE_VERSION
>+} // namespace std
>+#endif
>diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
>new file mode 100644
>index 00000000000..92c1e2526ed
>--- /dev/null
>+++ b/libstdc++-v3/include/bits/atomic_wait.h
>@@ -0,0 +1,291 @@
>+// -*- C++ -*- header.
>+
>+// Copyright (C) 2020 Free Software Foundation, Inc.
>+//
>+// This file is part of the GNU ISO C++ Library.  This library is free
>+// software; you can redistribute it and/or modify it under the
>+// terms of the GNU General Public License as published by the
>+// Free Software Foundation; either version 3, or (at your option)
>+// any later version.
>+
>+// This library is distributed in the hope that it will be useful,
>+// but WITHOUT ANY WARRANTY; without even the implied warranty of
>+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+// GNU General Public License for more details.
>+
>+// Under Section 7 of GPL version 3, you are granted additional
>+// permissions described in the GCC Runtime Library Exception, version
>+// 3.1, as published by the Free Software Foundation.
>+
>+// You should have received a copy of the GNU General Public License and
>+// a copy of the GCC Runtime Library Exception along with this program;
>+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>+// <http://www.gnu.org/licenses/>.
>+
>+/** @file bits/atomic_wait.h
>+ *  This is an internal header file, included by other library headers.
>+ *  Do not attempt to use it directly. @headername{atomic}
>+ */
>+
>+#ifndef _GLIBCXX_ATOMIC_WAIT_H
>+#define _GLIBCXX_ATOMIC_WAIT_H 1
>+
>+#pragma GCC system_header
>+
>+#include <bits/c++config.h>
>+#include <bits/functional_hash.h>
>+#include <bits/gthr.h>
>+#include <bits/std_mutex.h>
>+#include <bits/unique_lock.h>
>+#include <ext/numeric_traits.h>
>+
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+#include <climits>
>+#include <unistd.h>
>+#include <syscall.h>
>+#endif
>+
>+#define _GLIBCXX_SPIN_COUNT_1 16
>+#define _GLIBCXX_SPIN_COUNT_2 12
>+

Yuck, do these have to be macros?
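
One possible alternative, sketched with hypothetical names
(spin_count_total, spin_count_relax, spin) and with
std::this_thread::yield standing in for __thread_yield. This is not the
patch's code, just an illustration of inline constexpr variables doing
the job of the two macros:

```cpp
#include <thread>

// Hypothetical replacements for _GLIBCXX_SPIN_COUNT_1 and _2.
inline constexpr int spin_count_total = 16;
inline constexpr int spin_count_relax = 12;

template<typename Pred>
  bool
  spin(Pred pred) noexcept
  {
    for (int i = 0; i < spin_count_total; ++i)
      {
        if (pred())
          return true;

        // The patch issues a CPU pause for the first iterations; that
        // part is omitted here and only the later yield is kept.
        if (i >= spin_count_relax)
          std::this_thread::yield();
      }
    return false;
  }
```

The constants are typed, scoped, and visible to the debugger, and they
need no #undef at the end of the header.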

>+// TODO get this from Autoconf
>+#define _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE 1
>+
>+namespace std _GLIBCXX_VISIBILITY(default)
>+{
>+_GLIBCXX_BEGIN_NAMESPACE_VERSION
>+  namespace __detail
>+  {
>+    using __platform_wait_t = int;
>+
>+    inline constexpr
>+    auto __platform_wait_max_value =
>+		__gnu_cxx::__numeric_traits<__platform_wait_t>::__max;

You can use the new __gnu_cxx::__int_traits alias here, since you
know __platform_wait_t is an integer, not a floating-point type. That
alias just saves instantiating __numeric_traits to figure out whether
to use __numeric_traits_integer or __numeric_traits_floating.

>+
>+    template<typename _Tp>
>+      inline constexpr bool __platform_wait_uses_type
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+	= is_same_v<remove_cv_t<_Tp>, __platform_wait_t>;
>+#else
>+	= false;
>+#endif
>+
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+    enum

See earlier comment about unnamed enum types.


>+    {
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE
>+      __futex_private_flag = 128,
>+#else
>+      __futex_private_flag = 0,
>+#endif
>+      __futex_wait = 0,
>+      __futex_wake = 1,
>+      __futex_wait_bitset = 9,
>+      __futex_wake_bitset = 10,
>+      __futex_wait_private = __futex_wait | __futex_private_flag,
>+      __futex_wake_private = __futex_wake | __futex_private_flag
>+    };
>+
>+    void
>+    __platform_wait(__platform_wait_t* __addr, __platform_wait_t __val) noexcept
>+    {
>+       auto __e = syscall (SYS_futex, __addr, __futex_wait_private, __val, nullptr);
>+       if (__e && !(errno == EINTR || errno == EAGAIN))
>+	 std::terminate();
>+    }
>+
>+    void
>+    __platform_notify(__platform_wait_t* __addr, bool __all) noexcept
>+    {
>+      syscall (SYS_futex, __addr, __futex_wake_private, __all ? INT_MAX : 1);
>+    }
>+#endif
>+
>+    struct __waiters
>+    {
>+      __platform_wait_t alignas(64) _M_ver = 0;
>+      __platform_wait_t alignas(64) _M_wait = 0;
>+
>+#ifndef _GLIBCXX_HAVE_LINUX_FUTEX
>+      using __lock_t = std::unique_lock<std::mutex>;
>+      mutable __lock_t::mutex_type _M_mtx;
>+
>+#  ifdef __GTHREAD_COND_INIT
>+      mutable __gthread_cond_t _M_cv = __GTHREAD_COND_INIT;

Not being able to use std::condition_variable here makes me sad.

>+      __waiters() noexcept = default;
>+#  else
>+      mutable __gthread_cond_t _M_cv;
>+      __waiters() noexcept
>+      {
>+	__GTHREAD_COND_INIT_FUNCTION(&_M_cond);
>+      }
>+#  endif
>+#endif
>+
>+      __platform_wait_t
>+      _M_enter_wait() noexcept
>+      {
>+	__platform_wait_t __res;
>+	__atomic_load(&_M_ver, &__res, __ATOMIC_ACQUIRE);
>+	__atomic_fetch_add(&_M_wait, 1, __ATOMIC_ACQ_REL);
>+	return __res;
>+      }
>+
>+      void
>+      _M_leave_wait() noexcept
>+      {
>+	__atomic_fetch_sub(&_M_wait, 1, __ATOMIC_ACQ_REL);
>+      }
>+
>+      void
>+      _M_do_wait(__platform_wait_t __version) noexcept
>+      {
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+	__platform_wait(&_M_ver, __version);
>+#else
>+	__platform_wait_t __cur = 0;
>+	while (__cur <= __version)
>+	  {
>+	    __waiters::__lock_t __l(_M_mtx);
>+	    auto __e = __gthread_cond_wait(&_M_cv, __l.mutex()->native_handle());
>+	    if (__e)
>+	      std::terminate();
>+	    __platform_wait_t __last = __cur;
>+	    __atomic_load(&_M_ver, &__cur, __ATOMIC_ACQUIRE);
>+	    if (__cur < __last)
>+	      break; // break the loop if version overflows
>+	  }
>+#endif
>+      }
>+
>+      __platform_wait_t
>+      _M_waiting() const noexcept
>+	{
>+	  __platform_wait_t __res;
>+	  __atomic_load(&_M_wait, &__res, __ATOMIC_ACQUIRE);
>+	  return __res;
>+	}
>+
>+      void
>+      _M_notify(bool __all) noexcept
>+      {
>+	__atomic_fetch_add(&_M_ver, 1, __ATOMIC_ACQ_REL);
>+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
>+	__platform_notify(&_M_ver, __all);
>+#else
>+	auto __e = __gthread_cond_broadcast(&_M_cv);
>+	if (__e)
>+	  __throw_system_error(__e);
>+#endif
>+      }
>+
>+      static __waiters&
>+      _S_for(void* __t)
>+      {
>+	const unsigned char __mask = 0xf;
>+	static __waiters __w[__mask + 1];
>+
>+	auto __key = _Hash_impl::hash(__t) & __mask;
>+	return __w[__key];
>+      }
>+    };
>+
>+    struct __waiter
>+    {
>+      __waiters& _M_w;
>+      __platform_wait_t _M_version;
>+
>+      template<typename _Tp>
>+	__waiter(const _Tp* __addr) noexcept
>+	  : _M_w(__waiters::_S_for((void*) __addr))
>+	  , _M_version(_M_w._M_enter_wait())
>+	{ }
>+
>+      ~__waiter()
>+      { _M_w._M_leave_wait(); }
>+
>+      void _M_do_wait() noexcept
>+      { _M_w._M_do_wait(_M_version); }
>+    };
>+
>+    void
>+    __thread_relax() noexcept
>+    {
>+#if defined __i386__ || defined __x86_64__
>+      __builtin_ia32_pause();
>+#elif defined _GLIBCXX_USE_SCHED_YIELD
>+      __gthread_yield();
>+#endif

Should this and the identical code in <stop_token> be consolidated
somewhere?

>+    }
>+
>+    void
>+    __thread_yield() noexcept
>+   {
>+#if defined _GLIBCXX_USE_SCHED_YIELD
>+     __gthread_yield();
>+#endif
>+    }
>+
>+  } // namespace __detail
>+
>+  template<typename _Pred>
>+    bool
>+    __atomic_spin(_Pred __pred) noexcept
>+    {
>+      for (auto __i = 0; __i < _GLIBCXX_SPIN_COUNT_1; ++__i)
>+	{
>+	  if (__pred())
>+	    return true;
>+
>+	  if (__i < _GLIBCXX_SPIN_COUNT_2)
>+	    __detail::__thread_relax();
>+	  else
>+	    __detail::__thread_yield();
>+	}
>+      return false;
>+    }
>+
>+  template<typename _Tp, typename _Pred>
>+    void
>+    __atomic_wait(const _Tp* __addr, _Tp __old, _Pred __pred) noexcept
>+    {
>+      using namespace __detail;
>+      if (__atomic_spin(__pred))
>+	return;
>+
>+      __waiter __w(__addr);
>+      while (!__pred())
>+	{
>+	  if constexpr (__platform_wait_uses_type<_Tp>)
>+	    {
>+	      __platform_wait((__platform_wait_t*)(void*) __addr, __old);

This has an implicit conversion from _Tp to __platform_wait_t, is that
safe? Can we make it explicit with a static_cast?
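
Something along these lines, as a sketch only. The names
(platform_wait_t, platform_wait_value, wait_value) are hypothetical, and
it assumes the branch is reached only when the type is the platform wait
type, as the if constexpr condition in the patch suggests:

```cpp
#include <type_traits>

using platform_wait_t = int;  // mirrors __platform_wait_t in the patch

// Hypothetical stand-in for __platform_wait; it just returns its argument.
inline platform_wait_t
platform_wait_value(platform_wait_t val) noexcept
{ return val; }

template<typename T>
  platform_wait_t
  wait_value(T old) noexcept
  {
    if constexpr (std::is_same_v<std::remove_cv_t<T>, platform_wait_t>)
      // Explicit conversion; lossless here because T is platform_wait_t.
      return platform_wait_value(static_cast<platform_wait_t>(old));
    else
      return 0;  // the patch would take the generic waiter path instead
  }
```

Under that assumption the cast is an identity conversion, so it costs
nothing and documents that no narrowing is intended.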

>+	    }
>+	  else
>+	    {
>+	      // TODO support timed backoff when this can be moved into the lib
>+	      __w._M_do_wait();
>+	    }
>+	}
>+    }
>+
>+  template<typename _Tp>
>+    void
>+    __atomic_notify(const _Tp* __addr, bool __all) noexcept
>+    {
>+      using namespace __detail;
>+      auto& __w = __waiters::_S_for((void*)__addr);
>+      if (!__w._M_waiting())
>+	return;
>+
>+      if constexpr (__platform_wait_uses_type<_Tp>)
>+	{
>+	  __platform_notify((__platform_wait_t*)(void*) __addr, __all);
>+	}
>+      else
>+	{
>+	  __w._M_notify(__all);
>+	}
>+    }
>+_GLIBCXX_END_NAMESPACE_VERSION
>+} // namespace std
>+#endif
>diff --git a/libstdc++-v3/include/bits/semaphore_base.h b/libstdc++-v3/include/bits/semaphore_base.h
>new file mode 100644
>index 00000000000..f0c4235d91c
>--- /dev/null
>+++ b/libstdc++-v3/include/bits/semaphore_base.h

I'll continue reviewing from here ASAP ...