[PATCH 0/3] Uncontroversial improvements to C++20 wait-related implementation

public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH 0/3] Uncontroversial improvements to C++20 wait-related implementation
@ 2021-03-05 18:21 Thiago Macieira
  2021-03-05 18:21 ` [PATCH 1/3] Atomic __platform_wait: accept any 32-bit type, not just int Thiago Macieira
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Thiago Macieira @ 2021-03-05 18:21 UTC (permalink / raw)
  To: libstdc++; +Cc: gcc-patches

Discussion at:
https://gcc.gnu.org/pipermail/libstdc++/2021-February/052043.html

This patch set includes the uncontroversial parts that improve
performance but don't otherwise change ABI.

Please note we still need to decide on how to deal with the future ABI
break.

Thiago Macieira (3):
  Atomic __platform_wait: accept any 32-bit type, not just int
  std::latch: reduce internal implementation from ptrdiff_t to int
  barrier: optimise by not having the hasher in a loop

 libstdc++-v3/include/bits/atomic_wait.h |  7 ++++---
 libstdc++-v3/include/std/barrier        | 10 +++++-----
 libstdc++-v3/include/std/latch          |  4 ++--
 3 files changed, 11 insertions(+), 10 deletions(-)

-- 
2.30.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] Atomic __platform_wait: accept any 32-bit type, not just int
  2021-03-05 18:21 [PATCH 0/3] Uncontroversial improvements to C++20 wait-related implementation Thiago Macieira
@ 2021-03-05 18:21 ` Thiago Macieira
  2021-03-05 18:21 ` [PATCH 2/3] std::latch: reduce internal implementation from ptrdiff_t to int Thiago Macieira
  2021-03-05 18:21 ` [PATCH 3/3] barrier: optimise by not having the hasher in a loop Thiago Macieira
  2 siblings, 0 replies; 4+ messages in thread
From: Thiago Macieira @ 2021-03-05 18:21 UTC (permalink / raw)
  To: libstdc++; +Cc: gcc-patches

The kernel doesn't care what we store in those 32 bits, only that they are
comparable. This commit adds:
 * pointers and long on 32-bit architectures
 * unsigned
 * untyped enums and typed enums on int & unsigned int
 * float

We're not using FUTEX_OP anywhere today. The kernel reserves 4 bits for
this field but has only used 6 values so far, so it can be extended to
unsigned compares in the future, if needed.

libstdc++-v3/ChangeLog:

        * include/bits/atomic_wait.h: Update __platform_wait_uses_type
---
 libstdc++-v3/include/bits/atomic_wait.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
index 424fccbe4c5..4b4573df691 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -65,7 +65,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     template<typename _Tp>
       inline constexpr bool __platform_wait_uses_type
 #ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-	= is_same_v<remove_cv_t<_Tp>, __platform_wait_t>;
+	= sizeof(_Tp) == sizeof(__platform_wait_t) &&
+	  alignof(_Tp) >= alignof(__platform_wait_t);
 #else
 	= false;
 #endif
@@ -91,13 +92,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
     template<typename _Tp>
       void
-      __platform_wait(const _Tp* __addr, __platform_wait_t __val) noexcept
+      __platform_wait(const _Tp* __addr, _Tp __val) noexcept
       {
 	for(;;)
 	  {
 	    auto __e = syscall (SYS_futex, static_cast<const void*>(__addr),
 				  static_cast<int>(__futex_wait_flags::__wait_private),
-				    __val, nullptr);
+				  static_cast<__platform_wait_t>(__val), nullptr);
 	    if (!__e || errno == EAGAIN)
 	      break;
 	    else if (errno != EINTR)
-- 
2.30.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 2/3] std::latch: reduce internal implementation from ptrdiff_t to int
  2021-03-05 18:21 [PATCH 0/3] Uncontroversial improvements to C++20 wait-related implementation Thiago Macieira
  2021-03-05 18:21 ` [PATCH 1/3] Atomic __platform_wait: accept any 32-bit type, not just int Thiago Macieira
@ 2021-03-05 18:21 ` Thiago Macieira
  2021-03-05 18:21 ` [PATCH 3/3] barrier: optimise by not having the hasher in a loop Thiago Macieira
  2 siblings, 0 replies; 4+ messages in thread
From: Thiago Macieira @ 2021-03-05 18:21 UTC (permalink / raw)
  To: libstdc++; +Cc: gcc-patches

ints can be used as futex on Linux. ptrdiff_t on 64-bit Linux can't.

libstdc++-v3/ChangeLog:

        * include/std/latch: Use int instead of ptrdiff_t
---
 libstdc++-v3/include/std/latch | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/latch b/libstdc++-v3/include/std/latch
index ef8c301e5e9..f3f73360618 100644
--- a/libstdc++-v3/include/std/latch
+++ b/libstdc++-v3/include/std/latch
@@ -48,7 +48,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   public:
     static constexpr ptrdiff_t
     max() noexcept
-    { return __gnu_cxx::__int_traits<ptrdiff_t>::__max; }
+    { return __gnu_cxx::__int_traits<int>::__max; }
 
     constexpr explicit latch(ptrdiff_t __expected) noexcept
       : _M_a(__expected) { }
@@ -85,7 +85,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     }
 
   private:
-    alignas(__alignof__(ptrdiff_t)) ptrdiff_t _M_a;
+    alignas(4) int _M_a;	// kernel requires 4-byte alignment
   };
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
-- 
2.30.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 3/3] barrier: optimise by not having the hasher in a loop
  2021-03-05 18:21 [PATCH 0/3] Uncontroversial improvements to C++20 wait-related implementation Thiago Macieira
  2021-03-05 18:21 ` [PATCH 1/3] Atomic __platform_wait: accept any 32-bit type, not just int Thiago Macieira
  2021-03-05 18:21 ` [PATCH 2/3] std::latch: reduce internal implementation from ptrdiff_t to int Thiago Macieira
@ 2021-03-05 18:21 ` Thiago Macieira
  2 siblings, 0 replies; 4+ messages in thread
From: Thiago Macieira @ 2021-03-05 18:21 UTC (permalink / raw)
  To: libstdc++; +Cc: gcc-patches

Our thread's ID does not change so we don't have to get it every time
and hash it every time.

libstdc++-v3/ChangeLog:

        * include/std/barrier(arrive): move hasher one level up in the
          stack.
---
 libstdc++-v3/include/std/barrier | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index e09212dfcb9..4bb5642c164 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -94,7 +94,7 @@ It looks different from literature pseudocode for two main reasons:
       alignas(__phase_alignment) __barrier_phase_t  _M_phase;
 
       bool
-      _M_arrive(__barrier_phase_t __old_phase)
+      _M_arrive(__barrier_phase_t __old_phase, size_t __current)
       {
 	const auto __old_phase_val = static_cast<unsigned char>(__old_phase);
 	const auto __half_step =
@@ -103,9 +103,7 @@ It looks different from literature pseudocode for two main reasons:
 			   static_cast<__barrier_phase_t>(__old_phase_val + 2);
 
 	size_t __current_expected = _M_expected;
-	std::hash<std::thread::id> __hasher;
-	size_t __current = __hasher(std::this_thread::get_id())
-					  % ((_M_expected + 1) >> 1);
+        __current %= ((__current_expected + 1) >> 1);
 
 	for (int __round = 0; ; ++__round)
 	  {
@@ -163,12 +161,14 @@ It looks different from literature pseudocode for two main reasons:
       [[nodiscard]] arrival_token
       arrive(ptrdiff_t __update)
       {
+	std::hash<std::thread::id> __hasher;
+	size_t __current = __hasher(std::this_thread::get_id());
 	__atomic_phase_ref_t __phase(_M_phase);
 	const auto __old_phase = __phase.load(memory_order_relaxed);
 	const auto __cur = static_cast<unsigned char>(__old_phase);
 	for(; __update; --__update)
 	  {
-	    if(_M_arrive(__old_phase))
+	    if(_M_arrive(__old_phase, __current))
 	      {
 		_M_completion();
 		_M_expected += _M_expected_adjustment.load(memory_order_relaxed);
-- 
2.30.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-03-05 18:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-05 18:21 [PATCH 0/3] Uncontroversial improvements to C++20 wait-related implementation Thiago Macieira
2021-03-05 18:21 ` [PATCH 1/3] Atomic __platform_wait: accept any 32-bit type, not just int Thiago Macieira
2021-03-05 18:21 ` [PATCH 2/3] std::latch: reduce internal implementation from ptrdiff_t to int Thiago Macieira
2021-03-05 18:21 ` [PATCH 3/3] barrier: optimise by not having the hasher in a loop Thiago Macieira

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).