public inbox for libstdc++@gcc.gnu.org
* [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
@ 2020-09-26 19:42 Jonathan Wakely
  2020-09-27 19:15 ` Florian Weimer
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jonathan Wakely @ 2020-09-26 19:42 UTC
  To: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1825 bytes --]

Glibc 2.32 adds a global variable that says whether the process is
single-threaded. We can use this to decide whether to elide atomic
operations, as a more precise and reliable indicator than
__gthread_active_p.

This means that guard variables for statics and reference counting in
shared_ptr can use less expensive, non-atomic ops even in processes that
are linked to libpthread, as long as no threads have been created yet.
It also means that we switch to using atomics if libpthread gets loaded
later via dlopen (this still isn't supported in general, for other
reasons).
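
The dispatch idea described above can be sketched roughly as follows. This is
only an illustration, not the libstdc++ code: `process_is_single_threaded` is a
hypothetical stand-in for the glibc flag, and `exchange_and_add_dispatch` is a
hypothetical helper written against std::atomic.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for glibc's __libc_single_threaded flag. A real
// program would read the variable declared in <sys/single_threaded.h>.
static bool process_is_single_threaded = true;

// Hypothetical reference-count helper (not the libstdc++ implementation).
// Returns the value the counter held before the addition.
inline int exchange_and_add_dispatch(std::atomic<int>& counter, int delta)
{
  if (process_is_single_threaded)
    {
      // No other thread can observe the counter, so a relaxed load and
      // store pair is enough; no atomic read-modify-write is issued.
      int old = counter.load(std::memory_order_relaxed);
      counter.store(old + delta, std::memory_order_relaxed);
      return old;
    }
  // Other threads may exist: use a real atomic read-modify-write.
  return counter.fetch_add(delta, std::memory_order_acq_rel);
}
```

Both branches leave the counter in the same state, so flipping the flag between
two calls is harmless here, which is what makes this kind of operation a
candidate for the optimisation.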

We can't use __libc_single_threaded to replace __gthread_active_p
everywhere. If we replaced the uses of __gthread_active_p in std::mutex
then we would elide the pthread_mutex_lock in the code below, but not
the pthread_mutex_unlock:

  std::mutex m;
  m.lock();            // pthread_mutex_lock
  std::thread t([]{}); // __libc_single_threaded = false
  t.join();
  m.unlock();          // pthread_mutex_unlock

We need the lock and unlock to use the same "is threading enabled"
predicate, and similarly for init/destroy pairs for mutexes and
condition variables, so that we don't try to release resources that were
never acquired.
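
A toy model may make the mismatch easier to see. Everything below is
hypothetical (`single_threaded`, `ToyMutex`, and the counters are invented for
illustration); it only counts which calls would reach the real primitive when
each call consults the flag independently.

```cpp
#include <cassert>

// Hypothetical flag standing in for __libc_single_threaded.
static bool single_threaded = true;

// Toy lock that counts the calls that would reach the real primitive.
struct ToyMutex
{
  int real_locks = 0;
  int real_unlocks = 0;

  // Naive version: each call consults the flag independently.
  void lock()   { if (!single_threaded) ++real_locks; }
  void unlock() { if (!single_threaded) ++real_unlocks; }
};

// Returns true if an unlock reached the primitive without a matching lock.
inline bool unbalanced_after_thread_start()
{
  single_threaded = true;
  ToyMutex m;
  m.lock();                 // elided: the flag says single-threaded
  assert(m.real_locks == 0);
  single_threaded = false;  // what creating a std::thread would cause
  m.unlock();               // not elided: the flag changed in between
  return m.real_unlocks > m.real_locks;
}
```

In this model `unbalanced_after_thread_start()` returns true, which corresponds
to calling pthread_mutex_unlock on a mutex that was never locked.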

There are other places that could use __libc_single_threaded, such as
_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
they can be changed later.

libstdc++-v3/ChangeLog:

	PR libstdc++/96817
	* include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
	New function wrapping __libc_single_threaded if available.
	(__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
	* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
	(__cxa_guard_release): Likewise.
	* testsuite/18_support/96817.cc: New test.

Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
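
For readers unfamiliar with guard variables, the single-threaded path can be
modelled as a small state machine. This is a simplified sketch under stated
assumptions: the bit values and function names are hypothetical (the real
constants are target-specific macros), and the "already initialized" test is
folded into the acquire function here rather than done by a separate macro.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical bit values for illustration; the real constants are
// target-specific macros in libstdc++ and may differ.
constexpr int guard_done_bit = 1;
constexpr int guard_pending_bit = 0x100;

// Returns 1 if the caller must run the initializer, 0 if already done.
inline int guard_acquire_single_threaded(int& guard)
{
  if (guard & guard_done_bit)
    return 0;                     // already initialized
  if (guard == 0)
    {
      guard = guard_pending_bit;  // mark initialization as in progress
      return 1;
    }
  // The pending bit was set by this same thread: recursive initialization.
  throw std::logic_error("recursive initialization");
}

// Called when the initializer finished normally.
inline void guard_release_single_threaded(int& guard)
{ guard = guard_done_bit; }

// Called when the initializer exited via an exception; allows a retry.
inline void guard_abort_single_threaded(int& guard)
{ guard = 0; }
```

Because there is only one thread, seeing the pending bit can only mean the
initializer re-entered itself, so the sketch reports recursion instead of
waiting.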


[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 8128 bytes --]

commit e6923541fae5081b646f240d54de2a32e17a0382
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Sat Sep 26 20:32:36 2020

    libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
    
    Glibc 2.32 adds a global variable that says whether the process is
    single-threaded. We can use this to decide whether to elide atomic
    operations, as a more precise and reliable indicator than
    __gthread_active_p.
    
    This means that guard variables for statics and reference counting in
    shared_ptr can use less expensive, non-atomic ops even in processes that
    are linked to libpthread, as long as no threads have been created yet.
    It also means that we switch to using atomics if libpthread gets loaded
    later via dlopen (this still isn't supported in general, for other
    reasons).
    
    We can't use __libc_single_threaded to replace __gthread_active_p
    everywhere. If we replaced the uses of __gthread_active_p in std::mutex
    then we would elide the pthread_mutex_lock in the code below, but not
    the pthread_mutex_unlock:
    
      std::mutex m;
      m.lock();            // pthread_mutex_lock
      std::thread t([]{}); // __libc_single_threaded = false
      t.join();
      m.unlock();          // pthread_mutex_unlock
    
    We need the lock and unlock to use the same "is threading enabled"
    predicate, and similarly for init/destroy pairs for mutexes and
    condition variables, so that we don't try to release resources that were
    never acquired.
    
    There are other places that could use __libc_single_threaded, such as
    _Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
    they can be changed later.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/96817
            * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
            New function wrapping __libc_single_threaded if available.
            (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
            * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
            (__cxa_guard_release): Likewise.
            * testsuite/18_support/96817.cc: New test.

diff --git a/libstdc++-v3/include/ext/atomicity.h b/libstdc++-v3/include/ext/atomicity.h
index 813ceb0bbf8..2d3e5fb0904 100644
--- a/libstdc++-v3/include/ext/atomicity.h
+++ b/libstdc++-v3/include/ext/atomicity.h
@@ -34,11 +34,27 @@
 #include <bits/c++config.h>
 #include <bits/gthr.h>
 #include <bits/atomic_word.h>
+#if __has_include(<sys/single_threaded.h>)
+# include <sys/single_threaded.h>
+#endif
 
 namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+  __attribute__((__always_inline__))
+  inline bool
+  __is_single_threaded() _GLIBCXX_NOTHROW
+  {
+#ifndef __GTHREADS
+    return true;
+#elif __has_include(<sys/single_threaded.h>)
+    return ::__libc_single_threaded;
+#else
+    return !__gthread_active_p();
+#endif
+  }
+
   // Functions for portable atomic access.
   // To abstract locking primitives across all thread policies, use:
   // __exchange_and_add_dispatch
@@ -79,25 +95,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __attribute__ ((__always_inline__))
   __exchange_and_add_dispatch(_Atomic_word* __mem, int __val)
   {
-#ifdef __GTHREADS
-    if (__gthread_active_p())
+    if (__is_single_threaded())
+      return __exchange_and_add_single(__mem, __val);
+    else
       return __exchange_and_add(__mem, __val);
-#endif
-    return __exchange_and_add_single(__mem, __val);
   }
 
   inline void
   __attribute__ ((__always_inline__))
   __atomic_add_dispatch(_Atomic_word* __mem, int __val)
   {
-#ifdef __GTHREADS
-    if (__gthread_active_p())
-      {
-	__atomic_add(__mem, __val);
-	return;
-      }
-#endif
-    __atomic_add_single(__mem, __val);
+    if (__is_single_threaded())
+      __atomic_add_single(__mem, __val);
+    else
+      __atomic_add(__mem, __val);
   }
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/libsupc++/guard.cc b/libstdc++-v3/libsupc++/guard.cc
index 474af33ce83..240eda8ee71 100644
--- a/libstdc++-v3/libsupc++/guard.cc
+++ b/libstdc++-v3/libsupc++/guard.cc
@@ -252,7 +252,24 @@ namespace __cxxabiv1
 # ifdef _GLIBCXX_USE_FUTEX
     // If __atomic_* and futex syscall are supported, don't use any global
     // mutex.
-    if (__gthread_active_p ())
+
+    // Use the same bits in the guard variable whether single-threaded or not,
+    // so that __cxa_guard_release and __cxa_guard_abort match the logic here
+    // even if __libc_single_threaded becomes false between now and then.
+
+    if (__gnu_cxx::__is_single_threaded())
+      {
+	// No need to use atomics, and no need to wait for other threads.
+	int *gi = (int *) (void *) g;
+	if (*gi == 0)
+	  {
+	    *gi = _GLIBCXX_GUARD_PENDING_BIT;
+	    return 1;
+	  }
+	else
+	  throw_recursive_init_exception();
+      }
+    else
       {
 	int *gi = (int *) (void *) g;
 	const int guard_bit = _GLIBCXX_GUARD_BIT;
@@ -302,7 +319,7 @@ namespace __cxxabiv1
 	    syscall (SYS_futex, gi, _GLIBCXX_FUTEX_WAIT, expected, 0);
 	  }
       }
-# else
+# else // ! _GLIBCXX_USE_FUTEX
     if (__gthread_active_p ())
       {
 	mutex_wrapper mw;
@@ -340,18 +357,26 @@ namespace __cxxabiv1
 	  }
       }
 # endif
-#endif
+#endif // ! __GTHREADS
 
     return acquire (g);
   }
 
   extern "C"
-  void __cxa_guard_abort (__guard *g) throw ()
+  void __cxa_guard_abort (__guard *g) noexcept
   {
 #ifdef _GLIBCXX_USE_FUTEX
     // If __atomic_* and futex syscall are supported, don't use any global
     // mutex.
-    if (__gthread_active_p ())
+
+    if (__gnu_cxx::__is_single_threaded())
+      {
+	// No need to use atomics, and no other threads to wake.
+	int *gi = (int *) (void *) g;
+	*gi = 0;
+	return;
+      }
+    else
       {
 	int *gi = (int *) (void *) g;
 	const int waiting_bit = _GLIBCXX_GUARD_WAITING_BIT;
@@ -385,12 +410,19 @@ namespace __cxxabiv1
   }
 
   extern "C"
-  void __cxa_guard_release (__guard *g) throw ()
+  void __cxa_guard_release (__guard *g) noexcept
   {
 #ifdef _GLIBCXX_USE_FUTEX
     // If __atomic_* and futex syscall are supported, don't use any global
     // mutex.
-    if (__gthread_active_p ())
+
+    if (__gnu_cxx::__is_single_threaded())
+      {
+	int *gi = (int *) (void *) g;
+	*gi = _GLIBCXX_GUARD_BIT;
+	return;
+      }
+    else
       {
 	int *gi = (int *) (void *) g;
 	const int guard_bit = _GLIBCXX_GUARD_BIT;
@@ -401,6 +433,7 @@ namespace __cxxabiv1
 	  syscall (SYS_futex, gi, _GLIBCXX_FUTEX_WAKE, INT_MAX);
 	return;
       }
+
 #elif defined(__GTHREAD_HAS_COND)
     if (__gthread_active_p())
       {
diff --git a/libstdc++-v3/testsuite/18_support/96817.cc b/libstdc++-v3/testsuite/18_support/96817.cc
new file mode 100644
index 00000000000..4c4da40afa9
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/96817.cc
@@ -0,0 +1,39 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-pthread"  }
+// { dg-do run { target *-*-linux-gnu } }
+// { dg-require-effective-target pthread }
+
+// PR libstdc++/96817
+
+int init()
+{
+#if __has_include(<sys/single_threaded.h>)
+  // This deadlocks unless __libc_single_threaded is available in Glibc,
+  // because __cxa_guard_acquire uses __gthread_active_p and the
+  // multithreaded init can't detect recursion (see PR 97211).
+  static int i = init();
+#endif
+  return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+  init();
+}


* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-09-26 19:42 [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817] Jonathan Wakely
@ 2020-09-27 19:15 ` Florian Weimer
  2020-09-29 11:51 ` Christophe Lyon
  2020-10-02 21:35 ` Jonathan Wakely
  2 siblings, 0 replies; 9+ messages in thread
From: Florian Weimer @ 2020-09-27 19:15 UTC
  To: Jonathan Wakely via Libstdc++; +Cc: gcc-patches, Jonathan Wakely

* Jonathan Wakely via Libstdc++:

> We can't use __libc_single_threaded to replace __gthread_active_p
> everywhere. If we replaced the uses of __gthread_active_p in std::mutex
> then we would elide the pthread_mutex_lock in the code below, but not
> the pthread_mutex_unlock:
>
>   std::mutex m;
>   m.lock();            // pthread_mutex_lock
>   std::thread t([]{}); // __libc_single_threaded = false
>   t.join();
>   m.unlock();          // pthread_mutex_unlock

Thanks for implementing this.

Eliding the mutex lock is a bit iffy because the mutex may reside in a
shared mapping.  For doing the same optimization in glibc, we will
have to check if the mutex is process-private or not.


* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-09-26 19:42 [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817] Jonathan Wakely
  2020-09-27 19:15 ` Florian Weimer
@ 2020-09-29 11:51 ` Christophe Lyon
  2020-09-30 15:03   ` Jonathan Wakely
  2020-10-02 21:35 ` Jonathan Wakely
  2 siblings, 1 reply; 9+ messages in thread
From: Christophe Lyon @ 2020-09-29 11:51 UTC
  To: Jonathan Wakely; +Cc: libstdc++, gcc Patches

On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Glibc 2.32 adds a global variable that says whether the process is
> single-threaded. We can use this to decide whether to elide atomic
> operations, as a more precise and reliable indicator than
> __gthread_active_p.
>
> This means that guard variables for statics and reference counting in
> shared_ptr can use less expensive, non-atomic ops even in processes that
> are linked to libpthread, as long as no threads have been created yet.
> It also means that we switch to using atomics if libpthread gets loaded
> later via dlopen (this still isn't supported in general, for other
> reasons).
>
> We can't use __libc_single_threaded to replace __gthread_active_p
> everywhere. If we replaced the uses of __gthread_active_p in std::mutex
> then we would elide the pthread_mutex_lock in the code below, but not
> the pthread_mutex_unlock:
>
>   std::mutex m;
>   m.lock();            // pthread_mutex_lock
>   std::thread t([]{}); // __libc_single_threaded = false
>   t.join();
>   m.unlock();          // pthread_mutex_unlock
>
> We need the lock and unlock to use the same "is threading enabled"
> predicate, and similarly for init/destroy pairs for mutexes and
> condition variables, so that we don't try to release resources that were
> never acquired.
>
> There are other places that could use __libc_single_threaded, such as
> _Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
> they can be changed later.
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/96817
>         * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
>         New function wrapping __libc_single_threaded if available.
>         (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
>         * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
>         (__cxa_guard_release): Likewise.
>         * testsuite/18_support/96817.cc: New test.
>
> Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.

Hi,

This patch introduced regressions on armeb-linux-gnueabihf:
--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
    g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
    g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
    g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
    g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
    g++.dg/init/init-ref2.C  -std=c++14 execution test
    g++.dg/init/init-ref2.C  -std=c++17 execution test
    g++.dg/init/init-ref2.C  -std=c++2a execution test
    g++.dg/init/init-ref2.C  -std=c++98 execution test
    g++.dg/init/ref15.C  -std=c++14 execution test
    g++.dg/init/ref15.C  -std=c++17 execution test
    g++.dg/init/ref15.C  -std=c++2a execution test
    g++.dg/init/ref15.C  -std=c++98 execution test
    g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
    g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
    g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
    g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
    g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
    g++.old-deja/g++.other/init19.C  -std=c++14 execution test
    g++.old-deja/g++.other/init19.C  -std=c++17 execution test
    g++.old-deja/g++.other/init19.C  -std=c++2a execution test
    g++.old-deja/g++.other/init19.C  -std=c++98 execution test

and probably some (280) in the libstdc++ tests (I didn't bisect those):
    19_diagnostics/error_category/generic_category.cc execution test
    19_diagnostics/error_category/system_category.cc execution test
    20_util/scoped_allocator/1.cc execution test
    20_util/scoped_allocator/2.cc execution test
    20_util/scoped_allocator/construct_pair_c++2a.cc execution test
    20_util/to_address/debug.cc execution test
    20_util/variant/run.cc execution test

Christophe


* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-09-29 11:51 ` Christophe Lyon
@ 2020-09-30 15:03   ` Jonathan Wakely
  2020-09-30 20:44     ` Jonathan Wakely
  0 siblings, 1 reply; 9+ messages in thread
From: Jonathan Wakely @ 2020-09-30 15:03 UTC
  To: Christophe Lyon; +Cc: libstdc++, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 5747 bytes --]

On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
><gcc-patches@gcc.gnu.org> wrote:
>>
>> Glibc 2.32 adds a global variable that says whether the process is
>> single-threaded. We can use this to decide whether to elide atomic
>> operations, as a more precise and reliable indicator than
>> __gthread_active_p.
>>
>> This means that guard variables for statics and reference counting in
>> shared_ptr can use less expensive, non-atomic ops even in processes that
>> are linked to libpthread, as long as no threads have been created yet.
>> It also means that we switch to using atomics if libpthread gets loaded
>> later via dlopen (this still isn't supported in general, for other
>> reasons).
>>
>> We can't use __libc_single_threaded to replace __gthread_active_p
>> everywhere. If we replaced the uses of __gthread_active_p in std::mutex
>> then we would elide the pthread_mutex_lock in the code below, but not
>> the pthread_mutex_unlock:
>>
>>   std::mutex m;
>>   m.lock();            // pthread_mutex_lock
>>   std::thread t([]{}); // __libc_single_threaded = false
>>   t.join();
>>   m.unlock();          // pthread_mutex_unlock
>>
>> We need the lock and unlock to use the same "is threading enabled"
>> predicate, and similarly for init/destroy pairs for mutexes and
>> condition variables, so that we don't try to release resources that were
>> never acquired.
>>
>> There are other places that could use __libc_single_threaded, such as
>> _Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
>> they can be changed later.
>>
>> libstdc++-v3/ChangeLog:
>>
>>         PR libstdc++/96817
>>         * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
>>         New function wrapping __libc_single_threaded if available.
>>         (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
>>         * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
>>         (__cxa_guard_release): Likewise.
>>         * testsuite/18_support/96817.cc: New test.
>>
>> Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
>
>Hi,
>
>This patch introduced regressions on armeb-linux-gnueabihf:
>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
>    g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
>    g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
>    g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
>    g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
>    g++.dg/init/init-ref2.C  -std=c++14 execution test
>    g++.dg/init/init-ref2.C  -std=c++17 execution test
>    g++.dg/init/init-ref2.C  -std=c++2a execution test
>    g++.dg/init/init-ref2.C  -std=c++98 execution test
>    g++.dg/init/ref15.C  -std=c++14 execution test
>    g++.dg/init/ref15.C  -std=c++17 execution test
>    g++.dg/init/ref15.C  -std=c++2a execution test
>    g++.dg/init/ref15.C  -std=c++98 execution test
>    g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
>    g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
>    g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
>    g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
>    g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
>    g++.old-deja/g++.other/init19.C  -std=c++14 execution test
>    g++.old-deja/g++.other/init19.C  -std=c++17 execution test
>    g++.old-deja/g++.other/init19.C  -std=c++2a execution test
>    g++.old-deja/g++.other/init19.C  -std=c++98 execution test
>
>and probably some (280) in libstdc++ tests: (I didn't bisect those):
>    19_diagnostics/error_category/generic_category.cc execution test
>    19_diagnostics/error_category/system_category.cc execution test
>    20_util/scoped_allocator/1.cc execution test
>    20_util/scoped_allocator/2.cc execution test
>    20_util/scoped_allocator/construct_pair_c++2a.cc execution test
>    20_util/to_address/debug.cc execution test
>    20_util/variant/run.cc execution test

I think this is a latent bug in the static initialization code for
EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
have:

# ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE

// Test the guard variable with a memory load with
// acquire semantics.

inline bool
__test_and_acquire (__cxxabiv1::__guard *g)
{
   unsigned char __c;
   unsigned char *__p = reinterpret_cast<unsigned char *>(g);
   __atomic_load (__p, &__c,  __ATOMIC_ACQUIRE);
   (void) __p;
   return _GLIBCXX_GUARD_TEST(&__c);
}
#  define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
# endif

That inspects the first byte of the guard variable. But for EABI the
"is initialized" bit is the least significant bit of the guard
variable. For little endian that's fine, because the least significant
bit is in the first byte. But for big endian it is in the last byte, so
we are looking in the wrong place. This means that the initial check
in __cxa_guard_acquire is wrong:

   extern "C"
   int __cxa_guard_acquire (__guard *g)
   {
#ifdef __GTHREADS
     // If the target can reorder loads, we need to insert a read memory
     // barrier so that accesses to the guarded variable happen after the
     // guard test.
     if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
       return 0;

This will always be false for big endian EABI, which means we run the
rest of the function even when the variable is initialized. Previously
that still gave the right answer, but inefficiently. After my change
it gives the wrong answer, because the new code assumes that the check
right at the start of __cxa_guard_acquire ACTUALLY WORKS. Silly me.

I think the attached should fix it. This should be backported to fix
inefficient code for static init on big endian EABI targets.
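
The byte-versus-word difference can be shown with a short host-independent
sketch. All function names below are hypothetical, and the big endian case is
simulated with an explicit byte image, so this is an illustration of the layout
issue rather than the guard.cc code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the first byte in memory of a 32-bit guard, as a byte-based
// test would.
inline unsigned char first_byte(const std::uint32_t& guard)
{
  unsigned char c;
  std::memcpy(&c, &guard, 1);
  return c;
}

// True if the host stores the least significant byte first.
inline bool host_is_little_endian()
{
  const std::uint32_t one = 1;
  return first_byte(one) == 1;
}

// Byte-based test: checks only the first byte in memory.
inline bool guard_test_first_byte(const std::uint32_t& guard)
{ return (first_byte(guard) & 1) != 0; }

// Word-based test: checks the least significant bit of the whole word.
inline bool guard_test_lsb(const std::uint32_t& guard)
{ return (guard & 1u) != 0; }

// Simulated big endian storage of the value 1: the set bit lives in the
// last byte, so a test of the first byte looks at the wrong place.
inline bool first_byte_test_on_big_endian_image()
{
  const unsigned char image[4] = {0, 0, 0, 1};
  return (image[0] & 1) != 0;
}
```

The word-based test gives the same answer on any host, while the byte-based
test only agrees with it where the least significant byte comes first.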



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 2831 bytes --]

commit 937d75fdfa9a7830e93d811e37d545897955d66b
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Sep 30 15:48:56 2020

    libstdc++: Fix test_and_acquire / set_and_release for EABI guard variables
    
    The default definitions of _GLIBCXX_GUARD_TEST_AND_ACQUIRE and
    _GLIBCXX_GUARD_SET_AND_RELEASE in libsupc++/guard.cc only work for the
    generic (IA64) ABI, because they test/set the first byte of the guard
    variable. For EABI we need to use the least significant bit, which means
    using the first byte is wrong for big endian targets.
    
    This has always been wrong, but previously it only caused poor
    performance. The _GLIBCXX_GUARD_TEST_AND_ACQUIRE at the very start of
    __cxa_guard_acquire would always return false even if the initialization
    was actually complete. Before my r11-3484 change the atomic compare
    exchange would have loaded the correct value, and then returned 0 as
    expected when the initialization is complete. After my change, in the
    single-threaded case there is no redundant check for init being
    complete, because I foolishly assumed that the check at the start of the
    function actually worked.
    
    The default definition of _GLIBCXX_GUARD_SET_AND_RELEASE is also wrong
    for big endian EABI, but appears to work because it sets the wrong bit
    but then the buggy TEST_AND_ACQUIRE tests that wrong bit as well. Also,
    the buggy SET_AND_RELEASE macro is only used for targets with threads
    enabled but no futex syscalls.
    
    This should fix the regressions introduced by my patch, by defining
    custom versions of the TEST_AND_ACQUIRE and SET_AND_RELEASE macros that
    are correct for EABI.
    
    libstdc++-v3/ChangeLog:
    
            * config/cpu/arm/cxxabi_tweaks.h (_GLIBCXX_GUARD_TEST_AND_ACQUIRE)
            (_GLIBCXX_GUARD_SET_AND_RELEASE): Define for EABI.

diff --git a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
index 4eb43c8373c..4fb34869f8a 100644
--- a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
+++ b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
@@ -39,7 +39,7 @@ namespace __cxxabiv1
 
 #ifdef __ARM_EABI__
   // The ARM EABI uses the least significant bit of a 32-bit
-  // guard variable.  */
+  // guard variable.
 #define _GLIBCXX_GUARD_TEST(x) ((*(x) & 1) != 0)
 #define _GLIBCXX_GUARD_SET(x) *(x) = 1
 #define _GLIBCXX_GUARD_BIT 1
@@ -47,6 +47,11 @@ namespace __cxxabiv1
 #define _GLIBCXX_GUARD_WAITING_BIT __guard_test_bit (2, 1)
   typedef int __guard;
 
+#define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(x) \
+  _GLIBCXX_GUARD_TEST(__atomic_load_n(x, __ATOMIC_ACQUIRE))
+#define _GLIBCXX_GUARD_SET_AND_RELEASE(x) \
+  __atomic_store_n(x, 1, __ATOMIC_RELEASE)
+
   // We also want the element size in array cookies.
 #define _GLIBCXX_ELTSIZE_IN_COOKIE 1
 


* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-09-30 15:03   ` Jonathan Wakely
@ 2020-09-30 20:44     ` Jonathan Wakely
  2020-10-01  7:30       ` Christophe Lyon
  0 siblings, 1 reply; 9+ messages in thread
From: Jonathan Wakely @ 2020-09-30 20:44 UTC
  To: Christophe Lyon; +Cc: libstdc++, gcc Patches

On 30/09/20 16:03 +0100, Jonathan Wakely wrote:
>On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
>>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
>><gcc-patches@gcc.gnu.org> wrote:
>>>
>>>Glibc 2.32 adds a global variable that says whether the process is
>>>single-threaded. We can use this to decide whether to elide atomic
>>>operations, as a more precise and reliable indicator than
>>>__gthread_active_p.
>>>
>>>This means that guard variables for statics and reference counting in
>>>shared_ptr can use less expensive, non-atomic ops even in processes that
>>>are linked to libpthread, as long as no threads have been created yet.
>>>It also means that we switch to using atomics if libpthread gets loaded
>>>later via dlopen (this still isn't supported in general, for other
>>>reasons).
>>>
>>>We can't use __libc_single_threaded to replace __gthread_active_p
>>>everywhere. If we replaced the uses of __gthread_active_p in std::mutex
>>>then we would elide the pthread_mutex_lock in the code below, but not
>>>the pthread_mutex_unlock:
>>>
>>>  std::mutex m;
>>>  m.lock();            // pthread_mutex_lock
>>>  std::thread t([]{}); // __libc_single_threaded = false
>>>  t.join();
>>>  m.unlock();          // pthread_mutex_unlock
>>>
>>>We need the lock and unlock to use the same "is threading enabled"
>>>predicate, and similarly for init/destroy pairs for mutexes and
>>>condition variables, so that we don't try to release resources that were
>>>never acquired.
>>>
>>>There are other places that could use __libc_single_threaded, such as
>>>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
>>>they can be changed later.
>>>
>>>libstdc++-v3/ChangeLog:
>>>
>>>        PR libstdc++/96817
>>>        * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
>>>        New function wrapping __libc_single_threaded if available.
>>>        (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
>>>        * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
>>>        (__cxa_guard_release): Likewise.
>>>        * testsuite/18_support/96817.cc: New test.
>>>
>>>Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
>>
>>Hi,
>>
>>This patch introduced regressions on armeb-linux-gnueabihf:
>>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
>>   g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
>>   g++.dg/init/init-ref2.C  -std=c++14 execution test
>>   g++.dg/init/init-ref2.C  -std=c++17 execution test
>>   g++.dg/init/init-ref2.C  -std=c++2a execution test
>>   g++.dg/init/init-ref2.C  -std=c++98 execution test
>>   g++.dg/init/ref15.C  -std=c++14 execution test
>>   g++.dg/init/ref15.C  -std=c++17 execution test
>>   g++.dg/init/ref15.C  -std=c++2a execution test
>>   g++.dg/init/ref15.C  -std=c++98 execution test
>>   g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++14 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++17 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++2a execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++98 execution test
>>
>>and probably some (280) in libstdc++ tests: (I didn't bisect those):
>>   19_diagnostics/error_category/generic_category.cc execution test
>>   19_diagnostics/error_category/system_category.cc execution test
>>   20_util/scoped_allocator/1.cc execution test
>>   20_util/scoped_allocator/2.cc execution test
>>   20_util/scoped_allocator/construct_pair_c++2a.cc execution test
>>   20_util/to_address/debug.cc execution test
>>   20_util/variant/run.cc execution test
>
>I think this is a latent bug in the static initialization code for
>EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
>have:
>
># ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
>
>// Test the guard variable with a memory load with
>// acquire semantics.
>
>inline bool
>__test_and_acquire (__cxxabiv1::__guard *g)
>{
>  unsigned char __c;
>  unsigned char *__p = reinterpret_cast<unsigned char *>(g);
>  __atomic_load (__p, &__c,  __ATOMIC_ACQUIRE);
>  (void) __p;
>  return _GLIBCXX_GUARD_TEST(&__c);
>}
>#  define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
># endif
>
>That inspects the first byte of the guard variable. But for EABI the
>"is initialized" bit is the least significant bit of the guard
>variable. For little endian that's fine, the least significant bit is
>in the first byte. But for big endian, it's not in the first byte, so
>we are looking in the wrong place. This means that the initial check
>in __cxa_guard_acquire is wrong:
>
>  extern "C"
>  int __cxa_guard_acquire (__guard *g)
>  {
>#ifdef __GTHREADS
>    // If the target can reorder loads, we need to insert a read memory
>    // barrier so that accesses to the guarded variable happen after the
>    // guard test.
>    if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
>      return 0;
>
>This will always be false for big endian EABI, which means we run the
>rest of the function even when the variable is initialized. Previously
>that still gave the right answer, but inefficiently. After my change
>it gives the wrong answer, because the new code assumes that the check
>right at the start of __cxa_guard_acquire ACTUALLY WORKS. Silly me.
>
>I think the attached should fix it. This should be backported to fix
>inefficient code for static init on big endian EABI targets.


Committed as r11-3572, but I haven't been able to test it on a big
endian EABI system, only little endian EABI.




* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-09-30 20:44     ` Jonathan Wakely
@ 2020-10-01  7:30       ` Christophe Lyon
  2020-10-01  7:50         ` Jonathan Wakely
  0 siblings, 1 reply; 9+ messages in thread
From: Christophe Lyon @ 2020-10-01  7:30 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: libstdc++, gcc Patches

On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On 30/09/20 16:03 +0100, Jonathan Wakely wrote:
> >On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
> >>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
> >><gcc-patches@gcc.gnu.org> wrote:
> >>>
> >>>Glibc 2.32 adds a global variable that says whether the process is
> >>>single-threaded. We can use this to decide whether to elide atomic
> >>>operations, as a more precise and reliable indicator than
> >>>__gthread_active_p.
> >>>
> >>>This means that guard variables for statics and reference counting in
> >>>shared_ptr can use less expensive, non-atomic ops even in processes that
> >>>are linked to libpthread, as long as no threads have been created yet.
> >>>It also means that we switch to using atomics if libpthread gets loaded
> >>>later via dlopen (this still isn't supported in general, for other
> >>>reasons).
> >>>
> >>>We can't use __libc_single_threaded to replace __gthread_active_p
> >>>everywhere. If we replaced the uses of __gthread_active_p in std::mutex
> >>>then we would elide the pthread_mutex_lock in the code below, but not
> >>>the pthread_mutex_unlock:
> >>>
> >>>  std::mutex m;
> >>>  m.lock();            // pthread_mutex_lock
> >>>  std::thread t([]{}); // __libc_single_threaded = false
> >>>  t.join();
> >>>  m.unlock();          // pthread_mutex_unlock
> >>>
> >>>We need the lock and unlock to use the same "is threading enabled"
> >>>predicate, and similarly for init/destroy pairs for mutexes and
> >>>condition variables, so that we don't try to release resources that were
> >>>never acquired.
> >>>
> >>>There are other places that could use __libc_single_threaded, such as
> >>>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
> >>>they can be changed later.
> >>>
> >>>libstdc++-v3/ChangeLog:
> >>>
> >>>        PR libstdc++/96817
> >>>        * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
> >>>        New function wrapping __libc_single_threaded if available.
> >>>        (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
> >>>        * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
> >>>        (__cxa_guard_release): Likewise.
> >>>        * testsuite/18_support/96817.cc: New test.
> >>>
> >>>Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
> >>
> >>Hi,
> >>
> >>This patch introduced regressions on armeb-linux-gnueabihf:
> >>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
> >>   g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
> >>   g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
> >>   g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
> >>   g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
> >>   g++.dg/init/init-ref2.C  -std=c++14 execution test
> >>   g++.dg/init/init-ref2.C  -std=c++17 execution test
> >>   g++.dg/init/init-ref2.C  -std=c++2a execution test
> >>   g++.dg/init/init-ref2.C  -std=c++98 execution test
> >>   g++.dg/init/ref15.C  -std=c++14 execution test
> >>   g++.dg/init/ref15.C  -std=c++17 execution test
> >>   g++.dg/init/ref15.C  -std=c++2a execution test
> >>   g++.dg/init/ref15.C  -std=c++98 execution test
> >>   g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++14 execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++17 execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++2a execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++98 execution test
> >>
> >>and probably some (280) in libstdc++ tests: (I didn't bisect those):
> >>   19_diagnostics/error_category/generic_category.cc execution test
> >>   19_diagnostics/error_category/system_category.cc execution test
> >>   20_util/scoped_allocator/1.cc execution test
> >>   20_util/scoped_allocator/2.cc execution test
> >>   20_util/scoped_allocator/construct_pair_c++2a.cc execution test
> >>   20_util/to_address/debug.cc execution test
> >>   20_util/variant/run.cc execution test
> >
> >I think this is a latent bug in the static initialization code for
> >EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
> >have:
> >
> ># ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
> >
> >// Test the guard variable with a memory load with
> >// acquire semantics.
> >
> >inline bool
> >__test_and_acquire (__cxxabiv1::__guard *g)
> >{
> >  unsigned char __c;
> >  unsigned char *__p = reinterpret_cast<unsigned char *>(g);
> >  __atomic_load (__p, &__c,  __ATOMIC_ACQUIRE);
> >  (void) __p;
> >  return _GLIBCXX_GUARD_TEST(&__c);
> >}
> >#  define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
> ># endif
> >
> >That inspects the first byte of the guard variable. But for EABI the
> >"is initialized" bit is the least significant bit of the guard
> >variable. For little endian that's fine, the least significant bit is
> >in the first byte. But for big endian, it's not in the first byte, so
> >we are looking in the wrong place. This means that the initial check
> >in __cxa_guard_acquire is wrong:
> >
> >  extern "C"
> >  int __cxa_guard_acquire (__guard *g)
> >  {
> >#ifdef __GTHREADS
> >    // If the target can reorder loads, we need to insert a read memory
> >    // barrier so that accesses to the guarded variable happen after the
> >    // guard test.
> >    if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
> >      return 0;
> >
> >This will always be false for big endian EABI, which means we run the
> >rest of the function even when the variable is initialized. Previously
> >that still gave the right answer, but inefficiently. After my change
> >it gives the wrong answer, because the new code assumes that the check
> >right at the start of __cxa_guard_acquire ACTUALLY WORKS. Silly me.
> >
> >I think the attached should fix it. This should be backported to fix
> >inefficient code for static init on big endian EABI targets.
>
>
> Committed as r11-3572, but I haven't been able to test it on a big
> endian EABI system, only little endian EABI.
>
Hi Jonathan,

unfortunately, this breaks the build even for little-endian for me:
9980570_1.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/arm-none-linux-gnueabi/libstdc++-v3/include
-I/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra
-Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnostics-show-location=once
-ffunction-sections -fdata-sections -frandom-seed=guard.lo -g -O2
-D_GNU_SOURCE -c
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc
 -fPIC -DPIC -D_GLIBCXX_SHARED -o guard.o
In file included from
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/cxxabi.h:50,
                 from
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:28:
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:
In function 'int
__cxxabiv1::__cxa_guard_acquire(__cxxabiv1::__guard*)':
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:249:9:
error: invalid type argument of unary '*' (have 'int')
  249 |     if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[4]: *** [Makefile:763: guard.lo] Error 1

How did you test it? Maybe I'm missing something?

Thanks,

Christophe


* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-10-01  7:30       ` Christophe Lyon
@ 2020-10-01  7:50         ` Jonathan Wakely
  2020-10-01 11:56           ` Jonathan Wakely
  0 siblings, 1 reply; 9+ messages in thread
From: Jonathan Wakely @ 2020-10-01  7:50 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: libstdc++, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 8076 bytes --]

On 01/10/20 09:30 +0200, Christophe Lyon via Libstdc++ wrote:
>On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> Committed as r11-3572, but I haven't been able to test it on a big
>> endian EABI system, only little endian EABI.
>>
>Hi Jonathan,
>
>unfortunately, this breaks the build even for little-endian for me:
>9980570_1.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/arm-none-linux-gnueabi/libstdc++-v3/include
>-I/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
>-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra
>-Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnostics-show-location=once
>-ffunction-sections -fdata-sections -frandom-seed=guard.lo -g -O2
>-D_GNU_SOURCE -c
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc
> -fPIC -DPIC -D_GLIBCXX_SHARED -o guard.o
>In file included from
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/cxxabi.h:50,
>                 from
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:28:
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:
>In function 'int
>__cxxabiv1::__cxa_guard_acquire(__cxxabiv1::__guard*)':
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:249:9:
>error: invalid type argument of unary '*' (have 'int')
>  249 |     if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
>      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>make[4]: *** [Makefile:763: guard.lo] Error 1
>
>How did you test it? Maybe I'm missing something?

Apparently I didn't do a clean build and the guard.cc file didn't get
recompiled with the new header.

The attached works on armv7l-unknown-linux-gnueabihf, testing it now.
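
For anyone following along, here is a minimal sketch of what went wrong
and what the attached patch changes. The names guard_t, GUARD_TEST,
GUARD_TEST_AND_ACQUIRE and is_initialized are stand-ins for this
example. The definition of GUARD_TEST is an assumption inferred from the
error message (something applies unary '*' to the int result), not
copied from the header.

```cpp
#include <cassert>

// Hypothetical stand-in for the EABI guard type.
typedef int guard_t;

// Assumed shape of a guard test that takes a POINTER to the guard and
// dereferences it.
#define GUARD_TEST(p) ((*(p) & 1) != 0)

// Broken variant (kept as a comment because it does not compile):
// __atomic_load_n returns the loaded int, not a pointer, so GUARD_TEST
// would apply unary '*' to an int.
//   GUARD_TEST(__atomic_load_n(p, __ATOMIC_ACQUIRE))

// Fixed variant, same expression as in the patch: test the low bit of
// the loaded value directly.
#define GUARD_TEST_AND_ACQUIRE(p) \
  ((__atomic_load_n(p, __ATOMIC_ACQUIRE) & 1) != 0)

// Small wrapper so the macro can be exercised like a function.
inline bool is_initialized(guard_t* g)
{
  bool acquired = GUARD_TEST_AND_ACQUIRE(g);
  // The plain pointer-based test must agree with the atomic one.
  assert(acquired == GUARD_TEST(g));
  return acquired;
}
```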



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 918 bytes --]

commit 30c59cc6ce174a948d2968ce8d4fef6b55af1bbc
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Oct 1 08:45:02 2020

    libstdc++: Fix test_and_acquire for EABI
    
    libstdc++-v3/ChangeLog:
    
            * config/cpu/arm/cxxabi_tweaks.h (_GLIBCXX_GUARD_TEST_AND_ACQUIRE):
            Do not try to dereference return value of __atomic_load_n.

diff --git a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
index 4fb34869f8a..a08afed7d21 100644
--- a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
+++ b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
@@ -48,7 +48,7 @@ namespace __cxxabiv1
   typedef int __guard;
 
 #define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(x) \
-  _GLIBCXX_GUARD_TEST(__atomic_load_n(x, __ATOMIC_ACQUIRE))
+  ((__atomic_load_n(x, __ATOMIC_ACQUIRE) & 1) != 0)
 #define _GLIBCXX_GUARD_SET_AND_RELEASE(x) \
   __atomic_store_n(x, 1, __ATOMIC_RELEASE)
 


* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-10-01  7:50         ` Jonathan Wakely
@ 2020-10-01 11:56           ` Jonathan Wakely
  0 siblings, 0 replies; 9+ messages in thread
From: Jonathan Wakely @ 2020-10-01 11:56 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: libstdc++, gcc Patches

On 01/10/20 08:50 +0100, Jonathan Wakely wrote:
>On 01/10/20 09:30 +0200, Christophe Lyon via Libstdc++ wrote:
>>On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely <jwakely@redhat.com> wrote:
>>>
>>>Committed as r11-3572, but I haven't been able to test it on a big
>>>endian EABI system, only little endian EABI.
>>>
>>Hi Jonathan,
>>
>>unfortunately, this breaks the build even for little-endian for me:
>>9980570_1.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/arm-none-linux-gnueabi/libstdc++-v3/include
>>-I/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
>>-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra
>>-Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnostics-show-location=once
>>-ffunction-sections -fdata-sections -frandom-seed=guard.lo -g -O2
>>-D_GNU_SOURCE -c
>>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc
>>-fPIC -DPIC -D_GLIBCXX_SHARED -o guard.o
>>In file included from
>>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/cxxabi.h:50,
>>                from
>>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:28:
>>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:
>>In function 'int
>>__cxxabiv1::__cxa_guard_acquire(__cxxabiv1::__guard*)':
>>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:249:9:
>>error: invalid type argument of unary '*' (have 'int')
>> 249 |     if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
>>     |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>make[4]: *** [Makefile:763: guard.lo] Error 1
>>
>>How did you test it? Maybe I'm missing something?
>
>Apparently I didn't do a clean build and the guard.cc file didn't get
>recompiled with the new header.
>
>The attached works on armv7l-unknown-linux-gnueabihf, testing it now.

My testing was only partially successful because the machine I'm
testing on keeps locking up and needing a reboot half way through the
testsuite. But at least it compiles now, and I didn't see any
failures. Only tested on little endian EABI though.

Pushed to master as r11-3585.


>commit 30c59cc6ce174a948d2968ce8d4fef6b55af1bbc
>Author: Jonathan Wakely <jwakely@redhat.com>
>Date:   Thu Oct 1 08:45:02 2020
>
>    libstdc++: Fix test_and_acquire for EABI
>    
>    libstdc++-v3/ChangeLog:
>    
>            * config/cpu/arm/cxxabi_tweaks.h (_GLIBCXX_GUARD_TEST_AND_ACQUIRE):
>            Do not try to dereference return value of __atomic_load_n.
>
>diff --git a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
>index 4fb34869f8a..a08afed7d21 100644
>--- a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
>+++ b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
>@@ -48,7 +48,7 @@ namespace __cxxabiv1
>   typedef int __guard;
> 
> #define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(x) \
>-  _GLIBCXX_GUARD_TEST(__atomic_load_n(x, __ATOMIC_ACQUIRE))
>+  ((__atomic_load_n(x, __ATOMIC_ACQUIRE) & 1) != 0)
> #define _GLIBCXX_GUARD_SET_AND_RELEASE(x) \
>   __atomic_store_n(x, 1, __ATOMIC_RELEASE)
> 



* Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
  2020-09-26 19:42 [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817] Jonathan Wakely
  2020-09-27 19:15 ` Florian Weimer
  2020-09-29 11:51 ` Christophe Lyon
@ 2020-10-02 21:35 ` Jonathan Wakely
  2 siblings, 0 replies; 9+ messages in thread
From: Jonathan Wakely @ 2020-10-02 21:35 UTC (permalink / raw)
  To: libstdc++, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1960 bytes --]

On 26/09/20 20:42 +0100, Jonathan Wakely wrote:
>Glibc 2.32 adds a global variable that says whether the process is
>single-threaded. We can use this to decide whether to elide atomic
>operations, as a more precise and reliable indicator than
>__gthread_active_p.
>
>This means that guard variables for statics and reference counting in
>shared_ptr can use less expensive, non-atomic ops even in processes that
>are linked to libpthread, as long as no threads have been created yet.
>It also means that we switch to using atomics if libpthread gets loaded
>later via dlopen (this still isn't supported in general, for other
>reasons).
>
>We can't use __libc_single_threaded to replace __gthread_active_p
>everywhere. If we replaced the uses of __gthread_active_p in std::mutex
>then we would elide the pthread_mutex_lock in the code below, but not
>the pthread_mutex_unlock:
>
>  std::mutex m;
>  m.lock();            // pthread_mutex_lock
>  std::thread t([]{}); // __libc_single_threaded = false
>  t.join();
>  m.unlock();          // pthread_mutex_unlock
>
>We need the lock and unlock to use the same "is threading enabled"
>predicate, and similarly for init/destroy pairs for mutexes and
>condition variables, so that we don't try to release resources that were
>never acquired.
>
>There are other places that could use __libc_single_threaded, such as
>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
>they can be changed later.
>
>libstdc++-v3/ChangeLog:
>
>	PR libstdc++/96817
>	* include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
>	New function wrapping __libc_single_threaded if available.
>	(__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
>	* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
>	(__cxa_guard_release): Likewise.
>	* testsuite/18_support/96817.cc: New test.
>
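As a reminder of what the change quoted above does, here is a rough
sketch of the dispatch idea. It is not the code in ext/atomicity.h:
is_single_threaded, exchange_and_add and check_exchange_and_add are
hypothetical names, and returning false when <sys/single_threaded.h> is
missing is an assumption made here to keep the sketch conservative.

```cpp
#include <cassert>

#if defined(__has_include)
# if __has_include(<sys/single_threaded.h>)
#  include <sys/single_threaded.h>
#  define SKETCH_HAVE_SINGLE_THREADED 1
# endif
#endif

// Hypothetical wrapper: true only if glibc tells us that no other
// thread exists. Without the header we assume threads may exist.
inline bool is_single_threaded()
{
#ifdef SKETCH_HAVE_SINGLE_THREADED
  return __libc_single_threaded != 0;
#else
  return false;
#endif
}

// Add val to *mem and return the previous value, skipping the atomic
// operation when the process is known to be single-threaded.
inline int exchange_and_add(int* mem, int val)
{
  if (is_single_threaded())
    {
      int old = *mem;
      *mem += val;
      return old;
    }
  return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL);
}

// Both paths must give the same observable result.
inline void check_exchange_and_add()
{
  int count = 1;
  assert(exchange_and_add(&count, 1) == 1);
  assert(count == 2);
}
```

This is safe for a counter because each call makes its own decision.
It would not be safe for lock/unlock pairs, for the reason given in the
quoted text.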

The new test was broken, fixed with this.

Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
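
The idea of the fix, as a minimal sketch separate from the actual test:
install a terminate handler that exits with status 0 before running the
code that is expected to terminate. The helper name
install_clean_terminate is made up for this example. clean_terminate
matches the handler name used in the patch.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>

// Terminate handler that ends the process immediately with a success
// status, so that "terminated as expected" counts as a pass.
inline void clean_terminate() { std::_Exit(0); }

// Install the handler and return the one that was active before, so a
// caller can restore it if needed.
inline std::terminate_handler install_clean_terminate()
{
  std::terminate_handler previous = std::set_terminate(clean_terminate);
  assert(std::get_terminate() == clean_terminate);
  return previous;
}
```

Calling install_clean_terminate() first thing in main means any later
call to std::terminate ends the process with exit status 0 instead of
aborting.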



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 1279 bytes --]

commit 1ad08b64cea51d3cb989a1a176baeb8a18071231
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Fri Oct 2 21:10:55 2020

    libstdc++: Fix testcase by using terminate handler
    
    This test was supposed to verify that when __libc_single_threaded is
    available we successfully detect recursive static initialization even
    when linked to libpthread. But I forgot that when recursive init is
    detected, we terminate, and so the test fails.
    
    This adds a terminate handler that exits cleanly, so the test passes
    when recursive init is detected.
    
    libstdc++-v3/ChangeLog:
    
            * testsuite/18_support/96817.cc: Use terminate handler that
            calls _Exit(0).

diff --git a/libstdc++-v3/testsuite/18_support/96817.cc b/libstdc++-v3/testsuite/18_support/96817.cc
index 4c4da40afa9..19399c473ef 100644
--- a/libstdc++-v3/testsuite/18_support/96817.cc
+++ b/libstdc++-v3/testsuite/18_support/96817.cc
@@ -21,6 +21,9 @@
 
 // PR libstdc++/96817
 
+#include <exception>
+#include <stdlib.h>
+
 int init()
 {
 #if __has_include(<sys/single_threaded.h>)
@@ -32,8 +35,11 @@ int init()
   return 0;
 }
 
+void clean_terminate() { _Exit(0); }
+
 int
 main (int argc, char **argv)
 {
+  std::set_terminate(clean_terminate);
   init();
 }


Thread overview: 9+ messages
2020-09-26 19:42 [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817] Jonathan Wakely
2020-09-27 19:15 ` Florian Weimer
2020-09-29 11:51 ` Christophe Lyon
2020-09-30 15:03   ` Jonathan Wakely
2020-09-30 20:44     ` Jonathan Wakely
2020-10-01  7:30       ` Christophe Lyon
2020-10-01  7:50         ` Jonathan Wakely
2020-10-01 11:56           ` Jonathan Wakely
2020-10-02 21:35 ` Jonathan Wakely
