public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704]
@ 2024-05-07 13:52 Jonathan Wakely
  2024-05-07 13:52 ` [PATCH 2/2] libstdc++: Fix data races in std::ctype<char> [PR77704] Jonathan Wakely
  2024-05-15 11:22 ` [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704] Jonathan Wakely
  0 siblings, 2 replies; 3+ messages in thread
From: Jonathan Wakely @ 2024-05-07 13:52 UTC (permalink / raw)
  To: libstdc++, gcc-patches

Tested x86_64-linux. This seems "obviously correct", and I'd like to
push it. The current code definitely has a data race, i.e. undefined
behaviour.

-- >8 --

The lazy caching in std::basic_ios::fill() updates a mutable member
without synchronization, which can cause a data race if two threads both
call fill() on the same stream object when _M_fill_init is false.

To avoid this we can just cache the _M_fill member and set _M_fill_init
early in std::basic_ios::init, instead of doing it lazily. As explained
by the comment in init, there's a good reason for doing it lazily. When
char_type is neither char nor wchar_t, the locale might not have a
std::ctype<char_type>, so getting the fill character would throw an
exception. The current lazy init allows using unformatted I/O with such
a stream, because the fill character is never needed and so it doesn't
matter if the locale doesn't have a ctype<char_type> facet. We can
maintain this property by only setting the fill character in
std::basic_ios::init if the ctype facet is present at that time. If
fill() is called later and the fill character wasn't set by init, we can
get it from the stream's current locale at the point when fill() is
called (and not try to cache it without synchronization).

This causes a change in behaviour for the following program:

  std::ostringstream out;
  out.imbue(loc);
  auto fill = out.fill();

Previously the fill character would have been set when fill() is called,
and so would have used the new locale. This commit changes it so that
the fill character is set on construction and isn't affected by the new
locale being imbued later. This new behaviour seems to be what the
standard requires, and matches MSVC.

The new 27_io/basic_ios/fill/char/fill.cc test verifies that it's still
possible to use a std::basic_ios without the ctype<char_type> facet
being present at construction.

libstdc++-v3/ChangeLog:

	PR libstdc++/77704
	* include/bits/basic_ios.h (basic_ios::fill()): Do not modify
	_M_fill and _M_fill_init in a const member function.
	(basic_ios::fill(char_type)): Use _M_fill directly instead of
	calling fill(). Set _M_fill_init to true.
	* include/bits/basic_ios.tcc (basic_ios::init): Set _M_fill and
	_M_fill_init here instead.
	* testsuite/27_io/basic_ios/fill/char/1.cc: New test.
	* testsuite/27_io/basic_ios/fill/wchar_t/1.cc: New test.
---
 libstdc++-v3/include/bits/basic_ios.h         | 10 +--
 libstdc++-v3/include/bits/basic_ios.tcc       | 15 +++-
 .../testsuite/27_io/basic_ios/fill/char/1.cc  | 78 +++++++++++++++++++
 .../27_io/basic_ios/fill/wchar_t/1.cc         | 55 +++++++++++++
 4 files changed, 148 insertions(+), 10 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc

diff --git a/libstdc++-v3/include/bits/basic_ios.h b/libstdc++-v3/include/bits/basic_ios.h
index 258e6042b8f..bc3be4d2e37 100644
--- a/libstdc++-v3/include/bits/basic_ios.h
+++ b/libstdc++-v3/include/bits/basic_ios.h
@@ -373,11 +373,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char_type
       fill() const
       {
-	if (!_M_fill_init)
-	  {
-	    _M_fill = this->widen(' ');
-	    _M_fill_init = true;
-	  }
+	if (__builtin_expect(!_M_fill_init, false))
+	  return this->widen(' ');
 	return _M_fill;
       }
 
@@ -393,8 +390,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char_type
       fill(char_type __ch)
       {
-	char_type __old = this->fill();
+	char_type __old = _M_fill;
 	_M_fill = __ch;
+	_M_fill_init = true;
 	return __old;
       }
 
diff --git a/libstdc++-v3/include/bits/basic_ios.tcc b/libstdc++-v3/include/bits/basic_ios.tcc
index a9313736e32..0197bdf8f67 100644
--- a/libstdc++-v3/include/bits/basic_ios.tcc
+++ b/libstdc++-v3/include/bits/basic_ios.tcc
@@ -138,13 +138,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       // return without throwing an exception. Unfortunately,
       // ctype<char_type> is not necessarily a required facet, so
       // streams with char_type != [char, wchar_t] will not have it by
-      // default. Because of this, the correct value for _M_fill is
-      // constructed on the first call of fill(). That way,
+      // default. If the ctype<char_type> facet is available now,
+      // _M_fill is set here, but otherwise no fill character will be
+      // cached and a call to fill() will check for the facet again later
+      // (and will throw if the facet is still not present). This way
       // unformatted input and output with non-required basic_ios
       // instantiations is possible even without imbuing the expected
       // ctype<char_type> facet.
-      _M_fill = _CharT();
-      _M_fill_init = false;
+      if (_M_ctype)
+	{
+	  _M_fill = _M_ctype->widen(' ');
+	  _M_fill_init = true;
+	}
+      else
+	_M_fill_init = false;
 
       _M_tie = 0;
       _M_exception = goodbit;
diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
new file mode 100644
index 00000000000..d5747c7507f
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
@@ -0,0 +1,78 @@
+// { dg-do run }
+
+#include <ios>
+#include <locale>
+#include <streambuf>
+#include <testsuite_hooks.h>
+
+typedef char C;
+
+struct tabby_mctype : std::ctype<C>
+{
+  C do_widen(char c) const { return c == ' ' ? '\t' : c; }
+
+  const char*
+  do_widen(const char* lo, const char* hi, C* to) const
+  {
+    while (lo != hi)
+      *to++ = do_widen(*lo++);
+    return hi;
+  }
+};
+
+void
+test01()
+{
+  std::basic_ios<C> out(0);
+  std::locale loc(std::locale(), new tabby_mctype);
+  out.imbue(loc);
+  VERIFY( out.fill() == ' ' ); // Imbuing a new locale doesn't affect fill().
+  out.fill('*');
+  VERIFY( out.fill() == '*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test02()
+{
+  std::locale loc(std::locale(), new tabby_mctype);
+  std::locale::global(loc);
+  std::basic_ios<C> out(0);
+  VERIFY( out.fill() == '\t' );
+  out.imbue(std::locale::classic());
+  VERIFY( out.fill() == '\t' ); // Imbuing a new locale doesn't affect fill().
+  out.fill('*');
+  VERIFY( out.fill() == '*' );  // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' );  // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test03()
+{
+  // This function tests a libstdc++ extension: if no ctype<char_type> facet
+  // is present when the stream is initialized, a fill character will not be
+  // cached. Calling fill() will obtain a fill character from the locale each
+  // time it's called.
+  typedef signed char C2;
+  std::basic_ios<C2> out(0);
+#if __cpp_exceptions
+  try {
+    (void) out.fill(); // No ctype<signed char> in the locale.
+    VERIFY( false );
+  } catch (...) {
+  }
+#endif
+  out.fill('*');
+  VERIFY( out.fill() == '*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+int main()
+{
+  test01();
+  test02();
+  test03();
+}
diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
new file mode 100644
index 00000000000..2d639a0844d
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
@@ -0,0 +1,55 @@
+// { dg-do run }
+
+#include <ios>
+#include <locale>
+#include <streambuf>
+#include <testsuite_hooks.h>
+
+typedef wchar_t C;
+
+struct tabby_mctype : std::ctype<C>
+{
+  C do_widen(char c) const { return c == ' ' ? L'\t' : c; }
+
+  const char*
+  do_widen(const char* lo, const char* hi, C* to) const
+  {
+    while (lo != hi)
+      *to++ = do_widen(*lo++);
+    return hi;
+  }
+};
+
+void
+test01()
+{
+  std::basic_ios<C> out(0);
+  std::locale loc(std::locale(), new tabby_mctype);
+  out.imbue(loc);
+  VERIFY( out.fill() == L' ' ); // Imbuing a new locale doesn't affect fill().
+  out.fill(L'*');
+  VERIFY( out.fill() == L'*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == L'*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test02()
+{
+  std::locale loc(std::locale(), new tabby_mctype);
+  std::locale::global(loc);
+  std::basic_ios<C> out(0);
+  VERIFY( out.fill() == L'\t' );
+  out.imbue(std::locale::classic());
+  VERIFY( out.fill() == L'\t' ); // Imbuing a new locale doesn't affect fill().
+  out.fill(L'*');
+  VERIFY( out.fill() == L'*' );  // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == L'*' );  // Imbuing a new locale doesn't affect fill().
+}
+
+int main()
+{
+  test01();
+  test02();
+}
-- 
2.44.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 2/2] libstdc++: Fix data races in std::ctype<char> [PR77704]
  2024-05-07 13:52 [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704] Jonathan Wakely
@ 2024-05-07 13:52 ` Jonathan Wakely
  2024-05-15 11:22 ` [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704] Jonathan Wakely
  1 sibling, 0 replies; 3+ messages in thread
From: Jonathan Wakely @ 2024-05-07 13:52 UTC (permalink / raw)
  To: libstdc++, gcc-patches

Tested x86_64-linux. This one is less "obviously correct", as calling
the single-character narrow(char, char) overload no longer lazily
populates individual characters in the cache (because doing that is
racy). And the single-character widen(char) no longer calls
_M_wide_init() to populate the whole widening cache.

The current code definitely has a data race, i.e. undefined behaviour,
so we need to do _something_. But maybe not this. Maybe it would be
better to keep calling _M_widen_init() from widen(char), so that
iostream construction will fill the cache on the first call to the
global locale's widen(' '), and then be faster after that (which is the
current behaviour). Maybe we want to add that to narrow(char, char) too
(which is not the current behaviour).

I raised the question on the LWG list of whether it's OK for calls to
ctype::narrow(char, char) to result in calls to the virtual function
ctype::do_narrow(const char*, const char*, char, char*), and for calls
to ctype::narrow(const char*, const char*, char, char*) to result in
calls to the virtual function ctype::do_narrow(char, char). If that
isn't OK then our entire caching scheme in std::ctype<char> is not
allowed, and we'd need to ensure each call to narrow results in exactly
one call to the corresponding do_narrow, and nothing else.

-- >8 --

The std::ctype<char> specialization uses mutable data members to cache
the results of do_narrow calls, to avoid virtual calls.  However, the
accesses to those mutable members are not synchronized and so there are
data races when using the facet in multiple threads.

This change ensures that the _M_narrow_ok status flag is only accessed
atomically, avoiding any races on that member.

The _M_narrow_init() member function is changed to use a mutex (with
double-checked locking), so that writing to the _M_narrow array only
happens in one thread. The function is rearranged so that the virtual
calls and comparing the arrays are done outside the critical section,
then all writes to member variables are done last, inside the critical
section.  Importantly, the _M_narrow_ok member is not set until after
the _M_narrow array has been populated.

The narrow(char, char) function will now only read from _M_narrow if
_M_narrow_ok is non-zero. This means that populating the array
happens-before reading from it. If the cache isn't available and a
virtual call to do_narrow(c, d) is needed, this function no longer
stores the result in the cache, because only _M_narrow_init() can write
to the cache now. This means that repeated calls to narrow(c, d) with
the same value of c will no longer avoid calling do_narrow(c, d). If
this impacts performance too significantly then we could make
narrow(char, char) call _M_narrow_init() to populate the cache, or just
call _M_narrow_init() on construction so the cache is always available.
In the current code widen(wchar_t) always calls _M_widen_init() to
populate that cache, but I've removed that call to be consistent with
narrow(char, char) which doesn't initialize the narrow cache. This will
impact std::basic_ios<C>::init (used when constructing any iostream
object) which calls widen(' ') on the global locale's std::ctype<C>
facet, so maybe we do want to warm up that cache still.

The narrow(const char*, const char*, char. char*) overload now re-checks
the _M_narrow_ok status flag after calling _M_narrow_init(), so that we
don't make an unnecessary virtual call if _M_narrow_init() set the
status flag to 1, meaning the base class version of do_narrow (using
memcpy) can be used. Reloading the status flag after calling
_M_narrow_init() can be a relaxed load, because _M_narrow_init() either
did a load with acquire ordering, or set the flag itself in the current
thread.

Similar changes are needed for the std::ctype<char>::widen members,
which are also defined in terms of mutable data members without
synchronization.

The 22_locale/ctype/narrow/char/19955.cc test needs to be fixed to work
with the new code, because it currently assumes that the library will
only use the array form of do_narrow, and the Ctype1::do_narrow override
is not idempotent.

libstdc++-v3/ChangeLog:

	PR libstdc++/77704
	* include/bits/locale_facets.h (ctype<char>::widen(char)): Check
	if cache is initialized before using it.
	(ctype<char>::narrow(char, char)): Likewise.
	(ctype<char>::widen(const char*, const char*, char, char*)):
	Check again if memcpy can be used after initializing the cache.
	(ctype<char>::narrow(const char*, const char*, char, char*)):
	Likewise.
	(ctype<char>::_M_narrow_cache_status(int)): New member function.
	(ctype<char>::_M_widen_cache_status(int)): New member function.
	* src/c++11/ctype.cc (ctype<char>::_M_narrow_init) [__GTHREADS]:
	Use atomics and a mutex to synchronize accesses to _M_narrow_ok
	and _M_narrow.
	(ctype<char>::_M_widen_init) [__GTHREADS]: Likewise.
	* testsuite/22_locale/ctype/narrow/char/19955.cc: Fix test
	facets so that the array form of do_narrow is equivalent to the
	non-array form.
---
 libstdc++-v3/include/bits/locale_facets.h     | 78 +++++++++++-----
 libstdc++-v3/src/c++11/ctype.cc               | 90 +++++++++++++------
 .../22_locale/ctype/narrow/char/19955.cc      | 26 +++---
 3 files changed, 134 insertions(+), 60 deletions(-)

diff --git a/libstdc++-v3/include/bits/locale_facets.h b/libstdc++-v3/include/bits/locale_facets.h
index 53bb108e3ea..d17b2e630f4 100644
--- a/libstdc++-v3/include/bits/locale_facets.h
+++ b/libstdc++-v3/include/bits/locale_facets.h
@@ -879,10 +879,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char_type
       widen(char __c) const
       {
-	if (_M_widen_ok)
+	if (_M_widen_cache_status(true))
 	  return _M_widen[static_cast<unsigned char>(__c)];
-	this->_M_widen_init();
-	return this->do_widen(__c);
+	else // Cache not initialized, make a virtual call.
+	  return this->do_widen(__c);
       }
 
       /**
@@ -906,15 +906,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       const char*
       widen(const char* __lo, const char* __hi, char_type* __to) const
       {
-	if (_M_widen_ok == 1)
+	char __cached = _M_widen_cache_status();
+	if (__builtin_expect(!__cached, false))
 	  {
-	    if (__builtin_expect(__hi != __lo, true))
-	      __builtin_memcpy(__to, __lo, __hi - __lo);
-	    return __hi;
+	    _M_widen_init();
+	    __cached = _M_widen_cache_status();
 	  }
-	if (!_M_widen_ok)
-	  _M_widen_init();
-	return this->do_widen(__lo, __hi, __to);
+
+	if (__builtin_expect(__cached == 1, true))
+	  return ctype<char>::do_widen(__lo, __hi, __to);
+	else // do_widen is not the identity function, make a virtual call.
+	  return this->do_widen(__lo, __hi, __to);
       }
 
       /**
@@ -938,12 +940,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char
       narrow(char_type __c, char __dfault) const
       {
-	if (_M_narrow[static_cast<unsigned char>(__c)])
+	if (_M_narrow_cache_status(true))
 	  return _M_narrow[static_cast<unsigned char>(__c)];
-	const char __t = do_narrow(__c, __dfault);
-	if (__t != __dfault)
-	  _M_narrow[static_cast<unsigned char>(__c)] = __t;
-	return __t;
+	else // Cache not initialized, make a virtual call.
+	  return do_narrow(__c, __dfault);
       }
 
       /**
@@ -972,15 +972,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       narrow(const char_type* __lo, const char_type* __hi,
 	     char __dfault, char* __to) const
       {
-	if (__builtin_expect(_M_narrow_ok == 1, true))
+	char __cached = _M_narrow_cache_status();
+	if (__builtin_expect(!__cached, false))
 	  {
-	    if (__builtin_expect(__hi != __lo, true))
-	      __builtin_memcpy(__to, __lo, __hi - __lo);
-	    return __hi;
+	    _M_narrow_init();
+	    __cached = _M_narrow_cache_status();
 	  }
-	if (!_M_narrow_ok)
-	  _M_narrow_init();
-	return this->do_narrow(__lo, __hi, __dfault, __to);
+
+	if (__builtin_expect(__cached == 1, true))
+	  return ctype<char>::do_narrow(__lo, __hi, __dfault, __to);
+	else // do_narrow is not the identity function, make a virtual call.
+	  return this->do_narrow(__lo, __hi, __dfault, __to);
       }
 
       // _GLIBCXX_RESOLVE_LIB_DEFECTS
@@ -994,8 +996,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       /// Returns a pointer to the C locale mask table.
       static const mask*
       classic_table() throw();
-    protected:
 
+    protected:
       /**
        *  @brief  Destructor.
        *
@@ -1176,6 +1178,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     private:
       void _M_narrow_init() const;
       void _M_widen_init() const;
+
+      // Atomically check the _M_narrow_ok data member.
+      // Returns 0 if the _M_narrow[] cache has not been initialized.
+      // Returns 1 if do_narrow(c, d) == c for all c, which means that the
+      // array form of do_narrow(lo, hi, d, to) is equivalent to memcpy.
+      // Returns 2 otherwise, which means that the cache is initialized,
+      // but the array form of do_narrow cannot use memcpy.
+      char
+      _M_narrow_cache_status(bool __acq __attribute__((__unused__)) = 0) const
+      {
+#if __GTHREADS
+	return __atomic_load_n(&_M_narrow_ok,
+			       __acq ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED);
+#else
+	return _M_narrow_ok;
+#endif
+      }
+
+      // Atomically check the _M_widen_ok data member.
+      // Semantics are the same as _M_narrow_cache_status().
+      char
+      _M_widen_cache_status(bool __acq __attribute__((__unused__)) = 0) const
+      {
+#if __GTHREADS
+	return __atomic_load_n(&_M_widen_ok,
+			       __acq ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED);
+#else
+	return _M_widen_ok;
+#endif
+      }
     };
 
 #ifdef _GLIBCXX_USE_WCHAR_T
diff --git a/libstdc++-v3/src/c++11/ctype.cc b/libstdc++-v3/src/c++11/ctype.cc
index dfc27227110..d866ebb21a7 100644
--- a/libstdc++-v3/src/c++11/ctype.cc
+++ b/libstdc++-v3/src/c++11/ctype.cc
@@ -23,6 +23,7 @@
 #include <locale>
 #include <cstdlib>
 #include <cstring>
+#include <mutex>
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -58,45 +59,80 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       delete[] this->table();
   }
 
-  // Fill in the narrowing cache and flag whether all values are
-  // valid or not.  _M_narrow_ok is set to 2 if memcpy can't
-  // be used.
+#if __GTHREADS
+  static std::mutex cache_mtx;
+#endif
+
+  // Fill in the narrowing cache and status flag.
+  // _M_narrow_ok is set to 1 if do_narrow is an identity transformation,
+  // so that memcpy(to, lo, hi-lo) can be used of do_narrow(lo, hi, d, to).
+  // Otherwise, _M_narrow_ok is set to 2 and memcpy cannot be used.
+  // In either case, when _M_narrow_ok is non-zero it means that _M_narrow[c]
+  // can be used by instead of calling do_narrow(c, d).
   void
   ctype<char>::
   _M_narrow_init() const
   {
-    char __tmp[sizeof(_M_narrow)];
-    for (size_t __i = 0; __i < sizeof(_M_narrow); ++__i)
-      __tmp[__i] = __i;
-    do_narrow(__tmp, __tmp + sizeof(__tmp), 0, _M_narrow);
+    if (_M_narrow_cache_status(true))
+      return;
 
-    _M_narrow_ok = 1;
-    if (__builtin_memcmp(__tmp, _M_narrow, sizeof(_M_narrow)))
-      _M_narrow_ok = 2;
-    else
-      {
-	// Deal with the special case of zero: renarrow with a
-	// different default and compare.
-	char __c;
-	do_narrow(__tmp, __tmp + 1, 1, &__c);
-	if (__c == 1)
-	  _M_narrow_ok = 2;
-      }
+    constexpr size_t N = sizeof(_M_narrow);
+
+    char noconv[N];
+    for (size_t i = 0; i < N; ++i)
+      noconv[i] = i;
+
+    char result[N];
+    result[0] = do_narrow(char(0), char(1));
+    do_narrow(noconv + 1, noconv + N, char(0), result + 1);
+
+    char narrow_ok = 1;
+    if (__builtin_memcmp(noconv, result, N))
+      narrow_ok = 2; // do_narrow(c, d) != c for some values of c.
+
+#if __GTHREADS
+    lock_guard<mutex> l(cache_mtx);
+    if (_M_narrow_cache_status())
+      return;
+    __builtin_memcpy(_M_narrow, result, N);
+    __atomic_store_n(&_M_narrow_ok, narrow_ok, __ATOMIC_RELEASE);
+#else
+    __builtin_memcpy(_M_narrow, result, N);
+    _M_narrow_ok = narrow_ok;
+#endif
   }
 
+  // See comment on _M_narrow_init.
   void
   ctype<char>::
   _M_widen_init() const
   {
-    char __tmp[sizeof(_M_widen)];
-    for (size_t __i = 0; __i < sizeof(_M_widen); ++__i)
-      __tmp[__i] = __i;
-    do_widen(__tmp, __tmp + sizeof(__tmp), _M_widen);
+    if (_M_widen_cache_status(true))
+      return;
 
-    _M_widen_ok = 1;
-    // Set _M_widen_ok to 2 if memcpy can't be used.
-    if (__builtin_memcmp(__tmp, _M_widen, sizeof(_M_widen)))
-      _M_widen_ok = 2;
+    constexpr size_t N = sizeof(_M_widen);
+
+    char noconv[N];
+    for (size_t i = 0; i < N; ++i)
+      noconv[i] = i;
+
+    char result[N];
+    do_widen(noconv, noconv + N, result);
+
+    char widen_ok = 1;
+    if (__builtin_memcmp(noconv, result, N))
+      widen_ok = 2; // do_widen(c) != c for some values of c.
+
+#if __GTHREADS
+    lock_guard<mutex> l(cache_mtx);
+    if (_M_widen_cache_status())
+      return;
+    __builtin_memcpy(_M_widen, result, N);
+    __atomic_store_n(&_M_widen_ok, widen_ok, __ATOMIC_RELEASE);
+#else
+    __builtin_memcpy(_M_widen, result, N);
+    _M_widen_ok = widen_ok;
+#endif
   }
 
 #ifdef _GLIBCXX_USE_WCHAR_T
diff --git a/libstdc++-v3/testsuite/22_locale/ctype/narrow/char/19955.cc b/libstdc++-v3/testsuite/22_locale/ctype/narrow/char/19955.cc
index ecbfee5e576..9834b70f47a 100644
--- a/libstdc++-v3/testsuite/22_locale/ctype/narrow/char/19955.cc
+++ b/libstdc++-v3/testsuite/22_locale/ctype/narrow/char/19955.cc
@@ -26,12 +26,16 @@ class Ctype1
 : public std::ctype<char> 
 {
 protected:
+  char
+  do_narrow(char c, char) const
+  { return ~c; }
+
   const char*
   do_narrow(const char* lo, const char* hi,
-	    char, char* to) const 
+	    char dflt, char* to) const
   {
-    for (int i = 0; lo != hi; ++lo, ++to, ++i)
-      *to = *lo + i;
+    while (lo != hi)
+      *to++ = do_narrow(*lo++, dflt);
     return hi;
   }
 };
@@ -40,15 +44,16 @@ class Ctype2
 : public std::ctype<char> 
 {
 protected:
+  char
+  do_narrow(char c, char dflt) const
+  { return c == '\000' ? dflt : c; }
+
   const char*
   do_narrow(const char* lo, const char* hi,
-	    char dflt, char* to) const 
+	    char dflt, char* to) const
   {
-    for (int i = 0; lo != hi; ++lo, ++to, ++i)
-      if (*lo == '\000')
-	*to = dflt;
-      else
-	*to = *lo;
+    while (lo != hi)
+      *to++ = do_narrow(*lo++, dflt);
     return hi;
   }
 };
@@ -71,7 +76,8 @@ void test01()
   mc1.narrow(src, src + sizeof(src), '*', dst1);
   mc1.narrow(src, src + sizeof(src), '*', dst2);
 
-  VERIFY( !memcmp(dst1, "aceg\004", 5) );
+  const char expected[] = { ~'a', ~'b', ~'c', ~'d', ~0 };
+  VERIFY( !memcmp(dst1, expected, 5) );
   VERIFY( !memcmp(dst1, dst2, 5) );
 
   locale mylocale2(locale::classic(), new Ctype2);
-- 
2.44.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704]
  2024-05-07 13:52 [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704] Jonathan Wakely
  2024-05-07 13:52 ` [PATCH 2/2] libstdc++: Fix data races in std::ctype<char> [PR77704] Jonathan Wakely
@ 2024-05-15 11:22 ` Jonathan Wakely
  1 sibling, 0 replies; 3+ messages in thread
From: Jonathan Wakely @ 2024-05-15 11:22 UTC (permalink / raw)
  To: libstdc++, gcc-patches

Pushed to trunk.

On Tue, 7 May 2024 at 15:04, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> Tested x86_64-linux. This seems "obviously correct", and I'd like to
> push it. The current code definitely has a data race, i.e. undefined
> behaviour.
>
> -- >8 --
>
> The lazy caching in std::basic_ios::fill() updates a mutable member
> without synchronization, which can cause a data race if two threads both
> call fill() on the same stream object when _M_fill_init is false.
>
> To avoid this we can just cache the _M_fill member and set _M_fill_init
> early in std::basic_ios::init, instead of doing it lazily. As explained
> by the comment in init, there's a good reason for doing it lazily. When
> char_type is neither char nor wchar_t, the locale might not have a
> std::ctype<char_type>, so getting the fill character would throw an
> exception. The current lazy init allows using unformatted I/O with such
> a stream, because the fill character is never needed and so it doesn't
> matter if the locale doesn't have a ctype<char_type> facet. We can
> maintain this property by only setting the fill character in
> std::basic_ios::init if the ctype facet is present at that time. If
> fill() is called later and the fill character wasn't set by init, we can
> get it from the stream's current locale at the point when fill() is
> called (and not try to cache it without synchronization).
>
> This causes a change in behaviour for the following program:
>
>   std::ostringstream out;
>   out.imbue(loc);
>   auto fill = out.fill();
>
> Previously the fill character would have been set when fill() is called,
> and so would have used the new locale. This commit changes it so that
> the fill character is set on construction and isn't affected by the new
> locale being imbued later. This new behaviour seems to be what the
> standard requires, and matches MSVC.
>
> The new 27_io/basic_ios/fill/char/fill.cc test verifies that it's still
> possible to use a std::basic_ios without the ctype<char_type> facet
> being present at construction.
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/77704
>         * include/bits/basic_ios.h (basic_ios::fill()): Do not modify
>         _M_fill and _M_fill_init in a const member function.
>         (basic_ios::fill(char_type)): Use _M_fill directly instead of
>         calling fill(). Set _M_fill_init to true.
>         * include/bits/basic_ios.tcc (basic_ios::init): Set _M_fill and
>         _M_fill_init here instead.
>         * testsuite/27_io/basic_ios/fill/char/1.cc: New test.
>         * testsuite/27_io/basic_ios/fill/wchar_t/1.cc: New test.
> ---
>  libstdc++-v3/include/bits/basic_ios.h         | 10 +--
>  libstdc++-v3/include/bits/basic_ios.tcc       | 15 +++-
>  .../testsuite/27_io/basic_ios/fill/char/1.cc  | 78 +++++++++++++++++++
>  .../27_io/basic_ios/fill/wchar_t/1.cc         | 55 +++++++++++++
>  4 files changed, 148 insertions(+), 10 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
>  create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
>
> diff --git a/libstdc++-v3/include/bits/basic_ios.h b/libstdc++-v3/include/bits/basic_ios.h
> index 258e6042b8f..bc3be4d2e37 100644
> --- a/libstdc++-v3/include/bits/basic_ios.h
> +++ b/libstdc++-v3/include/bits/basic_ios.h
> @@ -373,11 +373,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        char_type
>        fill() const
>        {
> -       if (!_M_fill_init)
> -         {
> -           _M_fill = this->widen(' ');
> -           _M_fill_init = true;
> -         }
> +       if (__builtin_expect(!_M_fill_init, false))
> +         return this->widen(' ');
>         return _M_fill;
>        }
>
> @@ -393,8 +390,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        char_type
>        fill(char_type __ch)
>        {
> -       char_type __old = this->fill();
> +       char_type __old = _M_fill;
>         _M_fill = __ch;
> +       _M_fill_init = true;
>         return __old;
>        }
>
> diff --git a/libstdc++-v3/include/bits/basic_ios.tcc b/libstdc++-v3/include/bits/basic_ios.tcc
> index a9313736e32..0197bdf8f67 100644
> --- a/libstdc++-v3/include/bits/basic_ios.tcc
> +++ b/libstdc++-v3/include/bits/basic_ios.tcc
> @@ -138,13 +138,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        // return without throwing an exception. Unfortunately,
>        // ctype<char_type> is not necessarily a required facet, so
>        // streams with char_type != [char, wchar_t] will not have it by
> -      // default. Because of this, the correct value for _M_fill is
> -      // constructed on the first call of fill(). That way,
> +      // default. If the ctype<char_type> facet is available now,
> +      // _M_fill is set here, but otherwise no fill character will be
> +      // cached and a call to fill() will check for the facet again later
> +      // (and will throw if the facet is still not present). This way
>        // unformatted input and output with non-required basic_ios
>        // instantiations is possible even without imbuing the expected
>        // ctype<char_type> facet.
> -      _M_fill = _CharT();
> -      _M_fill_init = false;
> +      if (_M_ctype)
> +       {
> +         _M_fill = _M_ctype->widen(' ');
> +         _M_fill_init = true;
> +       }
> +      else
> +       _M_fill_init = false;
>
>        _M_tie = 0;
>        _M_exception = goodbit;
> diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
> new file mode 100644
> index 00000000000..d5747c7507f
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
> @@ -0,0 +1,78 @@
> +// { dg-do run }
> +
> +#include <ios>
> +#include <locale>
> +#include <streambuf>
> +#include <testsuite_hooks.h>
> +
> +typedef char C;
> +
> +struct tabby_mctype : std::ctype<C>
> +{
> +  C do_widen(char c) const { return c == ' ' ? '\t' : c; }
> +
> +  const char*
> +  do_widen(const char* lo, const char* hi, C* to) const
> +  {
> +    while (lo != hi)
> +      *to++ = do_widen(*lo++);
> +    return hi;
> +  }
> +};
> +
> +void
> +test01()
> +{
> +  std::basic_ios<C> out(0);
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  out.imbue(loc);
> +  VERIFY( out.fill() == ' ' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill('*');
> +  VERIFY( out.fill() == '*' ); // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
> +}
> +
> +void
> +test02()
> +{
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  std::locale::global(loc);
> +  std::basic_ios<C> out(0);
> +  VERIFY( out.fill() == '\t' );
> +  out.imbue(std::locale::classic());
> +  VERIFY( out.fill() == '\t' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill('*');
> +  VERIFY( out.fill() == '*' );  // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == '*' );  // Imbuing a new locale doesn't affect fill().
> +}
> +
> +void
> +test03()
> +{
> +  // This function tests a libstdc++ extension: if no ctype<char_type> facet
> +  // is present when the stream is initialized, a fill character will not be
> +  // cached. Calling fill() will obtain a fill character from the locale each
> +  // time it's called.
> +  typedef signed char C2;
> +  std::basic_ios<C2> out(0);
> +#if __cpp_exceptions
> +  try {
> +    (void) out.fill(); // No ctype<signed char> in the locale.
> +    VERIFY( false );
> +  } catch (...) {
> +  }
> +#endif
> +  out.fill('*');
> +  VERIFY( out.fill() == '*' ); // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
> +}
> +
> +int main()
> +{
> +  test01();
> +  test02();
> +  test03();
> +}
> diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
> new file mode 100644
> index 00000000000..2d639a0844d
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
> @@ -0,0 +1,55 @@
> +// { dg-do run }
> +
> +#include <ios>
> +#include <locale>
> +#include <streambuf>
> +#include <testsuite_hooks.h>
> +
> +typedef wchar_t C;
> +
> +struct tabby_mctype : std::ctype<C>
> +{
> +  C do_widen(char c) const { return c == ' ' ? L'\t' : c; }
> +
> +  const char*
> +  do_widen(const char* lo, const char* hi, C* to) const
> +  {
> +    while (lo != hi)
> +      *to++ = do_widen(*lo++);
> +    return hi;
> +  }
> +};
> +
> +void
> +test01()
> +{
> +  std::basic_ios<C> out(0);
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  out.imbue(loc);
> +  VERIFY( out.fill() == L' ' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill(L'*');
> +  VERIFY( out.fill() == L'*' ); // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == L'*' ); // Imbuing a new locale doesn't affect fill().
> +}
> +
> +void
> +test02()
> +{
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  std::locale::global(loc);
> +  std::basic_ios<C> out(0);
> +  VERIFY( out.fill() == L'\t' );
> +  out.imbue(std::locale::classic());
> +  VERIFY( out.fill() == L'\t' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill(L'*');
> +  VERIFY( out.fill() == L'*' );  // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == L'*' );  // Imbuing a new locale doesn't affect fill().
> +}
> +
> +int main()
> +{
> +  test01();
> +  test02();
> +}
> --
> 2.44.0
>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-05-15 11:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-07 13:52 [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704] Jonathan Wakely
2024-05-07 13:52 ` [PATCH 2/2] libstdc++: Fix data races in std::ctype<char> [PR77704] Jonathan Wakely
2024-05-15 11:22 ` [PATCH 1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704] Jonathan Wakely

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).