public inbox for libstdc++@gcc.gnu.org
* [committed] libstdc++: Optimize ref-count updates in COW std::string
@ 2021-12-01 15:08 Jonathan Wakely
  2021-12-01 18:16 ` Florian Weimer
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Wakely @ 2021-12-01 15:08 UTC
  To: libstdc++, gcc-patches

Tested x86_64-linux, pushed to trunk.


Most ref-count updates in the COW string are done via the functions in
<ext/atomicity.h>, which will use non-atomic ops when the program is
known to be single-threaded. The _M_is_leaked() and _M_is_shared()
functions use __atomic_load_n directly, because <ext/atomicity.h>
doesn't provide a load operation. Those functions can check the
__is_single_threaded() predicate to avoid using __atomic_load_n when not
needed.
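
As a rough illustration of the pattern described above, here is a minimal
sketch, not the libstdc++ code. `Rep`, `g_single_threaded` and
`is_single_threaded()` are hypothetical stand-ins for `_Rep` and
`__gnu_cxx::__is_single_threaded()`; the `__atomic_load_n` builtin is assumed
to be available (GCC or Clang).

```cpp
#include <cassert>

// Hypothetical stand-in for __gnu_cxx::__is_single_threaded(); the real
// predicate lives in <ext/atomicity.h> and is not reproduced here.
static bool g_single_threaded = true;
inline bool is_single_threaded() { return g_single_threaded; }

// Toy model of the COW string's _Rep; only the ref-count is represented.
struct Rep {
  int refcount = 0;  // -1: leaked, 0: exactly one owner, >0: shared

  bool is_shared() const {
    // Possibly multi-threaded: another thread may drop its reference
    // concurrently, so use an acquire load.
    if (!is_single_threaded())
      return __atomic_load_n(&refcount, __ATOMIC_ACQUIRE) > 0;
    // Known single-threaded: a plain load is sufficient.
    return refcount > 0;
  }
};
```

Both paths return the same answer; the predicate only decides whether the
load needs acquire ordering.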

The move constructor for the fully-dynamic-string increments the
ref-count by either 2 or 1, for leaked or non-leaked strings
respectively. That can be changed to use a non-atomic store of 1 for all
non-shared strings. It can be non-atomic because even if the program is
multi-threaded, conflicting access to the rvalue object while it's being
moved from would be a data race anyway. It can store 1 directly for all
non-shared strings because it doesn't matter whether the initial
refcount was -1 or 0; it should be 1 after the move constructor creates
a second owner.
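
The ref-count transition described above can be sketched as follows. This is
a toy model under stated assumptions: `Rep` and `add_owner_on_move()` are
hypothetical names, and `__atomic_fetch_add` stands in for
`__gnu_cxx::__atomic_add_dispatch`, which is not reproduced here.

```cpp
#include <cassert>

// Toy model of the COW string's _Rep; only the ref-count is represented.
struct Rep {
  int refcount = 0;  // -1: leaked, 0: exactly one owner, >0: shared

  bool is_shared() const {
    return __atomic_load_n(&refcount, __ATOMIC_ACQUIRE) > 0;
  }
};

// What a move constructor that shares the rvalue's buffer would do to the
// ref-count after copying the rvalue's pointer (hypothetical helper).
inline void add_owner_on_move(Rep& rep) {
  if (rep.is_shared())
    // Other owners exist and may touch the count concurrently.
    __atomic_fetch_add(&rep.refcount, 1, __ATOMIC_ACQ_REL);
  else
    // The rvalue was the only owner (count -1 or 0), so nobody else can
    // legitimately access the count: a plain store of 1 is enough.
    rep.refcount = 1;
}
```

Starting from -1 or 0 the count ends at 1 (two owners); starting from a
positive count it is incremented by one.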

libstdc++-v3/ChangeLog:

	* include/bits/cow_string.h (basic_string::_M_is_leaked): Use
	non-atomic load when __is_single_threaded() is true.
	(basic_string::_M_is_shared): Likewise.
	(basic_string::(basic_string&&)) [_GLIBCXX_FULLY_DYNAMIC_STRING]:
	Use non-atomic store when rvalue is not shared.
---
 libstdc++-v3/include/bits/cow_string.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h
index ced395b80b8..4fae1d02981 100644
--- a/libstdc++-v3/include/bits/cow_string.h
+++ b/libstdc++-v3/include/bits/cow_string.h
@@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    *  destroy the empty-string _Rep object.
    *
    *  All but the last paragraph is considered pretty conventional
-   *  for a C++ string implementation.
+   *  for a Copy-On-Write C++ string implementation.
   */
   // 21.3  Template class basic_string
   template<typename _CharT, typename _Traits, typename _Alloc>
@@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  // so we need to use an atomic load. However, _M_is_leaked
 	  // predicate does not change concurrently (i.e. the string is either
 	  // leaked or not), so a relaxed load is enough.
-	  return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
-#else
-	  return this->_M_refcount < 0;
+	  if (!__gnu_cxx::__is_single_threaded())
+	    return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
 #endif
+	  return this->_M_refcount < 0;
 	}
 
 	bool
@@ -222,10 +222,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  // but one reference concurrently with this check, so we need this
 	  // load to be acquire to synchronize with release fetch_and_add in
 	  // _M_dispose.
-	  return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0;
-#else
-	  return this->_M_refcount > 0;
+	  if (!__gnu_cxx::__is_single_threaded())
+	    return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0;
 #endif
+	  return this->_M_refcount > 0;
 	}
 
 	void
@@ -629,12 +629,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #else
 	// Rather than allocate an empty string for the rvalue string,
 	// just share ownership with it by incrementing the reference count.
-	// If the rvalue string was "leaked" then it was the unique owner,
-	// so need an extra increment to indicate shared ownership.
-	if (_M_rep()->_M_is_leaked())
-	  __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 2);
-	else
+	// If the rvalue string was the unique owner then there are exactly
+	// two owners now.
+	if (_M_rep()->_M_is_shared())
 	  __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1);
+	else
+	  _M_rep()->_M_refcount = 1;
 #endif
       }
 
-- 
2.31.1



* Re: [committed] libstdc++: Optimize ref-count updates in COW std::string
  2021-12-01 15:08 [committed] libstdc++: Optimize ref-count updates in COW std::string Jonathan Wakely
@ 2021-12-01 18:16 ` Florian Weimer
  2021-12-01 18:24   ` Jonathan Wakely
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Weimer @ 2021-12-01 18:16 UTC
  To: Jonathan Wakely via Libstdc++; +Cc: gcc-patches, Jonathan Wakely

* Jonathan Wakely via Libstdc++:

> diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h
> index ced395b80b8..4fae1d02981 100644
> --- a/libstdc++-v3/include/bits/cow_string.h
> +++ b/libstdc++-v3/include/bits/cow_string.h
> @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>     *  destroy the empty-string _Rep object.
>     *
>     *  All but the last paragraph is considered pretty conventional
> -   *  for a C++ string implementation.
> +   *  for a Copy-On-Write C++ string implementation.
>    */
>    // 21.3  Template class basic_string
>    template<typename _CharT, typename _Traits, typename _Alloc>
> @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	  // so we need to use an atomic load. However, _M_is_leaked
>  	  // predicate does not change concurrently (i.e. the string is either
>  	  // leaked or not), so a relaxed load is enough.
> -	  return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> -#else
> -	  return this->_M_refcount < 0;
> +	  if (!__gnu_cxx::__is_single_threaded())
> +	    return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
>  #endif
> +	  return this->_M_refcount < 0;
>  	}

Relaxed MO loads of word-size values on all current architectures only
have a compiler barrier, so I think the optimization makes things worse?
(I doubt the conditional lack of a compiler barrier leads to
optimization improvements elsewhere.)
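
To make the comparison concrete, here is a minimal sketch of the two variants
under discussion. The function names are hypothetical and the `single_threaded`
parameter stands in for the `__is_single_threaded()` check. On common targets
a relaxed load of an aligned `int` is expected to compile to an ordinary load
instruction, which can be checked by comparing the generated assembly; if so,
the second variant only adds a test and a branch.

```cpp
#include <cassert>

// Unconditional relaxed atomic load (the original _M_is_leaked logic).
inline bool is_leaked_relaxed(const int* refcount) {
  return __atomic_load_n(refcount, __ATOMIC_RELAXED) < 0;
}

// Variant that checks a single-threaded flag first and falls back to a
// plain load; single_threaded is a stand-in for the real predicate.
inline bool is_leaked_branchy(const int* refcount, bool single_threaded) {
  if (!single_threaded)
    return __atomic_load_n(refcount, __ATOMIC_RELAXED) < 0;
  return *refcount < 0;
}
```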

Thanks,
Florian



* Re: [committed] libstdc++: Optimize ref-count updates in COW std::string
  2021-12-01 18:16 ` Florian Weimer
@ 2021-12-01 18:24   ` Jonathan Wakely
  2021-12-02 16:55     ` [committed] libstdc++: Restore unconditional atomic load " Jonathan Wakely
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Wakely @ 2021-12-01 18:24 UTC
  To: Florian Weimer; +Cc: Jonathan Wakely via Libstdc++, gcc Patches

On Wed, 1 Dec 2021 at 18:16, Florian Weimer wrote:
>
> * Jonathan Wakely via Libstdc++:
>
> > diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h
> > index ced395b80b8..4fae1d02981 100644
> > --- a/libstdc++-v3/include/bits/cow_string.h
> > +++ b/libstdc++-v3/include/bits/cow_string.h
> > @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >     *  destroy the empty-string _Rep object.
> >     *
> >     *  All but the last paragraph is considered pretty conventional
> > -   *  for a C++ string implementation.
> > +   *  for a Copy-On-Write C++ string implementation.
> >    */
> >    // 21.3  Template class basic_string
> >    template<typename _CharT, typename _Traits, typename _Alloc>
> > @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >         // so we need to use an atomic load. However, _M_is_leaked
> >         // predicate does not change concurrently (i.e. the string is either
> >         // leaked or not), so a relaxed load is enough.
> > -       return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> > -#else
> > -       return this->_M_refcount < 0;
> > +       if (!__gnu_cxx::__is_single_threaded())
> > +         return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> >  #endif
> > +       return this->_M_refcount < 0;
> >       }
>
> Relaxed MO loads of word-size values on all current architectures only
> have a compiler barrier, so I think the optimization makes things worse?

Hmm, yes.

> (I doubt the conditional lack of a compiler barrier leads to
> optimization improvements elsewhere.)

Probably not. I'll revert the change to _M_is_leaked() and just keep
it for _M_is_shared().

Thanks for pointing that out.



* [committed] libstdc++: Restore unconditional atomic load in COW std::string
  2021-12-01 18:24   ` Jonathan Wakely
@ 2021-12-02 16:55     ` Jonathan Wakely
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Wakely @ 2021-12-02 16:55 UTC
  To: Florian Weimer; +Cc: Jonathan Wakely via Libstdc++, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 1919 bytes --]

On Wed, 1 Dec 2021 at 18:24, Jonathan Wakely wrote:
>
> On Wed, 1 Dec 2021 at 18:16, Florian Weimer wrote:
> >
> > * Jonathan Wakely via Libstdc++:
> >
> > > diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h
> > > index ced395b80b8..4fae1d02981 100644
> > > --- a/libstdc++-v3/include/bits/cow_string.h
> > > +++ b/libstdc++-v3/include/bits/cow_string.h
> > > @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >     *  destroy the empty-string _Rep object.
> > >     *
> > >     *  All but the last paragraph is considered pretty conventional
> > > -   *  for a C++ string implementation.
> > > +   *  for a Copy-On-Write C++ string implementation.
> > >    */
> > >    // 21.3  Template class basic_string
> > >    template<typename _CharT, typename _Traits, typename _Alloc>
> > > @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >         // so we need to use an atomic load. However, _M_is_leaked
> > >         // predicate does not change concurrently (i.e. the string is either
> > >         // leaked or not), so a relaxed load is enough.
> > > -       return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> > > -#else
> > > -       return this->_M_refcount < 0;
> > > +       if (!__gnu_cxx::__is_single_threaded())
> > > +         return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> > >  #endif
> > > +       return this->_M_refcount < 0;
> > >       }
> >
> > Relaxed MO loads of word-size values on all current architectures only
> > have a compiler barrier, so I think the optimization makes things worse?
>
> Hmm, yes.
>
> > (I doubt the conditional lack of a compiler barrier leads to
> > optimization improvements elsewhere.)
>
> Probably not. I'll revert the change to _M_is_leaked() and just keep
> it for _M_is_shared().
>
> Thanks for pointing that out.

Reverted by the attached patch, tested powerpc64le-linux.

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 1263 bytes --]

commit b5a568683f71b4a8b1e4e45a43484398e9a66ff2
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Dec 1 20:58:58 2021

    libstdc++: Restore unconditional atomic load in COW std::string
    
    The relaxed load is already optimal; checking the __single_threaded
    global before doing a non-atomic load isn't an optimization.
    
    libstdc++-v3/ChangeLog:
    
            * include/bits/cow_string.h (basic_string::_M_is_leaked()):
            Revert change to check __is_single_threaded() before using
            atomic load.

diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h
index d6ddf3489d1..389b39583e4 100644
--- a/libstdc++-v3/include/bits/cow_string.h
+++ b/libstdc++-v3/include/bits/cow_string.h
@@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  // so we need to use an atomic load. However, _M_is_leaked
 	  // predicate does not change concurrently (i.e. the string is either
 	  // leaked or not), so a relaxed load is enough.
-	  if (!__gnu_cxx::__is_single_threaded())
-	    return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
-#endif
+	  return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
+#else
 	  return this->_M_refcount < 0;
+#endif
 	}
 
 	bool

