public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jonathan Wakely <jwakely@redhat.com>
To: Jonathan Wakely <jwakely.gcc@gmail.com>
Cc: Nathaniel Shead <nathanieloshead@gmail.com>,
	Jason Merrill <jason@redhat.com>,
	 "libstdc++" <libstdc++@gcc.gnu.org>,
	gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] libstdc++: Ensure active union member is correctly set
Date: Sat, 21 Oct 2023 15:45:03 +0100	[thread overview]
Message-ID: <CACb0b4kcGnogY2ite0BLgd=t2W+BJ1+qVVQG=50mcpJAWKirhQ@mail.gmail.com> (raw)
In-Reply-To: <CAH6eHdQDTGTt72bHY8uEjFNRgbYEDzn9G=fuT2T0_Rxu9FKSRw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 11322 bytes --]

On Fri, 29 Sept 2023 at 17:46, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Fri, 29 Sept 2023 at 17:29, Nathaniel Shead
> <nathanieloshead@gmail.com> wrote:
> >
> > On Fri, Sep 29, 2023 at 04:06:33PM +0100, Jonathan Wakely wrote:
> > > On Fri, 29 Sept 2023 at 10:32, Jonathan Wakely <jwakely@redhat.com> wrote:
> > > > > Thanks for the comments, here's an updated version of the patch.
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu.
> > > >
> > > > Great, I'll get this committed today - thanks!
> > >
> > > That's done now.
> > >
> >
> > Thanks!
> >
> > > > >
> > > > > I'll note that there are some existing calls to `_M_use_local_data()`
> > > > > already used only for their side effects without a cast to void, e.g.
> > > > >
> > > > >       /**
> > > > >        *  @brief  Default constructor creates an empty string.
> > > > >        */
> > > > >       _GLIBCXX20_CONSTEXPR
> > > > >       basic_string()
> > > > >       _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
> > > > >       : _M_dataplus(_M_local_data())
> > > > >       {
> > > > >         _M_use_local_data();
> > > > >         _M_set_length(0);
> > > > >       }
> > > > >
> > > > > I haven't updated these, but should this be changed for consistency?
> > > >
> > > > Yes, good idea. I can do that.
> > >
> > > I started to do that, and decided it made more sense to split out the
> > > constexpr loop from _M_use_local_data() into a separate function,
> > > _M_init_local_buf(). Then we can use that for all the places where we
> > > don't care about the return value. That avoids the overhead of using
> > > pointer_traits::pointer_to when we don't need the return value (which
> > > is admittedly only going to be an issue for -O0 code, but I think it's
> > > cleaner this way anyway).
> > >
> > > Please see the attached patch and let me know what you think.
> >
> > I agree, and it also looks clearer to me what is happening.
>
> Good, I'll make this change next week then.
>
>
> >
> > >
> > > > Thanks again for fixing these. I think this might fix some bug reports
> > > > about clang rejecting our std::string in constant expressions, so I'll
> > > > check those.
> > >
> > > Your patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900
> > > (so we should backport it to gcc-13 and gcc-12 too).
> >
> > > commit 2668979d3206ff6c039ac0165aae29377a15666c
> > > Author: Jonathan Wakely <jwakely@redhat.com>
> > > Date:   Fri Sep 29 12:12:22 2023
> > >
> > >     libstdc++: Split std::basic_string::_M_use_local_data into two functions
> > >
> > >     This splits out the activate-the-union-member-for-constexpr logic from
> > >     _M_use_local_data, so that it can be used separately in cases that don't
> > >     need to use std::pointer_traits<pointer>::pointer_to to obtain the
> > >     return value.
> > >
> > >     This leaves only three uses of _M_use_local_data() which are all the
> > >     same form:
> > >
> > >       __s._M_data(_M_use_local_data());
> > >       __s._M_set_length(0);
> > >
> > >     We could remove _M_use_local_data() and change those three places to use
> > >     a new _M_reset() function that does:
> > >
> > >       _M_init_local_buf();
> > >       _M_data(_M_local_data());
> > >       _M_set_length(0);
> > >
> > >     This is left for a future change.
> > >
> > >     libstdc++-v3/ChangeLog:
> > >
> > >             * include/bits/basic_string.h (_M_init_local_buf()): New
> > >             function.
> > >             (_M_use_local_data()): Use _M_init_local_buf.
> > >             (basic_string(), basic_string(const Alloc&))
> > >             (basic_string(basic_string&&))
> > >             (basic_string(basic_string&&, const Alloc&)): Use
> > >             _M_init_local_buf instead of _M_use_local_data().
> > >             * include/bits/basic_string.tcc (swap(basic_string&))
> > >             (_M_construct(InIter, InIter, forward_iterator_tag))
> > >             (_M_construct(size_type, CharT), reserve()): Likewise.
> > >             (_M_construct(InIter, InIter, input_iterator_tag)): Remove call
> > >             to _M_use_local_data() and initialize the local buffer directly.
> > >
> > > diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
> > > index 4f94cd967cf..18a19b8dcbc 100644
> > > --- a/libstdc++-v3/include/bits/basic_string.h
> > > +++ b/libstdc++-v3/include/bits/basic_string.h
> > > @@ -353,13 +353,23 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
> > >        // Ensure that _M_local_buf is the active member of the union.
> > >        __attribute__((__always_inline__))
> > >        _GLIBCXX14_CONSTEXPR
> > > -      pointer
> > > -      _M_use_local_data() _GLIBCXX_NOEXCEPT
> > > +      void
> > > +      _M_init_local_buf() _GLIBCXX_NOEXCEPT
> > >        {
> > >  #if __cpp_lib_is_constant_evaluated
> > >       if (std::is_constant_evaluated())
> > >         for (size_type __i = 0; __i <= _S_local_capacity; ++__i)
> > >           _M_local_buf[__i] = _CharT();
> > > +#endif
> > > +      }
> > > +
> > > +      __attribute__((__always_inline__))
> > > +      _GLIBCXX14_CONSTEXPR
> > > +      pointer
> > > +      _M_use_local_data() _GLIBCXX_NOEXCEPT
> > > +      {
> > > +#if __glibcxx_is_constant_evaluated
> > > +     _M_init_local_buf();
> > >  #endif
> > >       return _M_local_data();
> > >        }
> >
> > What's the difference between __cpp_lib_is_constant_evaluated and
> > __glibcxx_is_constant_evaluated? Should these lines be using the same
> > macro here?
>
> Ah, yeah, they could be the same.
>
> The difference is that (for most feature test macros) the
> __glibcxx_ftm form is defined after including <bits/version.h>, but
> the __cpp_lib_ftm form is only defined only by headers that do:
>
> #define __glibcxx_want_ftm
> #include <bits/version.h>
>
> This means we can ensure that the __cpp_lib_ftm form is only defined
> by headers where we actually want to define it, e.g. in the headers
> that [version.syn] in the standard says should define the macro. So
> for our own internal uses, we should generally rely on the
> __glibcxx_ftm one. Users should rely on __cpp_lib_ftm after including
> the correct header. For example, __glibcxx_atomic_wait is defined
> after including <bits/atomic_wait.h> and so is available for our own
> uses, but __cpp_lib_atomic_wait is only defined after including
> <atomic>, not just <bits/atomic_wait.h>. Or at least, that's the plan
> - I have a pending patch to make everything I just said true :-)
> Currently we "leak" the __cpp_lib_ftm forms into lots of <bits/xxx.h>
> internal headers. That will change next week.
>
> In this specific case, it doesn't matter, because
> __cpp_lib_is_constant_evaluated is defined by <type_traits>, and every
> header includes that. So it doesn't really matter whether our internal
> uses are __cpp_lib_is_constant_evaluated or
> __glibcxx_is_constant_evaluated. But it would be good to be
> consistent.
>
> >
> > > @@ -522,7 +532,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
> > >        _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
> > >        : _M_dataplus(_M_local_data())
> > >        {
> > > -     _M_use_local_data();
> > > +     _M_init_local_buf();
> > >       _M_set_length(0);
> > >        }
> > >
> > > @@ -534,7 +544,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
> > >        basic_string(const _Alloc& __a) _GLIBCXX_NOEXCEPT
> > >        : _M_dataplus(_M_local_data(), __a)
> > >        {
> > > -     _M_use_local_data();
> > > +     _M_init_local_buf();
> > >       _M_set_length(0);
> > >        }
> > >
> > > @@ -678,7 +688,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
> > >        {
> > >       if (__str._M_is_local())
> > >         {
> > > -         (void)_M_use_local_data();
> > > +         _M_init_local_buf();
> > >           traits_type::copy(_M_local_buf, __str._M_local_buf,
> > >                             __str.length() + 1);
> > >         }
> > > @@ -718,7 +728,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
> > >        {
> > >       if (__str._M_is_local())
> > >         {
> > > -         (void)_M_use_local_data();
> > > +         _M_init_local_buf();
> > >           traits_type::copy(_M_local_buf, __str._M_local_buf,
> > >                             __str.length() + 1);
> > >           _M_length(__str.length());
> > > diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
> > > index 4bc98f2aea7..052eeb9e846 100644
> > > --- a/libstdc++-v3/include/bits/basic_string.tcc
> > > +++ b/libstdc++-v3/include/bits/basic_string.tcc
> > > @@ -79,7 +79,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >             }
> > >           else if (__s.length())
> > >             {
> > > -             (void)_M_use_local_data();
> > > +             _M_init_local_buf();
> > >               traits_type::copy(_M_local_buf, __s._M_local_buf,
> > >                                 __s.length() + 1);
> > >               _M_length(__s.length());
> > > @@ -88,7 +88,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >             }
> > >           else if (length())
> > >             {
> > > -             (void)__s._M_use_local_data();
> > > +             __s._M_init_local_buf();
> > >               traits_type::copy(__s._M_local_buf, _M_local_buf,
> > >                                 length() + 1);
> > >               __s._M_length(length());
> > > @@ -99,7 +99,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >       else
> > >         {
> > >           const size_type __tmp_capacity = __s._M_allocated_capacity;
> > > -         (void)__s._M_use_local_data();
> > > +         __s._M_init_local_buf();
> > >           traits_type::copy(__s._M_local_buf, _M_local_buf,
> > >                             length() + 1);
> > >           _M_data(__s._M_data());
> > > @@ -111,7 +111,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >         const size_type __tmp_capacity = _M_allocated_capacity;
> > >         if (__s._M_is_local())
> > >           {
> > > -           (void)_M_use_local_data();
> > > +           _M_init_local_buf();
> > >             traits_type::copy(_M_local_buf, __s._M_local_buf,
> > >                               __s.length() + 1);
> > >             __s._M_data(_M_data());
> > > @@ -174,14 +174,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >       size_type __len = 0;
> > >       size_type __capacity = size_type(_S_local_capacity);
> > >
> > > -     pointer __p = _M_use_local_data();
> > > -
> > >       while (__beg != __end && __len < __capacity)
> > >         {
> > > -         __p[__len++] = *__beg;
> > > +         _M_local_buf[__len++] = *__beg;
> > >           ++__beg;
> > >         }
> > >
> > > +#if __glibcxx_is_constant_evaluated
> > > +     if (std::is_constant_evaluated())
> > > +       for (size_type __i = __len; __i <= __capacity; ++__i)
> > > +         _M_local_buf[__i] = _CharT();
> > > +#endif
> > > +
> >
> > I wonder if maybe this should still be a call to `_M_init_local_buf()`
> > above, where the `_M_use_local_data()` used to be? That way the logic
> > stays in one place, and I don't imagine the compile time savings of not
> > immediately overwriting the first __len characters would be significant.
>
>
> Yeah, I went back and forth on that, but you're probably right. I'll
> do as you suggested.

Here's what I've pushed to trunk after testing on x86_64-linux.

Thanks again for the fixes.

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 6006 bytes --]

commit 405a4140fc30bce86b1ec885a98bb17704e0c8c6
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Fri Sep 29 12:12:22 2023

    libstdc++: Split std::basic_string::_M_use_local_data into two functions
    
    This splits out the activate-the-union-member-for-constexpr logic from
    _M_use_local_data, so that it can be used separately in cases that don't
    need to use std::pointer_traits<pointer>::pointer_to to obtain the
    return value.
    
    This leaves only three uses of _M_use_local_data() which are all of the
    same form:
    
      __s._M_data(_M_use_local_data());
      __s._M_set_length(0);
    
    We could remove _M_use_local_data() and change those three places to use
    a new _M_reset() function that does:
    
      _M_init_local_buf();
      _M_data(_M_local_data());
      _M_set_length(0);
    
    This is left for a future change.
    
    libstdc++-v3/ChangeLog:
    
            * include/bits/basic_string.h (_M_init_local_buf()): New
            function.
            (_M_use_local_data()): Use _M_init_local_buf.
            (basic_string(), basic_string(const Alloc&))
            (basic_string(basic_string&&))
            (basic_string(basic_string&&, const Alloc&)): Use
            _M_init_local_buf instead of _M_use_local_data().
            * include/bits/basic_string.tcc (swap(basic_string&))
            (_M_construct(InIter, InIter, input_iterator_tag))
            (_M_construct(InIter, InIter, forward_iterator_tag))
            (_M_construct(size_type, CharT), reserve()): Likewise.

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 4f94cd967cf..0fa32afeb84 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -353,13 +353,23 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       // Ensure that _M_local_buf is the active member of the union.
       __attribute__((__always_inline__))
       _GLIBCXX14_CONSTEXPR
-      pointer
-      _M_use_local_data() _GLIBCXX_NOEXCEPT
+      void
+      _M_init_local_buf() _GLIBCXX_NOEXCEPT
       {
 #if __cpp_lib_is_constant_evaluated
 	if (std::is_constant_evaluated())
 	  for (size_type __i = 0; __i <= _S_local_capacity; ++__i)
 	    _M_local_buf[__i] = _CharT();
+#endif
+      }
+
+      __attribute__((__always_inline__))
+      _GLIBCXX14_CONSTEXPR
+      pointer
+      _M_use_local_data() _GLIBCXX_NOEXCEPT
+      {
+#if __cpp_lib_is_constant_evaluated
+	_M_init_local_buf();
 #endif
 	return _M_local_data();
       }
@@ -522,7 +532,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
       : _M_dataplus(_M_local_data())
       {
-	_M_use_local_data();
+	_M_init_local_buf();
 	_M_set_length(0);
       }
 
@@ -534,7 +544,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       basic_string(const _Alloc& __a) _GLIBCXX_NOEXCEPT
       : _M_dataplus(_M_local_data(), __a)
       {
-	_M_use_local_data();
+	_M_init_local_buf();
 	_M_set_length(0);
       }
 
@@ -678,7 +688,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       {
 	if (__str._M_is_local())
 	  {
-	    (void)_M_use_local_data();
+	    _M_init_local_buf();
 	    traits_type::copy(_M_local_buf, __str._M_local_buf,
 			      __str.length() + 1);
 	  }
@@ -718,7 +728,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       {
 	if (__str._M_is_local())
 	  {
-	    (void)_M_use_local_data();
+	    _M_init_local_buf();
 	    traits_type::copy(_M_local_buf, __str._M_local_buf,
 			      __str.length() + 1);
 	    _M_length(__str.length());
diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index 4bc98f2aea7..f0a44e5e881 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -79,7 +79,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	      }
 	    else if (__s.length())
 	      {
-		(void)_M_use_local_data();
+		_M_init_local_buf();
 		traits_type::copy(_M_local_buf, __s._M_local_buf,
 				  __s.length() + 1);
 		_M_length(__s.length());
@@ -88,7 +88,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	      }
 	    else if (length())
 	      {
-		(void)__s._M_use_local_data();
+		__s._M_init_local_buf();
 		traits_type::copy(__s._M_local_buf, _M_local_buf,
 				  length() + 1);
 		__s._M_length(length());
@@ -99,7 +99,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	else
 	  {
 	    const size_type __tmp_capacity = __s._M_allocated_capacity;
-	    (void)__s._M_use_local_data();
+	    __s._M_init_local_buf();
 	    traits_type::copy(__s._M_local_buf, _M_local_buf,
 			      length() + 1);
 	    _M_data(__s._M_data());
@@ -111,7 +111,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  const size_type __tmp_capacity = _M_allocated_capacity;
 	  if (__s._M_is_local())
 	    {
-	      (void)_M_use_local_data();
+	      _M_init_local_buf();
 	      traits_type::copy(_M_local_buf, __s._M_local_buf,
 				__s.length() + 1);
 	      __s._M_data(_M_data());
@@ -174,11 +174,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	size_type __len = 0;
 	size_type __capacity = size_type(_S_local_capacity);
 
-	pointer __p = _M_use_local_data();
+	_M_init_local_buf();
 
 	while (__beg != __end && __len < __capacity)
 	  {
-	    __p[__len++] = *__beg;
+	    _M_local_buf[__len++] = *__beg;
 	    ++__beg;
 	  }
 
@@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	    _M_capacity(__dnew);
 	  }
 	else
-	  _M_use_local_data();
+	  _M_init_local_buf();
 
 	// Check for out_of_range and length_error exceptions.
 	struct _Guard
@@ -263,7 +263,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  _M_capacity(__n);
 	}
       else
-	_M_use_local_data();
+	_M_init_local_buf();
 
       if (__n)
 	this->_S_assign(_M_data(), __n, __c);
@@ -372,7 +372,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       if (__length <= size_type(_S_local_capacity))
 	{
-	  this->_S_copy(_M_use_local_data(), _M_data(), __length + 1);
+	  _M_init_local_buf();
+	  this->_S_copy(_M_local_buf, _M_data(), __length + 1);
 	  _M_destroy(__capacity);
 	  _M_data(_M_local_data());
 	}

      reply	other threads:[~2023-10-21 14:45 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <ZO30PQql2TablzpJ@Thaum.localdomain>
     [not found] ` <053faf76-f918-7527-4a41-755a18d0018a@redhat.com>
     [not found]   ` <ZPHXiqquRCNCREoX@Thaum.localdomain>
     [not found]     ` <e5fb1597-28a9-1be6-f914-5ed475732da2@redhat.com>
     [not found]       ` <ZQpDAkaSSdkc0Q+R@Thaum.localdomain>
     [not found]         ` <e11829bb-68c2-98b6-c7d0-9f9dbd58cb25@redhat.com>
     [not found]           ` <ZQxIGK4oIxNMun2i@Thaum.localdomain>
     [not found]             ` <bfe816fe-5ce5-8dc8-3d00-455ef37d67df@redhat.com>
     [not found]               ` <ZQ2sQNoVz0rblzDN@Thaum.localdomain>
2023-09-23  0:38                 ` [PATCH v4] c++: Check for indirect change of active union member in constexpr [PR101631,PR102286] Nathaniel Shead
2023-09-23  6:40                   ` Jonathan Wakely
2023-09-23  7:30                     ` [PATCH] libstdc++: Ensure active union member is correctly set Nathaniel Shead
2023-09-23 10:52                       ` Jonathan Wakely
2023-09-27 14:13                       ` Jonathan Wakely
2023-09-28 23:25                         ` Nathaniel Shead
2023-09-29  9:32                           ` Jonathan Wakely
2023-09-29 15:06                             ` Jonathan Wakely
2023-09-29 16:29                               ` Nathaniel Shead
2023-09-29 16:46                                 ` Jonathan Wakely
2023-10-21 14:45                                   ` Jonathan Wakely [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACb0b4kcGnogY2ite0BLgd=t2W+BJ1+qVVQG=50mcpJAWKirhQ@mail.gmail.com' \
    --to=jwakely@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    --cc=jwakely.gcc@gmail.com \
    --cc=libstdc++@gcc.gnu.org \
    --cc=nathanieloshead@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).