public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] libstdc++: optimize bit iterators assuming normalization [PR110807]
@ 2023-11-08 16:10 Alexandre Oliva
  2023-11-08 19:32 ` Jonathan Wakely
  0 siblings, 1 reply; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-08 16:10 UTC (permalink / raw)
  To: gcc-patches, libstdc++


The representation of bit iterators, using a pointer into an array of
words, and an unsigned bit offset into that word, makes for some
optimization challenges: because the compiler doesn't know that the
offset is always in a certain narrow range, beginning at zero and
ending before the word bitwidth, when a function loads an offset that
it hasn't normalized itself, it may fail to derive certain reasonable
conclusions, even to the point of retaining useless calls that elicit
incorrect warnings.

Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
list to a global bit vector variable.  Based on the compile-time
constant length of the list, we decide in _M_insert_range whether to
use the existing storage or to allocate new storage for the vector.
After allocation, we decide in _M_copy_aligned how to copy any
preexisting portions of the vector to the newly-allocated storage.
When copying two or more words, we use __builtin_memmove.

However, because we compute the available room using bit offsets
without range information, even comparing them with constants, we fail
to infer ranges for the preexisting vector depending on word size, and
may thus retain the memmove call despite knowing we've only allocated
one word.

Other parts of the compiler then detect the mismatch between the
constant allocation size and the much larger range that could
theoretically be copied into the newly-allocated storage if we could
reach the call.

Ensuring the compiler is aware of the constraints on the offset range
enables it to do a much better job at optimizing.  The challenge is to
do so without runtime overhead, because this is not about checking
that it's in range, it's only about telling the compiler about it.

This patch introduces a __GLIBCXX_BUILTIN_ASSUME macro that, when
optimizing, expands to code that invokes undefined behavior in case
the expression doesn't hold, so that the compiler optimizes out the
test and the entire branch containing, but retaining enough
information about the paths that shouldn't be taken, so that at
remaining paths it optimizes based on the assumption.

I also introduce a member function in bit iterators that conveys to
the compiler the information that the assumption is supposed to hold,
and various calls throughout member functions of bit iterators that
might not otherwise know that the offsets have to be in range,
making pessimistic decisions and failing to optimize out cases that it
could.

With the explicit assumptions, the compiler can correlate the test for
available storage in the vector with the test for how much storage
might need to be copied, and determine that, if we're not asking for
enough room for two or more words, we can omit entirely the code to
copy two or more words, without any runtime overhead whatsoever: no
traces remain of the undefined behavior or of the tests that inform
the compiler about the assumptions that must hold.

Regstrapped on x86_64-linux-gnu, also tested with gcc-13 on i686- and
x86_64-.  Ok to install?

(It was later found to fix 23_containers/vector/bool/allocator/copy_cc
on x86_64-linux-gnu as well, that fails on gcc-13 with the same warning.)

(The constant_evaluated/static_assert bit is disabled because expr is
not a constant according to some libstdc++ build errors, but there
doesn't seem to be a problem with the other bits.  I haven't really
thought that bit through, it was something I started out as potentially
desirable, but that turned out to be not required.  I suppose I could
just drop it.)

(I suppose __GLIBCXX_BUILTIN_ASSUME could be moved to a more general
place and put to more general uses, but I didn't feel that bold ;-)


for  libstdc++-v3/ChangeLog

	PR libstdc++/110807
	* include/bits/stl_bvector.h (__GLIBCXX_BUILTIN_ASSUME): New.
	(_Bit_iterator_base): Add _M_normalized_p and
	_M_assume_normalized.  Use them in _M_bump_up, _M_bump_down,
	_M_incr, operator==, operator<=>, operator<, and operator-.
	(_Bit_iterator): Also use them in operator*.
	(_Bit_const_iterator): Likewise.
---
 libstdc++-v3/include/bits/stl_bvector.h |   75 ++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 8d18bcaffd434..81b316846454b 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -177,6 +177,55 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _Bit_type * _M_p;
     unsigned int _M_offset;
 
+#if __OPTIMIZE__ && !__GLIBCXX_DISABLE_ASSUMPTIONS
+// If the assumption (EXPR) fails, invoke undefined behavior, so that
+// the test and the failure block gets optimized out, but the compiler
+// still recalls that (expr) can be taken for granted.  Use this only
+// for expressions that are simple and explicit enough that the
+// compiler can optimize based on them.  When not optimizing, the
+// expression is still compiled, but it's never executed.
+#if 0 /* ??? */ && __cpp_lib_is_constant_evaluated
+#define __GLIBCXX_BUILTIN_ASSUME(expr)		\
+    do						\
+      if (std::is_constant_evaluated ())	\
+	static_assert(expr);			\
+      else if (!(expr))				\
+	{					\
+	  void **__assert_failed = 0;		\
+	  *__assert_failed = 0;			\
+	  __builtin_unreachable ();		\
+	}					\
+    while (0)
+#else
+#define __GLIBCXX_BUILTIN_ASSUME(expr)		\
+    do						\
+      if (!(expr))				\
+	{					\
+	  void **__assert_failed = 0;		\
+	  *__assert_failed = 0;			\
+	  __builtin_unreachable ();		\
+	}					\
+    while (0)
+#endif
+#else
+#define __GLIBCXX_BUILTIN_ASSUME(expr)		\
+    (void)(false && (expr))
+#endif
+
+    _GLIBCXX20_CONSTEXPR
+    bool
+    _M_normalized_p() const
+    {
+      return (_M_offset < unsigned(_S_word_bit));
+    }
+
+    _GLIBCXX20_CONSTEXPR
+    void
+    _M_assume_normalized() const
+    {
+      __GLIBCXX_BUILTIN_ASSUME (_M_normalized_p ());
+    }
+
     _GLIBCXX20_CONSTEXPR
     _Bit_iterator_base(_Bit_type * __x, unsigned int __y)
     : _M_p(__x), _M_offset(__y) { }
@@ -185,6 +234,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_bump_up()
     {
+      _M_assume_normalized();
       if (_M_offset++ == int(_S_word_bit) - 1)
 	{
 	  _M_offset = 0;
@@ -196,6 +246,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_bump_down()
     {
+      _M_assume_normalized();
       if (_M_offset-- == 0)
 	{
 	  _M_offset = int(_S_word_bit) - 1;
@@ -207,6 +258,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_incr(ptrdiff_t __i)
     {
+      _M_assume_normalized();
       difference_type __n = __i + _M_offset;
       _M_p += __n / int(_S_word_bit);
       __n = __n % int(_S_word_bit);
@@ -221,7 +273,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD
     friend _GLIBCXX20_CONSTEXPR bool
     operator==(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
-    { return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset; }
+    {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
+      return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset;
+    }
 
 #if __cpp_lib_three_way_comparison
     [[nodiscard]]
@@ -229,6 +285,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     operator<=>(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     noexcept
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       if (const auto __cmp = __x._M_p <=> __y._M_p; __cmp != 0)
 	return __cmp;
       return __x._M_offset <=> __y._M_offset;
@@ -238,6 +296,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     friend bool
     operator<(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       return __x._M_p < __y._M_p
 	    || (__x._M_p == __y._M_p && __x._M_offset < __y._M_offset);
     }
@@ -266,6 +326,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     friend _GLIBCXX20_CONSTEXPR ptrdiff_t
     operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     {
+      // Make _M_offset's range visible to optimizers.
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       return (int(_S_word_bit) * (__x._M_p - __y._M_p)
 	      + __x._M_offset - __y._M_offset);
     }
@@ -297,7 +360,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
     reference
     operator*() const
-    { return reference(_M_p, 1UL << _M_offset); }
+    {
+      _M_assume_normalized();
+      return reference(_M_p, 1UL << _M_offset);
+    }
 
     _GLIBCXX20_CONSTEXPR
     iterator&
@@ -408,7 +474,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
     const_reference
     operator*() const
-    { return _Bit_reference(_M_p, 1UL << _M_offset); }
+    {
+      _M_assume_normalized();
+      return _Bit_reference(_M_p, 1UL << _M_offset);
+    }
 
     _GLIBCXX20_CONSTEXPR
     const_iterator&

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2023-11-08 16:10 [PATCH] libstdc++: optimize bit iterators assuming normalization [PR110807] Alexandre Oliva
@ 2023-11-08 19:32 ` Jonathan Wakely
  2023-11-09  1:17   ` [PATCH v2] " Alexandre Oliva
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Wakely @ 2023-11-08 19:32 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, libstdc++

On 08/11/23 13:10 -0300, Alexandre Oliva wrote:
>The representation of bit iterators, using a pointer into an array of
>words, and an unsigned bit offset into that word, makes for some
>optimization challenges: because the compiler doesn't know that the
>offset is always in a certain narrow range, beginning at zero and
>ending before the word bitwidth, when a function loads an offset that
>it hasn't normalized itself, it may fail to derive certain reasonable
>conclusions, even to the point of retaining useless calls that elicit
>incorrect warnings.
>
>Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
>list to a global bit vector variable.  Based on the compile-time
>constant length of the list, we decide in _M_insert_range whether to
>use the existing storage or to allocate new storage for the vector.
>After allocation, we decide in _M_copy_aligned how to copy any
>preexisting portions of the vector to the newly-allocated storage.
>When copying two or more words, we use __builtin_memmove.
>
>However, because we compute the available room using bit offsets
>without range information, even comparing them with constants, we fail
>to infer ranges for the preexisting vector depending on word size, and
>may thus retain the memmove call despite knowing we've only allocated
>one word.
>
>Other parts of the compiler then detect the mismatch between the
>constant allocation size and the much larger range that could
>theoretically be copied into the newly-allocated storage if we could
>reach the call.
>
>Ensuring the compiler is aware of the constraints on the offset range
>enables it to do a much better job at optimizing.  The challenge is to
>do so without runtime overhead, because this is not about checking
>that it's in range, it's only about telling the compiler about it.
>
>This patch introduces a __GLIBCXX_BUILTIN_ASSUME macro that, when
>optimizing, expands to code that invokes undefined behavior in case
>the expression doesn't hold, so that the compiler optimizes out the
>test and the entire branch containing, but retaining enough
>information about the paths that shouldn't be taken, so that at
>remaining paths it optimizes based on the assumption.
>
>I also introduce a member function in bit iterators that conveys to
>the compiler the information that the assumption is supposed to hold,
>and various calls throughout member functions of bit iterators that
>might not otherwise know that the offsets have to be in range,
>making pessimistic decisions and failing to optimize out cases that it
>could.
>
>With the explicit assumptions, the compiler can correlate the test for
>available storage in the vector with the test for how much storage
>might need to be copied, and determine that, if we're not asking for
>enough room for two or more words, we can omit entirely the code to
>copy two or more words, without any runtime overhead whatsoever: no
>traces remain of the undefined behavior or of the tests that inform
>the compiler about the assumptions that must hold.
>
>Regstrapped on x86_64-linux-gnu, also tested with gcc-13 on i686- and
>x86_64-.  Ok to install?
>
>(It was later found to fix 23_containers/vector/bool/allocator/copy_cc
>on x86_64-linux-gnu as well, that fails on gcc-13 with the same warning.)
>
>(The constant_evaluated/static_assert bit is disabled because expr is
>not a constant according to some libstdc++ build errors, but there
>doesn't seem to be a problem with the other bits.  I haven't really
>thought that bit through, it was something I started out as potentially
>desirable, but that turned out to be not required.  I suppose I could
>just drop it.)
>
>(I suppose __GLIBCXX_BUILTIN_ASSUME could be moved to a more general
>place and put to more general uses, but I didn't feel that bold ;-)
>
>
>for  libstdc++-v3/ChangeLog
>
>	PR libstdc++/110807
>	* include/bits/stl_bvector.h (__GLIBCXX_BUILTIN_ASSUME): New.
>	(_Bit_iterator_base): Add _M_normalized_p and
>	_M_assume_normalized.  Use them in _M_bump_up, _M_bump_down,
>	_M_incr, operator==, operator<=>, operator<, and operator-.
>	(_Bit_iterator): Also use them in operator*.
>	(_Bit_const_iterator): Likewise.
>---
> libstdc++-v3/include/bits/stl_bvector.h |   75 ++++++++++++++++++++++++++++++-
> 1 file changed, 72 insertions(+), 3 deletions(-)
>
>diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
>index 8d18bcaffd434..81b316846454b 100644
>--- a/libstdc++-v3/include/bits/stl_bvector.h
>+++ b/libstdc++-v3/include/bits/stl_bvector.h
>@@ -177,6 +177,55 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     _Bit_type * _M_p;
>     unsigned int _M_offset;
>
>+#if __OPTIMIZE__ && !__GLIBCXX_DISABLE_ASSUMPTIONS
>+// If the assumption (EXPR) fails, invoke undefined behavior, so that
>+// the test and the failure block gets optimized out, but the compiler
>+// still recalls that (expr) can be taken for granted.  Use this only
>+// for expressions that are simple and explicit enough that the
>+// compiler can optimize based on them.  When not optimizing, the
>+// expression is still compiled, but it's never executed.
>+#if 0 /* ??? */ && __cpp_lib_is_constant_evaluated
>+#define __GLIBCXX_BUILTIN_ASSUME(expr)		\

A single underscore prefix on __GLIBCXX_BUILTIN_ASSUME and
__GLIBCXX_DISABLE_ASSUMPTIONS please.

>+    do						\
>+      if (std::is_constant_evaluated ())	\
>+	static_assert(expr);			\

This can never be valid. The true branch of the
if (is_constant_evaluated) statement must always compile, which means
expr must be a valid constant expression to be usable in a
static_assert. It can't be a runtime condition like
"_M_offset < _S_word_bit".


>+      else if (!(expr))				\
>+	{					\
>+	  void **__assert_failed = 0;		\
>+	  *__assert_failed = 0;			\
>+	  __builtin_unreachable ();		\

This already works fine in constant evaluation anyway. If a
__builtin_unreachable() is reached during constant evaluation, the
compilation fails. So this already serves as static_assert(expr),
except it actually works :-)

But what's the null dereference for?

>+	}					\
>+    while (0)
>+#else
>+#define __GLIBCXX_BUILTIN_ASSUME(expr)		\
>+    do						\
>+      if (!(expr))				\
>+	{					\
>+	  void **__assert_failed = 0;		\
>+	  *__assert_failed = 0;			\
>+	  __builtin_unreachable ();		\
>+	}					\
>+    while (0)
>+#endif
>+#else
>+#define __GLIBCXX_BUILTIN_ASSUME(expr)		\
>+    (void)(false && (expr))

What's the point of this, just to verify that (expr) is contextually
convertible to bool?

>+#endif
>+
>+    _GLIBCXX20_CONSTEXPR
>+    bool
>+    _M_normalized_p() const

We don't use the _p suffix for predicates in the library.
Please use just _M_normalized or _M_is_normalized.

But do we even need this function? It's not used anywhere else, can we
just inline the condition into _M_assume_normalized() ?


>+    {
>+      return (_M_offset < unsigned(_S_word_bit));
>+    }
>+
>+    _GLIBCXX20_CONSTEXPR
>+    void
>+    _M_assume_normalized() const

I think this should use _GLIBCXX_ALWAYS_INLINE

>+    {
>+      __GLIBCXX_BUILTIN_ASSUME (_M_normalized_p ());

Is there even any benefit to this macro?

Can we just define this function as:

         if (_M_offset >= unsigned(_S_word_bit))
           __builtin_unreachable();

Or better still:

        __attribute__((__assume__(_M_offset < unsigned(_S_word_bit))));

Maybe even get rid of _M_assume_normalized() as a function and just
put that attribute everywhere you currently use _M_assume_normalized.


>+    }
>+
>     _GLIBCXX20_CONSTEXPR
>     _Bit_iterator_base(_Bit_type * __x, unsigned int __y)
>     : _M_p(__x), _M_offset(__y) { }
>@@ -185,6 +234,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     void
>     _M_bump_up()
>     {
>+      _M_assume_normalized();
>       if (_M_offset++ == int(_S_word_bit) - 1)
> 	{
> 	  _M_offset = 0;
>@@ -196,6 +246,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     void
>     _M_bump_down()
>     {
>+      _M_assume_normalized();
>       if (_M_offset-- == 0)
> 	{
> 	  _M_offset = int(_S_word_bit) - 1;
>@@ -207,6 +258,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     void
>     _M_incr(ptrdiff_t __i)
>     {
>+      _M_assume_normalized();
>       difference_type __n = __i + _M_offset;
>       _M_p += __n / int(_S_word_bit);
>       __n = __n % int(_S_word_bit);
>@@ -221,7 +273,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     _GLIBCXX_NODISCARD
>     friend _GLIBCXX20_CONSTEXPR bool
>     operator==(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
>-    { return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset; }
>+    {
>+      __x._M_assume_normalized();
>+      __y._M_assume_normalized();
>+      return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset;
>+    }
>
> #if __cpp_lib_three_way_comparison
>     [[nodiscard]]
>@@ -229,6 +285,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     operator<=>(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
>     noexcept
>     {
>+      __x._M_assume_normalized();
>+      __y._M_assume_normalized();
>       if (const auto __cmp = __x._M_p <=> __y._M_p; __cmp != 0)
> 	return __cmp;
>       return __x._M_offset <=> __y._M_offset;
>@@ -238,6 +296,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     friend bool
>     operator<(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
>     {
>+      __x._M_assume_normalized();
>+      __y._M_assume_normalized();
>       return __x._M_p < __y._M_p
> 	    || (__x._M_p == __y._M_p && __x._M_offset < __y._M_offset);
>     }
>@@ -266,6 +326,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     friend _GLIBCXX20_CONSTEXPR ptrdiff_t
>     operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
>     {
>+      // Make _M_offset's range visible to optimizers.
>+      __x._M_assume_normalized();
>+      __y._M_assume_normalized();
>       return (int(_S_word_bit) * (__x._M_p - __y._M_p)
> 	      + __x._M_offset - __y._M_offset);
>     }
>@@ -297,7 +360,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>     reference
>     operator*() const
>-    { return reference(_M_p, 1UL << _M_offset); }
>+    {
>+      _M_assume_normalized();
>+      return reference(_M_p, 1UL << _M_offset);
>+    }
>
>     _GLIBCXX20_CONSTEXPR
>     iterator&
>@@ -408,7 +474,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>     const_reference
>     operator*() const
>-    { return _Bit_reference(_M_p, 1UL << _M_offset); }
>+    {
>+      _M_assume_normalized();
>+      return _Bit_reference(_M_p, 1UL << _M_offset);
>+    }
>
>     _GLIBCXX20_CONSTEXPR
>     const_iterator&


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2023-11-08 19:32 ` Jonathan Wakely
@ 2023-11-09  1:17   ` Alexandre Oliva
  2023-11-09  1:22     ` Jonathan Wakely
  0 siblings, 1 reply; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-09  1:17 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc-patches, libstdc++

On Nov  8, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:

> A single underscore prefix on __GLIBCXX_BUILTIN_ASSUME and
> __GLIBCXX_DISABLE_ASSUMPTIONS please.

That's entirely gone now.

>> +    do						\
>> +      if (std::is_constant_evaluated ())	\
>> +	static_assert(expr);			\

> This can never be valid.

*nod*

> This already works fine in constant evaluation anyway.

Yeah, that's what I figured.

> But what's the null dereference for?

The idea was to clearly trigger undefined behavior.  Maybe it wasn't
needed, it didn't occur to me that __builtin_unreachable() would be
enough.  I realize I was really trying to emulate attribute assume, even
without knowing it existed ;-)

>> +#define __GLIBCXX_BUILTIN_ASSUME(expr)		\
>> +    (void)(false && (expr))

> What's the point of this, just to verify that (expr) is contextually
> convertible to bool?

I'd have phrased it as "avoid the case in which something compiles with
-O0 but not with -O", but yeah ;-)

> We don't use the _p suffix for predicates in the library.
> Please use just _M_normalized or _M_is_normalized.

ACK.  It's also gone now.

> But do we even need this function? It's not used anywhere else, can we
> just inline the condition into _M_assume_normalized() ?

I had other uses for it in earlier versions of the patch, but it makes
no sense any more indeed.

>> +    _GLIBCXX20_CONSTEXPR
>> +    void
>> +    _M_assume_normalized() const

> I think this should use _GLIBCXX_ALWAYS_INLINE

*nod*, thanks

>> +    {
>> +      __GLIBCXX_BUILTIN_ASSUME (_M_normalized_p ());

> Is there even any benefit to this macro?

I just thought it could have other uses, without being aware that the
entire concept was available as a statement attribute.  Funny, I'd even
searched for it among the existing attributes and builtins, but somehow
I managed to miss it.  Thanks for getting me back down that path.

>        __attribute__((__assume__(_M_offset < unsigned(_S_word_bit))));

That unfortunately doesn't work, because the assume lowering doesn't go
as far as dereferencing the implicit this and making an SSA_NAME out of
the loaded _M_offset, which we'd need to be able to optimize based on
it.  But that only took me a while to figure out and massage into
something that had the desired effect.  Now, maybe the above *should*
have that effect already, but unfortunately it doesn't.

> Maybe even get rid of _M_assume_normalized() as a function and just
> put that attribute everywhere you currently use _M_assume_normalized.

Because of the slight kludge required to make the attribute have the
desired effect (namely ensuring the _M_offset reference is evaluated),
I've retained it as an inline function.

Here's what I'm retesting now.  WDYT?


libstdc++: optimize bit iterators assuming normalization [PR110807]

The representation of bit iterators, using a pointer into an array of
words, and an unsigned bit offset into that word, makes for some
optimization challenges: because the compiler doesn't know that the
offset is always in a certain narrow range, beginning at zero and
ending before the word bitwidth, when a function loads an offset that
it hasn't normalized itself, it may fail to derive certain reasonable
conclusions, even to the point of retaining useless calls that elicit
incorrect warnings.

Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
list to a global bit vector variable.  Based on the compile-time
constant length of the list, we decide in _M_insert_range whether to
use the existing storage or to allocate new storage for the vector.
After allocation, we decide in _M_copy_aligned how to copy any
preexisting portions of the vector to the newly-allocated storage.
When copying two or more words, we use __builtin_memmove.

However, because we compute the available room using bit offsets
without range information, even comparing them with constants, we fail
to infer ranges for the preexisting vector depending on word size, and
may thus retain the memmove call despite knowing we've only allocated
one word.

Other parts of the compiler then detect the mismatch between the
constant allocation size and the much larger range that could
theoretically be copied into the newly-allocated storage if we could
reach the call.

Ensuring the compiler is aware of the constraints on the offset range
enables it to do a much better job at optimizing.  Using attribute
assume (_M_offset <= ...) didn't work, because gimple lowered that to
something that vrp could only use to ensure 'this' was non-NULL.
Exposing _M_offset as an automatic variable/gimple register outside
the unevaluated assume operand enabled the optimizer to do its job.

Rather than placing such load-then-assume constructs all over, I
introduced an always-inline member function in bit iterators that does
the job of conveying to the compiler the information that the
assumption is supposed to hold, and various calls throughout functions
pertaining to bit iterators that might not otherwise know that the
offsets have to be in range, so that the compiler no longer needs to
make conservative assumptions that prevent optimizations.

With the explicit assumptions, the compiler can correlate the test for
available storage in the vector with the test for how much storage
might need to be copied, and determine that, if we're not asking for
enough room for two or more words, we can omit entirely the code to
copy two or more words, without any runtime overhead whatsoever: no
traces remain of the undefined behavior or of the tests that inform
the compiler about the assumptions that must hold.


for  libstdc++-v3/ChangeLog

	PR libstdc++/110807
	* include/bits/stl_bvector.h (_Bit_iterator_base): Add
	_M_assume_normalized member function.  Call it in _M_bump_up,
	_M_bump_down, _M_incr, operator==, operator<=>, operator<, and
	operator-.
	(_Bit_iterator): Also call it in operator*.
	(_Bit_const_iterator): Likewise.
---
 libstdc++-v3/include/bits/stl_bvector.h |   37 ++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 8d18bcaffd434..bb9f38cdf2d59 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -56,6 +56,10 @@
 #ifndef _STL_BVECTOR_H
 #define _STL_BVECTOR_H 1
 
+#ifndef _GLIBCXX_ALWAYS_INLINE
+#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
+#endif
+
 #if __cplusplus >= 201103L
 #include <initializer_list>
 #include <bits/functional_hash.h>
@@ -177,6 +181,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _Bit_type * _M_p;
     unsigned int _M_offset;
 
+    _GLIBCXX20_CONSTEXPR _GLIBCXX_ALWAYS_INLINE
+    void
+    _M_assume_normalized() const
+    {
+      unsigned int ofst = _M_offset;
+      __attribute__ ((__assume__ (ofst < unsigned(_S_word_bit))));
+    }
+
     _GLIBCXX20_CONSTEXPR
     _Bit_iterator_base(_Bit_type * __x, unsigned int __y)
     : _M_p(__x), _M_offset(__y) { }
@@ -185,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_bump_up()
     {
+      _M_assume_normalized();
       if (_M_offset++ == int(_S_word_bit) - 1)
 	{
 	  _M_offset = 0;
@@ -196,6 +209,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_bump_down()
     {
+      _M_assume_normalized();
       if (_M_offset-- == 0)
 	{
 	  _M_offset = int(_S_word_bit) - 1;
@@ -207,6 +221,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_incr(ptrdiff_t __i)
     {
+      _M_assume_normalized();
       difference_type __n = __i + _M_offset;
       _M_p += __n / int(_S_word_bit);
       __n = __n % int(_S_word_bit);
@@ -221,7 +236,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD
     friend _GLIBCXX20_CONSTEXPR bool
     operator==(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
-    { return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset; }
+    {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
+      return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset;
+    }
 
 #if __cpp_lib_three_way_comparison
     [[nodiscard]]
@@ -229,6 +248,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     operator<=>(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     noexcept
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       if (const auto __cmp = __x._M_p <=> __y._M_p; __cmp != 0)
 	return __cmp;
       return __x._M_offset <=> __y._M_offset;
@@ -238,6 +259,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     friend bool
     operator<(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       return __x._M_p < __y._M_p
 	    || (__x._M_p == __y._M_p && __x._M_offset < __y._M_offset);
     }
@@ -266,6 +289,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     friend _GLIBCXX20_CONSTEXPR ptrdiff_t
     operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       return (int(_S_word_bit) * (__x._M_p - __y._M_p)
 	      + __x._M_offset - __y._M_offset);
     }
@@ -297,7 +322,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
     reference
     operator*() const
-    { return reference(_M_p, 1UL << _M_offset); }
+    {
+      _M_assume_normalized();
+      return reference(_M_p, 1UL << _M_offset);
+    }
 
     _GLIBCXX20_CONSTEXPR
     iterator&
@@ -408,7 +436,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
     const_reference
     operator*() const
-    { return _Bit_reference(_M_p, 1UL << _M_offset); }
+    {
+      _M_assume_normalized();
+      return _Bit_reference(_M_p, 1UL << _M_offset);
+    }
 
     _GLIBCXX20_CONSTEXPR
     const_iterator&


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2023-11-09  1:17   ` [PATCH v2] " Alexandre Oliva
@ 2023-11-09  1:22     ` Jonathan Wakely
  2023-11-09  3:36       ` [PATCH v3] " Alexandre Oliva
  2024-02-07 16:25       ` [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807] Torbjorn SVENSSON
  0 siblings, 2 replies; 18+ messages in thread
From: Jonathan Wakely @ 2023-11-09  1:22 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Jonathan Wakely, gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 11089 bytes --]

On Thu, 9 Nov 2023, 01:17 Alexandre Oliva, <oliva@adacore.com> wrote:

> On Nov  8, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> > A single underscore prefix on __GLIBCXX_BUILTIN_ASSUME and
> > __GLIBCXX_DISABLE_ASSUMPTIONS please.
>
> That's entirely gone now.
>
> >> +    do                                              \
> >> +      if (std::is_constant_evaluated ())    \
> >> +    static_assert(expr);                    \
>
> > This can never be valid.
>
> *nod*
>
> > This already works fine in constant evaluation anyway.
>
> Yeah, that's what I figured.
>
> > But what's the null dereference for?
>
> The idea was to clearly trigger undefined behavior.  Maybe it wasn't
> needed, it didn't occur to me that __builtin_unreachable() would be
> enough.  I realize I was really trying to emulate attribute assume, even
> without knowing it existed ;-)
>
> >> +#define __GLIBCXX_BUILTIN_ASSUME(expr)              \
> >> +    (void)(false && (expr))
>
> > What's the point of this, just to verify that (expr) is contextually
> > convertible to bool?
>
> I'd have phrased it as "avoid the case in which something compiles with
> -O0 but not with -O", but yeah ;-)
>
> > We don't use the _p suffix for predicates in the library.
> > Please use just _M_normalized or _M_is_normalized.
>
> ACK.  It's also gone now.
>
> > But do we even need this function? It's not used anywhere else, can we
> > just inline the condition into _M_assume_normalized() ?
>
> I had other uses for it in earlier versions of the patch, but it makes
> no sense any more indeed.
>
> >> +    _GLIBCXX20_CONSTEXPR
> >> +    void
> >> +    _M_assume_normalized() const
>
> > I think this should use _GLIBCXX_ALWAYS_INLINE
>
> *nod*, thanks
>
> >> +    {
> >> +      __GLIBCXX_BUILTIN_ASSUME (_M_normalized_p ());
>
> > Is there even any benefit to this macro?
>
> I just thought it could have other uses, without being aware that the
> entire concept was available as a statement attribute.  Funny, I'd even
> searched for it among the existing attributes and builtins, but somehow
> I managed to miss it.  Thanks for getting me back down that path.
>
> >        __attribute__((__assume__(_M_offset < unsigned(_S_word_bit))));
>
> That unfortunately doesn't work, because the assume lowering doesn't go
> as far as dereferencing the implicit this and making an SSA_NAME out of
> the loaded _M_offset, which we'd need to be able to optimize based on
> it.  But that only took me a while to figure out and massage into
> something that had the desired effect.  Now, maybe the above *should*
> have that effect already, but unfortunately it doesn't.
>
> > Maybe even get rid of _M_assume_normalized() as a function and just
> > put that attribute everywhere you currently use _M_assume_normalized.
>
> Because of the slight kludge required to make the attribute have the
> desired effect (namely ensuring the _M_offset reference is evaluated),
> I've retained it as an inline function.
>
> Here's what I'm retesting now.  WDYT?
>

ofst needs to be __ofst but OK for trunk with that change.

We probably want this on the gcc-13 branch too, but let's give it some time
on trunk in case the assume attribute isn't quite ready for prime time.



>
> libstdc++: optimize bit iterators assuming normalization [PR110807]
>
> The representation of bit iterators, using a pointer into an array of
> words, and an unsigned bit offset into that word, makes for some
> optimization challenges: because the compiler doesn't know that the
> offset is always in a certain narrow range, beginning at zero and
> ending before the word bitwidth, when a function loads an offset that
> it hasn't normalized itself, it may fail to derive certain reasonable
> conclusions, even to the point of retaining useless calls that elicit
> incorrect warnings.
>
> Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
> list to a global bit vector variable.  Based on the compile-time
> constant length of the list, we decide in _M_insert_range whether to
> use the existing storage or to allocate new storage for the vector.
> After allocation, we decide in _M_copy_aligned how to copy any
> preexisting portions of the vector to the newly-allocated storage.
> When copying two or more words, we use __builtin_memmove.
>
> However, because we compute the available room using bit offsets
> without range information, even comparing them with constants, we fail
> to infer ranges for the preexisting vector depending on word size, and
> may thus retain the memmove call despite knowing we've only allocated
> one word.
>
> Other parts of the compiler then detect the mismatch between the
> constant allocation size and the much larger range that could
> theoretically be copied into the newly-allocated storage if we could
> reach the call.
>
> Ensuring the compiler is aware of the constraints on the offset range
> enables it to do a much better job at optimizing.  Using attribute
> assume (_M_offset <= ...) didn't work, because gimple lowered that to
> something that vrp could only use to ensure 'this' was non-NULL.
> Exposing _M_offset as an automatic variable/gimple register outside
> the unevaluated assume operand enabled the optimizer to do its job.
>
> Rather than placing such load-then-assume constructs all over, I
> introduced an always-inline member function in bit iterators that does
> the job of conveying to the compiler the information that the
> assumption is supposed to hold, and various calls throughout functions
> pertaining to bit iterators that might not otherwise know that the
> offsets have to be in range, so that the compiler no longer needs to
> make conservative assumptions that prevent optimizations.
>
> With the explicit assumptions, the compiler can correlate the test for
> available storage in the vector with the test for how much storage
> might need to be copied, and determine that, if we're not asking for
> enough room for two or more words, we can omit entirely the code to
> copy two or more words, without any runtime overhead whatsoever: no
> traces remain of the undefined behavior or of the tests that inform
> the compiler about the assumptions that must hold.
>
>
> for  libstdc++-v3/ChangeLog
>
>         PR libstdc++/110807
>         * include/bits/stl_bvector.h (_Bit_iterator_base): Add
>         _M_assume_normalized member function.  Call it in _M_bump_up,
>         _M_bump_down, _M_incr, operator==, operator<=>, operator<, and
>         operator-.
>         (_Bit_iterator): Also call it in operator*.
>         (_Bit_const_iterator): Likewise.
> ---
>  libstdc++-v3/include/bits/stl_bvector.h |   37
> ++++++++++++++++++++++++++++---
>  1 file changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/stl_bvector.h
> b/libstdc++-v3/include/bits/stl_bvector.h
> index 8d18bcaffd434..bb9f38cdf2d59 100644
> --- a/libstdc++-v3/include/bits/stl_bvector.h
> +++ b/libstdc++-v3/include/bits/stl_bvector.h
> @@ -56,6 +56,10 @@
>  #ifndef _STL_BVECTOR_H
>  #define _STL_BVECTOR_H 1
>
> +#ifndef _GLIBCXX_ALWAYS_INLINE
> +#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
> +#endif
> +
>  #if __cplusplus >= 201103L
>  #include <initializer_list>
>  #include <bits/functional_hash.h>
> @@ -177,6 +181,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      _Bit_type * _M_p;
>      unsigned int _M_offset;
>
> +    _GLIBCXX20_CONSTEXPR _GLIBCXX_ALWAYS_INLINE
> +    void
> +    _M_assume_normalized() const
> +    {
> +      unsigned int ofst = _M_offset;
> +      __attribute__ ((__assume__ (ofst < unsigned(_S_word_bit))));
> +    }
> +
>      _GLIBCXX20_CONSTEXPR
>      _Bit_iterator_base(_Bit_type * __x, unsigned int __y)
>      : _M_p(__x), _M_offset(__y) { }
> @@ -185,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      void
>      _M_bump_up()
>      {
> +      _M_assume_normalized();
>        if (_M_offset++ == int(_S_word_bit) - 1)
>         {
>           _M_offset = 0;
> @@ -196,6 +209,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      void
>      _M_bump_down()
>      {
> +      _M_assume_normalized();
>        if (_M_offset-- == 0)
>         {
>           _M_offset = int(_S_word_bit) - 1;
> @@ -207,6 +221,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      void
>      _M_incr(ptrdiff_t __i)
>      {
> +      _M_assume_normalized();
>        difference_type __n = __i + _M_offset;
>        _M_p += __n / int(_S_word_bit);
>        __n = __n % int(_S_word_bit);
> @@ -221,7 +236,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      _GLIBCXX_NODISCARD
>      friend _GLIBCXX20_CONSTEXPR bool
>      operator==(const _Bit_iterator_base& __x, const _Bit_iterator_base&
> __y)
> -    { return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset; }
> +    {
> +      __x._M_assume_normalized();
> +      __y._M_assume_normalized();
> +      return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset;
> +    }
>
>  #if __cpp_lib_three_way_comparison
>      [[nodiscard]]
> @@ -229,6 +248,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      operator<=>(const _Bit_iterator_base& __x, const _Bit_iterator_base&
> __y)
>      noexcept
>      {
> +      __x._M_assume_normalized();
> +      __y._M_assume_normalized();
>        if (const auto __cmp = __x._M_p <=> __y._M_p; __cmp != 0)
>         return __cmp;
>        return __x._M_offset <=> __y._M_offset;
> @@ -238,6 +259,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      friend bool
>      operator<(const _Bit_iterator_base& __x, const _Bit_iterator_base&
> __y)
>      {
> +      __x._M_assume_normalized();
> +      __y._M_assume_normalized();
>        return __x._M_p < __y._M_p
>             || (__x._M_p == __y._M_p && __x._M_offset < __y._M_offset);
>      }
> @@ -266,6 +289,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      friend _GLIBCXX20_CONSTEXPR ptrdiff_t
>      operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base&
> __y)
>      {
> +      __x._M_assume_normalized();
> +      __y._M_assume_normalized();
>        return (int(_S_word_bit) * (__x._M_p - __y._M_p)
>               + __x._M_offset - __y._M_offset);
>      }
> @@ -297,7 +322,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>      reference
>      operator*() const
> -    { return reference(_M_p, 1UL << _M_offset); }
> +    {
> +      _M_assume_normalized();
> +      return reference(_M_p, 1UL << _M_offset);
> +    }
>
>      _GLIBCXX20_CONSTEXPR
>      iterator&
> @@ -408,7 +436,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>      _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>      const_reference
>      operator*() const
> -    { return _Bit_reference(_M_p, 1UL << _M_offset); }
> +    {
> +      _M_assume_normalized();
> +      return _Bit_reference(_M_p, 1UL << _M_offset);
> +    }
>
>      _GLIBCXX20_CONSTEXPR
>      const_iterator&
>
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2023-11-09  1:22     ` Jonathan Wakely
@ 2023-11-09  3:36       ` Alexandre Oliva
  2023-11-09  5:57         ` François Dumont
  2024-02-07 16:25       ` [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807] Torbjorn SVENSSON
  1 sibling, 1 reply; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-09  3:36 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jonathan Wakely, gcc-patches, libstdc++

On Nov  8, 2023, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> ofst needs to be __ofst but OK for trunk with that change.

Oh, doh, thanks for catching that last-minute tweak.

Retesting with that change completed successfully, so I've just pushed
the following:


libstdc++: optimize bit iterators assuming normalization [PR110807]

The representation of bit iterators, using a pointer into an array of
words, and an unsigned bit offset into that word, makes for some
optimization challenges: because the compiler doesn't know that the
offset is always in a certain narrow range, beginning at zero and
ending before the word bitwidth, when a function loads an offset that
it hasn't normalized itself, it may fail to derive certain reasonable
conclusions, even to the point of retaining useless calls that elicit
incorrect warnings.

Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
list to a global bit vector variable.  Based on the compile-time
constant length of the list, we decide in _M_insert_range whether to
use the existing storage or to allocate new storage for the vector.
After allocation, we decide in _M_copy_aligned how to copy any
preexisting portions of the vector to the newly-allocated storage.
When copying two or more words, we use __builtin_memmove.

However, because we compute the available room using bit offsets
without range information, even comparing them with constants, we fail
to infer ranges for the preexisting vector depending on word size, and
may thus retain the memmove call despite knowing we've only allocated
one word.

Other parts of the compiler then detect the mismatch between the
constant allocation size and the much larger range that could
theoretically be copied into the newly-allocated storage if we could
reach the call.

Ensuring the compiler is aware of the constraints on the offset range
enables it to do a much better job at optimizing.  Using attribute
assume (_M_offset <= ...) didn't work, because gimple lowered that to
something that vrp could only use to ensure 'this' was non-NULL.
Exposing _M_offset as an automatic variable/gimple register outside
the unevaluated assume operand enabled the optimizer to do its job.

Rather than placing such load-then-assume constructs all over, I
introduced an always-inline member function in bit iterators that does
the job of conveying to the compiler the information that the
assumption is supposed to hold, and various calls throughout functions
pertaining to bit iterators that might not otherwise know that the
offsets have to be in range, so that the compiler no longer needs to
make conservative assumptions that prevent optimizations.

With the explicit assumptions, the compiler can correlate the test for
available storage in the vector with the test for how much storage
might need to be copied, and determine that, if we're not asking for
enough room for two or more words, we can omit entirely the code to
copy two or more words, without any runtime overhead whatsoever: no
traces remain of the undefined behavior or of the tests that inform
the compiler about the assumptions that must hold.


for  libstdc++-v3/ChangeLog

	PR libstdc++/110807
	* include/bits/stl_bvector.h (_Bit_iterator_base): Add
	_M_assume_normalized member function.  Call it in _M_bump_up,
	_M_bump_down, _M_incr, operator==, operator<=>, operator<, and
	operator-.
	(_Bit_iterator): Also call it in operator*.
	(_Bit_const_iterator): Likewise.
---
 libstdc++-v3/include/bits/stl_bvector.h |   37 ++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 8d18bcaffd434..2b91af2005f2d 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -56,6 +56,10 @@
 #ifndef _STL_BVECTOR_H
 #define _STL_BVECTOR_H 1
 
+#ifndef _GLIBCXX_ALWAYS_INLINE
+#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
+#endif
+
 #if __cplusplus >= 201103L
 #include <initializer_list>
 #include <bits/functional_hash.h>
@@ -177,6 +181,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _Bit_type * _M_p;
     unsigned int _M_offset;
 
+    _GLIBCXX20_CONSTEXPR _GLIBCXX_ALWAYS_INLINE
+    void
+    _M_assume_normalized() const
+    {
+      unsigned int __ofst = _M_offset;
+      __attribute__ ((__assume__ (__ofst < unsigned(_S_word_bit))));
+    }
+
     _GLIBCXX20_CONSTEXPR
     _Bit_iterator_base(_Bit_type * __x, unsigned int __y)
     : _M_p(__x), _M_offset(__y) { }
@@ -185,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_bump_up()
     {
+      _M_assume_normalized();
       if (_M_offset++ == int(_S_word_bit) - 1)
 	{
 	  _M_offset = 0;
@@ -196,6 +209,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_bump_down()
     {
+      _M_assume_normalized();
       if (_M_offset-- == 0)
 	{
 	  _M_offset = int(_S_word_bit) - 1;
@@ -207,6 +221,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     void
     _M_incr(ptrdiff_t __i)
     {
+      _M_assume_normalized();
       difference_type __n = __i + _M_offset;
       _M_p += __n / int(_S_word_bit);
       __n = __n % int(_S_word_bit);
@@ -221,7 +236,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD
     friend _GLIBCXX20_CONSTEXPR bool
     operator==(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
-    { return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset; }
+    {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
+      return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset;
+    }
 
 #if __cpp_lib_three_way_comparison
     [[nodiscard]]
@@ -229,6 +248,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     operator<=>(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     noexcept
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       if (const auto __cmp = __x._M_p <=> __y._M_p; __cmp != 0)
 	return __cmp;
       return __x._M_offset <=> __y._M_offset;
@@ -238,6 +259,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     friend bool
     operator<(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       return __x._M_p < __y._M_p
 	    || (__x._M_p == __y._M_p && __x._M_offset < __y._M_offset);
     }
@@ -266,6 +289,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     friend _GLIBCXX20_CONSTEXPR ptrdiff_t
     operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     {
+      __x._M_assume_normalized();
+      __y._M_assume_normalized();
       return (int(_S_word_bit) * (__x._M_p - __y._M_p)
 	      + __x._M_offset - __y._M_offset);
     }
@@ -297,7 +322,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
     reference
     operator*() const
-    { return reference(_M_p, 1UL << _M_offset); }
+    {
+      _M_assume_normalized();
+      return reference(_M_p, 1UL << _M_offset);
+    }
 
     _GLIBCXX20_CONSTEXPR
     iterator&
@@ -408,7 +436,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
     const_reference
     operator*() const
-    { return _Bit_reference(_M_p, 1UL << _M_offset); }
+    {
+      _M_assume_normalized();
+      return _Bit_reference(_M_p, 1UL << _M_offset);
+    }
 
     _GLIBCXX20_CONSTEXPR
     const_iterator&


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2023-11-09  3:36       ` [PATCH v3] " Alexandre Oliva
@ 2023-11-09  5:57         ` François Dumont
  2023-11-09  8:16           ` Jonathan Wakely
  0 siblings, 1 reply; 18+ messages in thread
From: François Dumont @ 2023-11-09  5:57 UTC (permalink / raw)
  To: Alexandre Oliva, Jonathan Wakely; +Cc: Jonathan Wakely, libstdc++


On 09/11/2023 04:36, Alexandre Oliva wrote:
> On Nov  8, 2023, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
>> ofst needs to be __ofst but OK for trunk with that change.
> Oh, doh, thanks for catching that last-minute tweak.
>
> Retesting with that change completed successfully, so I've just pushed
> the following:
>
>
> libstdc++: optimize bit iterators assuming normalization [PR110807]
>
> The representation of bit iterators, using a pointer into an array of
> words, and an unsigned bit offset into that word, makes for some
> optimization challenges: because the compiler doesn't know that the
> offset is always in a certain narrow range, beginning at zero and
> ending before the word bitwidth, when a function loads an offset that
> it hasn't normalized itself, it may fail to derive certain reasonable
> conclusions, even to the point of retaining useless calls that elicit
> incorrect warnings.
>
> Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
> list to a global bit vector variable.  Based on the compile-time
> constant length of the list, we decide in _M_insert_range whether to
> use the existing storage or to allocate new storage for the vector.
> After allocation, we decide in _M_copy_aligned how to copy any
> preexisting portions of the vector to the newly-allocated storage.
> When copying two or more words, we use __builtin_memmove.
>
> However, because we compute the available room using bit offsets
> without range information, even comparing them with constants, we fail
> to infer ranges for the preexisting vector depending on word size, and
> may thus retain the memmove call despite knowing we've only allocated
> one word.
>
> Other parts of the compiler then detect the mismatch between the
> constant allocation size and the much larger range that could
> theoretically be copied into the newly-allocated storage if we could
> reach the call.
>
> Ensuring the compiler is aware of the constraints on the offset range
> enables it to do a much better job at optimizing.  Using attribute
> assume (_M_offset <= ...) didn't work, because gimple lowered that to
> something that vrp could only use to ensure 'this' was non-NULL.
> Exposing _M_offset as an automatic variable/gimple register outside
> the unevaluated assume operand enabled the optimizer to do its job.
>
> Rather than placing such load-then-assume constructs all over, I
> introduced an always-inline member function in bit iterators that does
> the job of conveying to the compiler the information that the
> assumption is supposed to hold, and various calls throughout functions
> pertaining to bit iterators that might not otherwise know that the
> offsets have to be in range, so that the compiler no longer needs to
> make conservative assumptions that prevent optimizations.
>
> With the explicit assumptions, the compiler can correlate the test for
> available storage in the vector with the test for how much storage
> might need to be copied, and determine that, if we're not asking for
> enough room for two or more words, we can omit entirely the code to
> copy two or more words, without any runtime overhead whatsoever: no
> traces remain of the undefined behavior or of the tests that inform
> the compiler about the assumptions that must hold.
>
>
> for  libstdc++-v3/ChangeLog
>
> 	PR libstdc++/110807
> 	* include/bits/stl_bvector.h (_Bit_iterator_base): Add
> 	_M_assume_normalized member function.  Call it in _M_bump_up,
> 	_M_bump_down, _M_incr, operator==, operator<=>, operator<, and
> 	operator-.
> 	(_Bit_iterator): Also call it in operator*.
> 	(_Bit_const_iterator): Likewise.
> ---
>   libstdc++-v3/include/bits/stl_bvector.h |   37 ++++++++++++++++++++++++++++---
>   1 file changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
> index 8d18bcaffd434..2b91af2005f2d 100644
> --- a/libstdc++-v3/include/bits/stl_bvector.h
> +++ b/libstdc++-v3/include/bits/stl_bvector.h
> @@ -56,6 +56,10 @@
>   #ifndef _STL_BVECTOR_H
>   #define _STL_BVECTOR_H 1
>   
> +#ifndef _GLIBCXX_ALWAYS_INLINE
> +#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
> +#endif
> +

IMHO using [[__gnu__::__always_inline__]] would give the same result, 
but without the macro !



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2023-11-09  5:57         ` François Dumont
@ 2023-11-09  8:16           ` Jonathan Wakely
  2023-11-09 19:49             ` [PATCH] libstdc++: bvector: undef always_inline macro Alexandre Oliva
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Wakely @ 2023-11-09  8:16 UTC (permalink / raw)
  To: François Dumont; +Cc: Alexandre Oliva, Jonathan Wakely, libstdc++

On Thu, 9 Nov 2023 at 05:57, François Dumont <frs.dumont@gmail.com> wrote:
>
>
> On 09/11/2023 04:36, Alexandre Oliva wrote:
> > On Nov  8, 2023, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> >
> >> ofst needs to be __ofst but OK for trunk with that change.
> > Oh, doh, thanks for catching that last-minute tweak.
> >
> > Retesting with that change completed successfully, so I've just pushed
> > the following:
> >
> >
> > libstdc++: optimize bit iterators assuming normalization [PR110807]
> >
> > The representation of bit iterators, using a pointer into an array of
> > words, and an unsigned bit offset into that word, makes for some
> > optimization challenges: because the compiler doesn't know that the
> > offset is always in a certain narrow range, beginning at zero and
> > ending before the word bitwidth, when a function loads an offset that
> > it hasn't normalized itself, it may fail to derive certain reasonable
> > conclusions, even to the point of retaining useless calls that elicit
> > incorrect warnings.
> >
> > Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
> > list to a global bit vector variable.  Based on the compile-time
> > constant length of the list, we decide in _M_insert_range whether to
> > use the existing storage or to allocate new storage for the vector.
> > After allocation, we decide in _M_copy_aligned how to copy any
> > preexisting portions of the vector to the newly-allocated storage.
> > When copying two or more words, we use __builtin_memmove.
> >
> > However, because we compute the available room using bit offsets
> > without range information, even comparing them with constants, we fail
> > to infer ranges for the preexisting vector depending on word size, and
> > may thus retain the memmove call despite knowing we've only allocated
> > one word.
> >
> > Other parts of the compiler then detect the mismatch between the
> > constant allocation size and the much larger range that could
> > theoretically be copied into the newly-allocated storage if we could
> > reach the call.
> >
> > Ensuring the compiler is aware of the constraints on the offset range
> > enables it to do a much better job at optimizing.  Using attribute
> > assume (_M_offset <= ...) didn't work, because gimple lowered that to
> > something that vrp could only use to ensure 'this' was non-NULL.
> > Exposing _M_offset as an automatic variable/gimple register outside
> > the unevaluated assume operand enabled the optimizer to do its job.
> >
> > Rather than placing such load-then-assume constructs all over, I
> > introduced an always-inline member function in bit iterators that does
> > the job of conveying to the compiler the information that the
> > assumption is supposed to hold, and various calls throughout functions
> > pertaining to bit iterators that might not otherwise know that the
> > offsets have to be in range, so that the compiler no longer needs to
> > make conservative assumptions that prevent optimizations.
> >
> > With the explicit assumptions, the compiler can correlate the test for
> > available storage in the vector with the test for how much storage
> > might need to be copied, and determine that, if we're not asking for
> > enough room for two or more words, we can omit entirely the code to
> > copy two or more words, without any runtime overhead whatsoever: no
> > traces remain of the undefined behavior or of the tests that inform
> > the compiler about the assumptions that must hold.
> >
> >
> > for  libstdc++-v3/ChangeLog
> >
> >       PR libstdc++/110807
> >       * include/bits/stl_bvector.h (_Bit_iterator_base): Add
> >       _M_assume_normalized member function.  Call it in _M_bump_up,
> >       _M_bump_down, _M_incr, operator==, operator<=>, operator<, and
> >       operator-.
> >       (_Bit_iterator): Also call it in operator*.
> >       (_Bit_const_iterator): Likewise.
> > ---
> >   libstdc++-v3/include/bits/stl_bvector.h |   37 ++++++++++++++++++++++++++++---
> >   1 file changed, 34 insertions(+), 3 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
> > index 8d18bcaffd434..2b91af2005f2d 100644
> > --- a/libstdc++-v3/include/bits/stl_bvector.h
> > +++ b/libstdc++-v3/include/bits/stl_bvector.h
> > @@ -56,6 +56,10 @@
> >   #ifndef _STL_BVECTOR_H
> >   #define _STL_BVECTOR_H 1
> >
> > +#ifndef _GLIBCXX_ALWAYS_INLINE
> > +#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
> > +#endif
> > +
>
> IMHO using [[__gnu__::__always_inline__]] would give the same result,
> but without the macro !

But only valid for C++11 and later, not C++98.

I think the reason we use _GLIBCXX_ALWAYS_INLINE like that is in case
it causes compiler errors ("'always_inline' function might not be
inlinable"), so that users can define the macro themselves to disable
the attribute. At least, that's my assumption of why we do it via
macros in several places.

But I've just realised we probably want to #undef the macro at the end
of bits/stl_bvector.h too.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-09  8:16           ` Jonathan Wakely
@ 2023-11-09 19:49             ` Alexandre Oliva
  2023-11-09 20:18               ` Jonathan Wakely
  0 siblings, 1 reply; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-09 19:49 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: François Dumont, Jonathan Wakely, libstdc++

On Nov  9, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:

> But I've just realised we probably want to #undef the macro at the end
> of bits/stl_bvector.h too.

I'm not sure why (what if another libstdc++ header were to define the
macro, includes stl_bvector.h, and then use the macro expecting it to
still be there?), but I suppose this is what you mean.  Regstrapped on
x86_64-linux-gnu just to be sure.  Ok to install?


From: Alexandre Oliva <oliva@adacore.com>

It's customary to undefine temporary internal macros at the end of the
header that defines them, even such widely-usable ones as
_GLIBCXX_ALWAYS_INLINE, so do so in the header where the define was
recently introduced.


for  libstdc++-v3/ChangeLog

	* include/bits/stl_bvector.h (_GLIBCXX_ALWAYS_INLINE): Undef.
---
 libstdc++-v3/include/bits/stl_bvector.h |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 2b91af2005f2d..1b7648535c523 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -1628,4 +1628,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-09 19:49             ` [PATCH] libstdc++: bvector: undef always_inline macro Alexandre Oliva
@ 2023-11-09 20:18               ` Jonathan Wakely
  2023-11-15  2:20                 ` Patrick Palka
  2023-11-15  2:44                 ` Alexandre Oliva
  0 siblings, 2 replies; 18+ messages in thread
From: Jonathan Wakely @ 2023-11-09 20:18 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: François Dumont, Jonathan Wakely, libstdc++

On Thu, 9 Nov 2023 at 19:49, Alexandre Oliva <oliva@adacore.com> wrote:
>
> On Nov  9, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> > But I've just realised we probably want to #undef the macro at the end
> > of bits/stl_bvector.h too.
>
> I'm not sure why (what if another libstdc++ header were to define the
> macro, includes stl_bvector.h, and then use the macro expecting it to
> still be there?), but I suppose this is what you mean.

It's consistent with all the other definitions of the macro in our
headers. We always define it locally and then undef it again at the
end of the header. You're right that that makes it rather hard to use
reliably.

>  Regstrapped on
> x86_64-linux-gnu just to be sure.  Ok to install?

OK thanks.


>
>
> From: Alexandre Oliva <oliva@adacore.com>
>
> It's customary to undefine temporary internal macros at the end of the
> header that defines them, even such widely-usable ones as
> _GLIBCXX_ALWAYS_INLINE, so do so in the header where the define was
> recently introduced.
>
>
> for  libstdc++-v3/ChangeLog
>
>         * include/bits/stl_bvector.h (_GLIBCXX_ALWAYS_INLINE): Undef.
> ---
>  libstdc++-v3/include/bits/stl_bvector.h |    2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
> index 2b91af2005f2d..1b7648535c523 100644
> --- a/libstdc++-v3/include/bits/stl_bvector.h
> +++ b/libstdc++-v3/include/bits/stl_bvector.h
> @@ -1628,4 +1628,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>  _GLIBCXX_END_NAMESPACE_VERSION
>  } // namespace std
>
> +#undef _GLIBCXX_ALWAYS_INLINE
> +
>  #endif
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-09 20:18               ` Jonathan Wakely
@ 2023-11-15  2:20                 ` Patrick Palka
  2023-11-15  5:53                   ` Alexandre Oliva
  2023-11-15  2:44                 ` Alexandre Oliva
  1 sibling, 1 reply; 18+ messages in thread
From: Patrick Palka @ 2023-11-15  2:20 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Alexandre Oliva, François Dumont, Jonathan Wakely, libstdc++

On Thu, 9 Nov 2023, Jonathan Wakely wrote:

> On Thu, 9 Nov 2023 at 19:49, Alexandre Oliva <oliva@adacore.com> wrote:
> >
> > On Nov  9, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:
> >
> > > But I've just realised we probably want to #undef the macro at the end
> > > of bits/stl_bvector.h too.
> >
> > I'm not sure why (what if another libstdc++ header were to define the
> > macro, includes stl_bvector.h, and then use the macro expecting it to
> > still be there?), but I suppose this is what you mean.
> 
> It's consistent with all the other definitions of the macro in our
> headers. We always define it locally and then undef it again at the
> end of the header. You're right that that makes it rather hard to use
> reliably.

Hmm, in which headers do we currently undef _GLIBCXX_ALWAYS_INLINE?  A
bunch of headers seem to conditionally define it, e.g. bits/atomic_futex.h,
but none seem to undef it.  I wonder why we don't just conditionally
define this macro once in c++config?

> 
> >  Regstrapped on
> > x86_64-linux-gnu just to be sure.  Ok to install?
> 
> OK thanks.
> 
> 
> >
> >
> > From: Alexandre Oliva <oliva@adacore.com>
> >
> > It's customary to undefine temporary internal macros at the end of the
> > header that defines them, even such widely-usable ones as
> > _GLIBCXX_ALWAYS_INLINE, so do so in the header where the define was
> > recently introduced.
> >
> >
> > for  libstdc++-v3/ChangeLog
> >
> >         * include/bits/stl_bvector.h (_GLIBCXX_ALWAYS_INLINE): Undef.
> > ---
> >  libstdc++-v3/include/bits/stl_bvector.h |    2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
> > index 2b91af2005f2d..1b7648535c523 100644
> > --- a/libstdc++-v3/include/bits/stl_bvector.h
> > +++ b/libstdc++-v3/include/bits/stl_bvector.h
> > @@ -1628,4 +1628,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> >  _GLIBCXX_END_NAMESPACE_VERSION
> >  } // namespace std
> >
> > +#undef _GLIBCXX_ALWAYS_INLINE
> > +
> >  #endif
> >
> > --
> > Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
> >    Free Software Activist                   GNU Toolchain Engineer
> > More tolerance and less prejudice are key for inclusion and diversity
> > Excluding neuro-others for not behaving ""normal"" is *not* inclusive
> >
> 
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-09 20:18               ` Jonathan Wakely
  2023-11-15  2:20                 ` Patrick Palka
@ 2023-11-15  2:44                 ` Alexandre Oliva
  2023-11-15  5:08                   ` Alexandre Oliva
  2023-11-15  8:22                   ` Jonathan Wakely
  1 sibling, 2 replies; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-15  2:44 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: François Dumont, Jonathan Wakely, libstdc++

On Nov  9, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:

> On Thu, 9 Nov 2023 at 19:49, Alexandre Oliva <oliva@adacore.com> wrote:
>> 
>> On Nov  9, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:
>> 
>> > But I've just realised we probably want to #undef the macro at the end
>> > of bits/stl_bvector.h too.
>> 
>> I'm not sure why (what if another libstdc++ header were to define the
>> macro, includes stl_bvector.h, and then use the macro expecting it to
>> still be there?), but I suppose this is what you mean.

> It's consistent with all the other definitions of the macro in our
> headers. We always define it locally and then undef it again at the
> end of the header. You're right that that makes it rather hard to use
> reliably.

Not only that.  It also seems to cause failures, but I messed up in my
testing and didn't catch them.

We've seen gcc-13 regressions in g++.dg/modules/xtreme-header* and
27_io/basic_syncbuf/sync_ops/1.cc.  I suspect they're going to show up
in trunk as well starting today, since I've just installed this bit
(unfortunately, shortly before I learned about the regressions)

I'm about to rerun trunk testing and, if I confirm the regressions, I'll
revert this followup patch.

The symptom in gcc-13 is that bits/semaphore_base.h fails to compile
because _GLIBCXX_ALWAYS_INLINE is no longer available.  AFAICT it relied
on the define from bits/atomic_base.h included by <atomic>, but <chrono>
includes <vector> that, in stl_bvector.h, undefines the macro, and then
when bits/semaphore_base.h includes bits/atomic_base.h again, the macro
doesn't get defined again.  We should probably define and undefine it
explicitly in semaphore_base.h as well, and everywhere else that uses
it, if the current policy is to be maintained.

Backporters beware ;-)

>> * include/bits/stl_bvector.h (_GLIBCXX_ALWAYS_INLINE): Undef.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-15  2:44                 ` Alexandre Oliva
@ 2023-11-15  5:08                   ` Alexandre Oliva
  2023-11-15  8:22                   ` Jonathan Wakely
  1 sibling, 0 replies; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-15  5:08 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: François Dumont, Jonathan Wakely, libstdc++

On Nov 14, 2023, Alexandre Oliva <oliva@adacore.com> wrote:

> Not only that.  It also seems to cause failures, but I messed up in my
> testing and didn't catch them.

Confirmed, at least xtreme-header* regresses, so I reverted it.

What now?  Do we (a) stick to the policy, and amend headers that use
_GLIBCXX_ALWAYS_INLINE to (a.i) define it after any other headers, and
(a.ii) undefine it at the end, (b) use Patrick's suggestion of
implementing it in c++config.h, (c) formally relax the current
(incompletely-implemented) policy and allow headers to leave the macro
defined, so that others can rely on it, or (d) introduce a new header
with the sole purpose of getting this macro defined?

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-15  2:20                 ` Patrick Palka
@ 2023-11-15  5:53                   ` Alexandre Oliva
  0 siblings, 0 replies; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-15  5:53 UTC (permalink / raw)
  To: Patrick Palka
  Cc: Jonathan Wakely, François Dumont, Jonathan Wakely, libstdc++

On Nov 14, 2023, Patrick Palka <ppalka@redhat.com> wrote:

> Hmm, in which headers do we currently undef _GLIBCXX_ALWAYS_INLINE?

None whatsoever, AFAICT.  Of the 7 headers that mention
_GLIBCXX_ALWAYS_INLINE, some (but not all) #define it, and none #undef it.

Bringing them in line with the alleged custom would require an
(untested) patch like this.  (It also makes sure that no #include stands
between the #define and the uses)


libstdc++: define and undefine always_inline macro wherever used

---
 libstdc++-v3/include/bits/atomic_base.h    |   10 ++++++----
 libstdc++-v3/include/bits/atomic_futex.h   |    2 ++
 libstdc++-v3/include/bits/char_traits.h    |    8 +++++---
 libstdc++-v3/include/bits/semaphore_base.h |    7 +++++++
 libstdc++-v3/include/bits/stl_bvector.h    |   10 ++++++----
 libstdc++-v3/include/std/barrier           |    8 ++++++++
 libstdc++-v3/include/std/latch             |    7 +++++++
 7 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index 974872ad7a630..5562365c74c74 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -42,14 +42,14 @@
 #include <bits/atomic_wait.h>
 #endif
 
-#ifndef _GLIBCXX_ALWAYS_INLINE
-#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
-#endif
-
 #define __glibcxx_want_atomic_value_initialization
 #define __glibcxx_want_atomic_flag_test
 #include <bits/version.h>
 
+#ifndef _GLIBCXX_ALWAYS_INLINE
+#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -2067,4 +2067,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif
diff --git a/libstdc++-v3/include/bits/atomic_futex.h b/libstdc++-v3/include/bits/atomic_futex.h
index 6b37ce5ed120e..0f370fa82de06 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -357,4 +357,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif
diff --git a/libstdc++-v3/include/bits/char_traits.h b/libstdc++-v3/include/bits/char_traits.h
index e9b4e84af994f..e56818b254b67 100644
--- a/libstdc++-v3/include/bits/char_traits.h
+++ b/libstdc++-v3/include/bits/char_traits.h
@@ -57,13 +57,13 @@
 # include <bits/stl_construct.h>
 #endif
 
+#define __glibcxx_want_constexpr_char_traits
+#include <bits/version.h>
+
 #ifndef _GLIBCXX_ALWAYS_INLINE
 # define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
 #endif
 
-#define __glibcxx_want_constexpr_char_traits
-#include <bits/version.h>
-
 namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -1027,4 +1027,6 @@ _GLIBCXX_END_NAMESPACE_VERSION
 
 #endif  // C++11
 
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif // _CHAR_TRAITS_H
diff --git a/libstdc++-v3/include/bits/semaphore_base.h b/libstdc++-v3/include/bits/semaphore_base.h
index 2b3c196040baa..06a45f6876a26 100644
--- a/libstdc++-v3/include/bits/semaphore_base.h
+++ b/libstdc++-v3/include/bits/semaphore_base.h
@@ -45,6 +45,10 @@
 # include <semaphore.h>	// sem_t, sem_init, sem_wait, sem_post etc.
 #endif
 
+#ifndef _GLIBCXX_ALWAYS_INLINE
+#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -274,4 +278,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
+
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif // _GLIBCXX_SEMAPHORE_BASE_H
diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 64f04c1f4f599..c1aa3396414d2 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -56,15 +56,15 @@
 #ifndef _STL_BVECTOR_H
 #define _STL_BVECTOR_H 1
 
-#ifndef _GLIBCXX_ALWAYS_INLINE
-#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
-#endif
-
 #if __cplusplus >= 201103L
 #include <initializer_list>
 #include <bits/functional_hash.h>
 #endif
 
+#ifndef _GLIBCXX_ALWAYS_INLINE
+#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -1630,4 +1630,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif
diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index 8e03a5820c6e5..5aaffa62086c4 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -52,6 +52,10 @@
 
 #include <array>
 
+#ifndef _GLIBCXX_ALWAYS_INLINE
+#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -261,5 +265,9 @@ It looks different from literature pseudocode for two main reasons:
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
+
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif // __cpp_lib_barrier
+
 #endif // _GLIBCXX_BARRIER
diff --git a/libstdc++-v3/include/std/latch b/libstdc++-v3/include/std/latch
index 27cd80d71005f..0548ddd67ae57 100644
--- a/libstdc++-v3/include/std/latch
+++ b/libstdc++-v3/include/std/latch
@@ -40,6 +40,10 @@
 #include <bits/atomic_base.h>
 #include <ext/numeric_traits.h>
 
+#ifndef _GLIBCXX_ALWAYS_INLINE
+#define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__))
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -90,5 +94,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   };
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
+
+#undef _GLIBCXX_ALWAYS_INLINE
+
 #endif // __cpp_lib_latch
 #endif // _GLIBCXX_LATCH


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-15  2:44                 ` Alexandre Oliva
  2023-11-15  5:08                   ` Alexandre Oliva
@ 2023-11-15  8:22                   ` Jonathan Wakely
  2023-11-16  4:40                     ` Alexandre Oliva
  1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Wakely @ 2023-11-15  8:22 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: François Dumont, Jonathan Wakely, libstdc++

On Wed, 15 Nov 2023 at 02:44, Alexandre Oliva <oliva@adacore.com> wrote:
>
> On Nov  9, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> > On Thu, 9 Nov 2023 at 19:49, Alexandre Oliva <oliva@adacore.com> wrote:
> >>
> >> On Nov  9, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:
> >>
> >> > But I've just realised we probably want to #undef the macro at the end
> >> > of bits/stl_bvector.h too.
> >>
> >> I'm not sure why (what if another libstdc++ header were to define the
> >> macro, includes stl_bvector.h, and then use the macro expecting it to
> >> still be there?), but I suppose this is what you mean.
>
> > It's consistent with all the other definitions of the macro in our
> > headers. We always define it locally and then undef it again at the
> > end of the header. You're right that that makes it rather hard to use
> > reliably.
>
> Not only that.  It also seems to cause failures, but I messed up in my
> testing and didn't catch them.
>
> We've seen gcc-13 regressions in g++.dg/modules/xtreme-header* and
> 27_io/basic_syncbuf/sync_ops/1.cc.  I suspect they're going to show up
> in trunk as well starting today, since I've just installed this bit
> (unfortunately, shortly before I learned about the regressions)
>
> I'm about to rerun trunk testing and, if I confirm the regressions, I'll
> revert this followup patch.

Yes, please revert. I misremembered.

I thought I even checked my assumption, but I don't see it in my shell
history now.

Sorry for the mixup.

>
> The symptom in gcc-13 is that bits/semaphore_base.h fails to compile
> because _GLIBCXX_ALWAYS_INLINE is no longer available.  AFAICT it relied
> on the define from bits/atomic_base.h included by <atomic>, but <chrono>
> includes <vector> that, in stl_bvector.h, undefines the macro, and then
> when bits/semaphore_base.h includes bits/atomic_base.h again, the macro
> doesn't get defined again.  We should probably define and undefine it
> explicitly in semaphore_base.h as well, and everywhere else that uses
> it, if the current policy is to be maintained.
>
> Backporters beware ;-)
>
> >> * include/bits/stl_bvector.h (_GLIBCXX_ALWAYS_INLINE): Undef.
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] libstdc++: bvector: undef always_inline macro
  2023-11-15  8:22                   ` Jonathan Wakely
@ 2023-11-16  4:40                     ` Alexandre Oliva
  0 siblings, 0 replies; 18+ messages in thread
From: Alexandre Oliva @ 2023-11-16  4:40 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: François Dumont, Jonathan Wakely, libstdc++

On Nov 15, 2023, Jonathan Wakely <jwakely@redhat.com> wrote:

> I thought I even checked my assumption, but I don't see it in my shell
> history now.

Heh, I could promise I checked it myself.  ISTR even locating and
copying it from another header, but maybe it was the define rather than
the undef.  Oh well.  Sorry about the temporary regression.

> Sorry for the mixup.

No worries there

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2023-11-09  1:22     ` Jonathan Wakely
  2023-11-09  3:36       ` [PATCH v3] " Alexandre Oliva
@ 2024-02-07 16:25       ` Torbjorn SVENSSON
  2024-02-07 16:36         ` Jonathan Wakely
  1 sibling, 1 reply; 18+ messages in thread
From: Torbjorn SVENSSON @ 2024-02-07 16:25 UTC (permalink / raw)
  To: Jonathan Wakely, Alexandre Oliva
  Cc: Jonathan Wakely, gcc-patches, libstdc++, Yvan Roux

Hi,

Is it okay to backport e39b3e02c27bd771a07e385f9672ecf1a45ced77 to 
releases/gcc-13?

Without this backport, I see this failure on arm-none-eabi:

FAIL: 23_containers/vector/bool/110807.cc (test for excess errors)

Kind regards,
Torbjörn


On 2023-11-09 02:22, Jonathan Wakely wrote:
> 
> 
> On Thu, 9 Nov 2023, 01:17 Alexandre Oliva, <oliva@adacore.com 
> <mailto:oliva@adacore.com>> wrote:
> 
>     On Nov  8, 2023, Jonathan Wakely <jwakely@redhat.com
>     <mailto:jwakely@redhat.com>> wrote:
> 
>      > A single underscore prefix on __GLIBCXX_BUILTIN_ASSUME and
>      > __GLIBCXX_DISABLE_ASSUMPTIONS please.
> 
>     That's entirely gone now.
> 
>      >> +    do                                              \
>      >> +      if (std::is_constant_evaluated ())    \
>      >> +    static_assert(expr);                    \
> 
>      > This can never be valid.
> 
>     *nod*
> 
>      > This already works fine in constant evaluation anyway.
> 
>     Yeah, that's what I figured.
> 
>      > But what's the null dereference for?
> 
>     The idea was to clearly trigger undefined behavior.  Maybe it wasn't
>     needed, it didn't occur to me that __builtin_unreachable() would be
>     enough.  I realize I was really trying to emulate attribute assume, even
>     without knowing it existed ;-)
> 
>      >> +#define __GLIBCXX_BUILTIN_ASSUME(expr)              \
>      >> +    (void)(false && (expr))
> 
>      > What's the point of this, just to verify that (expr) is contextually
>      > convertible to bool?
> 
>     I'd have phrased it as "avoid the case in which something compiles with
>     -O0 but not with -O", but yeah ;-)
> 
>      > We don't use the _p suffix for predicates in the library.
>      > Please use just _M_normalized or _M_is_normalized.
> 
>     ACK.  It's also gone now.
> 
>      > But do we even need this function? It's not used anywhere else,
>     can we
>      > just inline the condition into _M_assume_normalized() ?
> 
>     I had other uses for it in earlier versions of the patch, but it makes
>     no sense any more indeed.
> 
>      >> +    _GLIBCXX20_CONSTEXPR
>      >> +    void
>      >> +    _M_assume_normalized() const
> 
>      > I think this should use _GLIBCXX_ALWAYS_INLINE
> 
>     *nod*, thanks
> 
>      >> +    {
>      >> +      __GLIBCXX_BUILTIN_ASSUME (_M_normalized_p ());
> 
>      > Is there even any benefit to this macro?
> 
>     I just thought it could have other uses, without being aware that the
>     entire concept was available as a statement attribute.  Funny, I'd even
>     searched for it among the existing attributes and builtins, but somehow
>     I managed to miss it.  Thanks for getting me back down that path.
> 
>      >        __attribute__((__assume__(_M_offset <
>     unsigned(_S_word_bit))));
> 
>     That unfortunately doesn't work, because the assume lowering doesn't go
>     as far as dereferencing the implicit this and making an SSA_NAME out of
>     the loaded _M_offset, which we'd need to be able to optimize based on
>     it.  But that only took me a while to figure out and massage into
>     something that had the desired effect.  Now, maybe the above *should*
>     have that effect already, but unfortunately it doesn't.
> 
>      > Maybe even get rid of _M_assume_normalized() as a function and just
>      > put that attribute everywhere you currently use _M_assume_normalized.
> 
>     Because of the slight kludge required to make the attribute have the
>     desired effect (namely ensuring the _M_offset reference is evaluated),
>     I've retained it as an inline function.
> 
>     Here's what I'm retesting now.  WDYT?
> 
> 
> ofst needs to be __ofst but OK for trunk with that change.
> 
> We probably want this on the gcc-13 branch too, but let's give it some 
> time on trunk in case the assume attribute isn't quite ready for prime time.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2024-02-07 16:25       ` [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807] Torbjorn SVENSSON
@ 2024-02-07 16:36         ` Jonathan Wakely
  2024-02-09  8:49           ` Torbjorn SVENSSON
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Wakely @ 2024-02-07 16:36 UTC (permalink / raw)
  To: Torbjorn SVENSSON
  Cc: Jonathan Wakely, Alexandre Oliva, gcc-patches, libstdc++, Yvan Roux

[-- Attachment #1: Type: text/plain, Size: 4430 bytes --]

On Wed, 7 Feb 2024 at 16:25, Torbjorn SVENSSON <
torbjorn.svensson@foss.st.com> wrote:

> Hi,
>
> Is it okay to backport e39b3e02c27bd771a07e385f9672ecf1a45ced77 to
> releases/gcc-13?
>

It would also need 807f47497f17ed50be91f0f879308cb6fa063966

Please test with that as well, and OK for both if all goes well.



> Without this backport, I see this failure on arm-none-eabi:
>
> FAIL: 23_containers/vector/bool/110807.cc (test for excess errors)
>
> Kind regards,
> Torbjörn
>
>
> On 2023-11-09 02:22, Jonathan Wakely wrote:
> >
> >
> > On Thu, 9 Nov 2023, 01:17 Alexandre Oliva, <oliva@adacore.com
> > <mailto:oliva@adacore.com>> wrote:
> >
> >     On Nov  8, 2023, Jonathan Wakely <jwakely@redhat.com
> >     <mailto:jwakely@redhat.com>> wrote:
> >
> >      > A single underscore prefix on __GLIBCXX_BUILTIN_ASSUME and
> >      > __GLIBCXX_DISABLE_ASSUMPTIONS please.
> >
> >     That's entirely gone now.
> >
> >      >> +    do                                              \
> >      >> +      if (std::is_constant_evaluated ())    \
> >      >> +    static_assert(expr);                    \
> >
> >      > This can never be valid.
> >
> >     *nod*
> >
> >      > This already works fine in constant evaluation anyway.
> >
> >     Yeah, that's what I figured.
> >
> >      > But what's the null dereference for?
> >
> >     The idea was to clearly trigger undefined behavior.  Maybe it wasn't
> >     needed, it didn't occur to me that __builtin_unreachable() would be
> >     enough.  I realize I was really trying to emulate attribute assume,
> even
> >     without knowing it existed ;-)
> >
> >      >> +#define __GLIBCXX_BUILTIN_ASSUME(expr)              \
> >      >> +    (void)(false && (expr))
> >
> >      > What's the point of this, just to verify that (expr) is
> contextually
> >      > convertible to bool?
> >
> >     I'd have phrased it as "avoid the case in which something compiles
> with
> >     -O0 but not with -O", but yeah ;-)
> >
> >      > We don't use the _p suffix for predicates in the library.
> >      > Please use just _M_normalized or _M_is_normalized.
> >
> >     ACK.  It's also gone now.
> >
> >      > But do we even need this function? It's not used anywhere else,
> >     can we
> >      > just inline the condition into _M_assume_normalized() ?
> >
> >     I had other uses for it in earlier versions of the patch, but it
> makes
> >     no sense any more indeed.
> >
> >      >> +    _GLIBCXX20_CONSTEXPR
> >      >> +    void
> >      >> +    _M_assume_normalized() const
> >
> >      > I think this should use _GLIBCXX_ALWAYS_INLINE
> >
> >     *nod*, thanks
> >
> >      >> +    {
> >      >> +      __GLIBCXX_BUILTIN_ASSUME (_M_normalized_p ());
> >
> >      > Is there even any benefit to this macro?
> >
> >     I just thought it could have other uses, without being aware that the
> >     entire concept was available as a statement attribute.  Funny, I'd
> even
> >     searched for it among the existing attributes and builtins, but
> somehow
> >     I managed to miss it.  Thanks for getting me back down that path.
> >
> >      >        __attribute__((__assume__(_M_offset <
> >     unsigned(_S_word_bit))));
> >
> >     That unfortunately doesn't work, because the assume lowering doesn't
> go
> >     as far as dereferencing the implicit this and making an SSA_NAME out
> of
> >     the loaded _M_offset, which we'd need to be able to optimize based on
> >     it.  But that only took me a while to figure out and massage into
> >     something that had the desired effect.  Now, maybe the above *should*
> >     have that effect already, but unfortunately it doesn't.
> >
> >      > Maybe even get rid of _M_assume_normalized() as a function and
> just
> >      > put that attribute everywhere you currently use
> _M_assume_normalized.
> >
> >     Because of the slight kludge required to make the attribute have the
> >     desired effect (namely ensuring the _M_offset reference is
> evaluated),
> >     I've retained it as an inline function.
> >
> >     Here's what I'm retesting now.  WDYT?
> >
> >
> > ofst needs to be __ofst but OK for trunk with that change.
> >
> > We probably want this on the gcc-13 branch too, but let's give it some
> > time on trunk in case the assume attribute isn't quite ready for prime
> time.
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807]
  2024-02-07 16:36         ` Jonathan Wakely
@ 2024-02-09  8:49           ` Torbjorn SVENSSON
  0 siblings, 0 replies; 18+ messages in thread
From: Torbjorn SVENSSON @ 2024-02-09  8:49 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Jonathan Wakely, Alexandre Oliva, gcc-patches, libstdc++, Yvan Roux



On 2024-02-07 17:36, Jonathan Wakely wrote:
> 
> 
> On Wed, 7 Feb 2024 at 16:25, Torbjorn SVENSSON 
> <torbjorn.svensson@foss.st.com <mailto:torbjorn.svensson@foss.st.com>> 
> wrote:
> 
>     Hi,
> 
>     Is it okay to backport e39b3e02c27bd771a07e385f9672ecf1a45ced77 to
>     releases/gcc-13?
> 
> 
> It would also need 807f47497f17ed50be91f0f879308cb6fa063966
> 
> Please test with that as well, and OK for both if all goes well.

Test was passed on arm-none-eabi for both these.
Pushed as 5d684a5f7f82b1425cac5eaa11dccc3b51a62d44 and 
adef1e0ebeb5055ed4e8776fd3f73c9d84821eaa.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-02-09  8:49 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-08 16:10 [PATCH] libstdc++: optimize bit iterators assuming normalization [PR110807] Alexandre Oliva
2023-11-08 19:32 ` Jonathan Wakely
2023-11-09  1:17   ` [PATCH v2] " Alexandre Oliva
2023-11-09  1:22     ` Jonathan Wakely
2023-11-09  3:36       ` [PATCH v3] " Alexandre Oliva
2023-11-09  5:57         ` François Dumont
2023-11-09  8:16           ` Jonathan Wakely
2023-11-09 19:49             ` [PATCH] libstdc++: bvector: undef always_inline macro Alexandre Oliva
2023-11-09 20:18               ` Jonathan Wakely
2023-11-15  2:20                 ` Patrick Palka
2023-11-15  5:53                   ` Alexandre Oliva
2023-11-15  2:44                 ` Alexandre Oliva
2023-11-15  5:08                   ` Alexandre Oliva
2023-11-15  8:22                   ` Jonathan Wakely
2023-11-16  4:40                     ` Alexandre Oliva
2024-02-07 16:25       ` [PATCH v2] libstdc++: optimize bit iterators assuming normalization [PR110807] Torbjorn SVENSSON
2024-02-07 16:36         ` Jonathan Wakely
2024-02-09  8:49           ` Torbjorn SVENSSON

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).