[PATCH 0/8] std::experimental::simd patchset

* [PATCH 0/8] std::experimental::simd patchset
@ 2023-02-23  8:48 Matthias Kretz
  2023-02-23  8:49 ` [PATCH 1/8] libstdc++: Simplify three helper functions into one Matthias Kretz
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:48 UTC (permalink / raw)
  To: gcc-patches, libstdc++

Tested on x86_64-pc-linux.

This patchset provides the final changes for PR108030 and resolves
PR108856. The latter is a pure optimization and could wait for Stage 1;
I am submitting it now because simd is experimental (a TS).

Matthias Kretz (8):
  libstdc++: Simplify three helper functions into one
  libstdc++: Fix simd build failure on clang
  libstdc++: More efficient masked inc-/decrement implementation
  libstdc++: Add missing constexpr on simd shift implementation
  libstdc++: Always-inline most of non-cmath fixed_size implementation
  libstdc++: Fix formatting
  libstdc++: Fix -Wsign-compare issue
  libstdc++: Test that integral simd reductions are precise

 libstdc++-v3/include/experimental/bits/simd.h | 485 ++++++------
 .../include/experimental/bits/simd_builtin.h  | 721 +++++++++---------
 .../include/experimental/bits/simd_detail.h   |   3 +-
 .../experimental/bits/simd_fixed_size.h       | 286 ++++---
 .../include/experimental/bits/simd_neon.h     |  24 +-
 .../include/experimental/bits/simd_ppc.h      |   3 +-
 .../include/experimental/bits/simd_scalar.h   | 362 +++++----
 .../include/experimental/bits/simd_x86.h      | 158 ++--
 .../experimental/simd/tests/reductions.cc     |   3 +-
 9 files changed, 1075 insertions(+), 970 deletions(-)

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──────────────────────────────────────────────────────────────────────────



* [PATCH 1/8] libstdc++: Simplify three helper functions into one
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
@ 2023-02-23  8:49 ` Matthias Kretz
  2023-02-23 11:05   ` Jonathan Wakely
  2023-02-23  8:49 ` [PATCH 2/8] libstdc++: Fix simd build failure on clang Matthias Kretz
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:49 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1085 bytes --]



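Broadcast is a very common operation. Implementing it with a single
helper and a pack expansion, instead of __call_with_n_evaluations plus
two lambdas, should reduce compile-time effort.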

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	PR libstdc++/108030
	* include/experimental/bits/simd.h (__vector_broadcast):
	Implement via __vector_broadcast_impl instead of
	__call_with_n_evaluations + 2 lambdas.
	(__vector_broadcast_impl): New.
---
 libstdc++-v3/include/experimental/bits/simd.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)



[-- Attachment #2: 0001-libstdc-Simplify-three-helper-functions-into-one.patch --]
[-- Type: text/x-patch, Size: 1049 bytes --]

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 2f615d13b73..7482d109291 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1798,15 +1798,15 @@ __to_intrin(_Tp __x)
 
 // }}}
 // __vector_broadcast{{{
+template <size_t _Np, typename _Tp, size_t... _I>
+  _GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
+  __vector_broadcast_impl(_Tp __x, index_sequence<_I...>)
+  { return __vector_type_t<_Tp, _Np>{((void)_I, __x)...}; }
+
 template <size_t _Np, typename _Tp>
   _GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
   __vector_broadcast(_Tp __x)
-  {
-    return __call_with_n_evaluations<_Np>(
-      [](auto... __xx) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
-	return __vector_type_t<_Tp, _Np>{__xx...};
-      }, [&__x](int) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { return __x; });
-  }
+  { return __vector_broadcast_impl<_Np, _Tp>(__x, make_index_sequence<_Np>()); }
 
 // }}}
 // __generate_vector{{{
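
The pack expansion `((void)_I, __x)...` used above can be sketched in
isolation. This is only an illustration under stated assumptions, not the
libstdc++ code: std::array stands in for __vector_type_t, and the names
broadcast and broadcast_impl are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for __vector_broadcast_impl: each index in the pack
// is evaluated and discarded, so the initializer list holds N copies of x.
template <std::size_t N, typename T, std::size_t... I>
  constexpr std::array<T, N>
  broadcast_impl(T x, std::index_sequence<I...>)
  { return std::array<T, N>{((void)I, x)...}; }

// Hypothetical stand-in for __vector_broadcast: one helper call, no lambdas.
template <std::size_t N, typename T>
  constexpr std::array<T, N>
  broadcast(T x)
  { return broadcast_impl<N, T>(x, std::make_index_sequence<N>()); }

// The whole thing is usable in constant expressions.
constexpr auto sevens = broadcast<4>(7);
static_assert(sevens[0] == 7 && sevens[3] == 7, "every lane holds x");
```

The compiler only has to instantiate one function template per (N, T) pair
here, rather than a generic helper plus two closure types.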


* [PATCH 2/8] libstdc++: Fix simd build failure on clang
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
  2023-02-23  8:49 ` [PATCH 1/8] libstdc++: Simplify three helper functions into one Matthias Kretz
@ 2023-02-23  8:49 ` Matthias Kretz
  2023-02-23 11:06   ` Jonathan Wakely
  2023-02-23  8:49 ` [PATCH 3/8] libstdc++: More efficient masked inc-/decrement implementation Matthias Kretz
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:49 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]



Clang does not support __attribute__ on lambdas. Therefore, define
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA as empty if __clang__ is defined.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	PR libstdc++/108030
	* include/experimental/bits/simd_detail.h
	(_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA): Define as empty for
	__clang__.
---
 libstdc++-v3/include/experimental/bits/simd_detail.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)



[-- Attachment #2: 0002-libstdc-Fix-simd-build-failure-on-clang.patch --]
[-- Type: text/x-patch, Size: 1156 bytes --]

diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h
index a0ad10efe0f..30cc1ef0eef 100644
--- a/libstdc++-v3/include/experimental/bits/simd_detail.h
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -254,15 +254,16 @@ namespace experimental
 
 #ifdef __clang__
 #define _GLIBCXX_SIMD_NORMAL_MATH
+#define _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA
 #else
 #define _GLIBCXX_SIMD_NORMAL_MATH                                              \
   [[__gnu__::__optimize__("finite-math-only,no-signed-zeros")]]
+#define _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA __attribute__((__always_inline__))
 #endif
 #define _GLIBCXX_SIMD_NEVER_INLINE [[__gnu__::__noinline__]]
 #define _GLIBCXX_SIMD_INTRINSIC                                                \
   [[__gnu__::__always_inline__, __gnu__::__artificial__]] inline
 #define _GLIBCXX_SIMD_ALWAYS_INLINE [[__gnu__::__always_inline__]] inline
-#define _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA __attribute__((__always_inline__))
 #define _GLIBCXX_SIMD_IS_UNLIKELY(__x) __builtin_expect(__x, 0)
 #define _GLIBCXX_SIMD_IS_LIKELY(__x) __builtin_expect(__x, 1)
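
A reduced sketch of the pattern in this patch, assuming GCC or Clang as
the compiler. MY_ALWAYS_INLINE_LAMBDA and sum_of_squares are hypothetical
names chosen for illustration; only the placement of the attribute
mirrors the simd headers.

```cpp
#include <cassert>

// Hypothetical macro following the patch: empty under Clang, the GNU
// attribute otherwise (GCC assumed).
#ifdef __clang__
#define MY_ALWAYS_INLINE_LAMBDA
#else
#define MY_ALWAYS_INLINE_LAMBDA __attribute__((__always_inline__))
#endif

inline int
sum_of_squares(int a, int b)
{
  // The attribute goes between the parameter list and the lambda body.
  auto square = [](int x) MY_ALWAYS_INLINE_LAMBDA { return x * x; };
  return square(a) + square(b);
}
```

Because the macro expands to nothing under Clang, every use site stays
unchanged; only the inlining guarantee differs between compilers.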
 


* [PATCH 3/8] libstdc++: More efficient masked inc-/decrement implementation
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
  2023-02-23  8:49 ` [PATCH 1/8] libstdc++: Simplify three helper functions into one Matthias Kretz
  2023-02-23  8:49 ` [PATCH 2/8] libstdc++: Fix simd build failure on clang Matthias Kretz
@ 2023-02-23  8:49 ` Matthias Kretz
  2023-02-24 17:12   ` Jonathan Wakely
  2023-02-23  8:49 ` [PATCH 4/8] libstdc++: Add missing constexpr on simd shift implementation Matthias Kretz
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:49 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1216 bytes --]



Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	PR libstdc++/108856
	* include/experimental/bits/simd_builtin.h
	(_SimdImplBuiltin::_S_masked_unary): More efficient
	implementation of masked inc-/decrement for integers and floats
	without AVX2.
	* include/experimental/bits/simd_x86.h
	(_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
	builtins for masked inc-/decrement.
---
 .../include/experimental/bits/simd_builtin.h  | 27 +++++++-
 .../include/experimental/bits/simd_x86.h      | 68 +++++++++++++++++++
 2 files changed, 93 insertions(+), 2 deletions(-)



[-- Attachment #2: 0003-libstdc-More-efficient-masked-inc-decrement-implemen.patch --]
[-- Type: text/x-patch, Size: 4989 bytes --]

diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
index 792439a81bf..4a4de4534f3 100644
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -2546,8 +2546,31 @@ _S_masked_unary(const _SimdWrapper<_K, _Np> __k,
 	_Op<decltype(__vv)> __op;
 	if (__k._M_is_constprop_all_of())
 	  return __data(__op(__vv));
-	else
-	  return _CommonImpl::_S_blend(__k, __v, __data(__op(__vv)));
+	else if constexpr (is_same_v<_Op<void>, __increment<void>>)
+	  {
+	    static_assert(not std::is_same_v<_K, bool>);
+	    if constexpr (is_integral_v<_Tp>)
+	      // Take a shortcut knowing that __k is an integer vector with values -1 or 0.
+	      return __v._M_data - __vector_bitcast<_Tp>(__k._M_data);
+	    else if constexpr (not __have_avx2)
+	      return __v._M_data
+		       + __vector_bitcast<_Tp>(__k._M_data & __builtin_bit_cast(
+							       _K, _Tp(1)));
+	    // starting with AVX2 it is more efficient to blend after add
+	  }
+	else if constexpr (is_same_v<_Op<void>, __decrement<void>>)
+	  {
+	    static_assert(not std::is_same_v<_K, bool>);
+	    if constexpr (is_integral_v<_Tp>)
+	      // Take a shortcut knowing that __k is an integer vector with values -1 or 0.
+	      return __v._M_data + __vector_bitcast<_Tp>(__k._M_data);
+	    else if constexpr (not __have_avx2)
+	      return __v._M_data
+		       - __vector_bitcast<_Tp>(__k._M_data & __builtin_bit_cast(
+							       _K, _Tp(1)));
+	    // starting with AVX2 it is more efficient to blend after sub
+	  }
+	return _CommonImpl::_S_blend(__k, __v, __data(__op(__vv)));
       }
 
     //}}}2
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
index dcfdc2a9496..897a67829d1 100644
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -3462,6 +3462,74 @@ _S_islessgreater(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
       }
 
     //}}} }}}
+    template <template <typename> class _Op, typename _Tp, typename _K,
+	      size_t _Np>
+      _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+      _S_masked_unary(const _SimdWrapper<_K, _Np> __k,
+		      const _SimdWrapper<_Tp, _Np> __v)
+      {
+	if (__k._M_is_constprop_none_of())
+	  return __v;
+	else if (__k._M_is_constprop_all_of())
+	  {
+	    auto __vv = _Base::_M_make_simd(__v);
+	    _Op<decltype(__vv)> __op;
+	    return __data(__op(__vv));
+	  }
+	else if constexpr (__is_bitmask_v<decltype(__k)>
+			     && (is_same_v<_Op<void>, __increment<void>>
+				   || is_same_v<_Op<void>, __decrement<void>>))
+	  {
+	    // optimize masked unary increment and decrement as masked sub +/-1
+	    constexpr int __pm_one
+	      = is_same_v<_Op<void>, __increment<void>> ? -1 : 1;
+	    if constexpr (is_integral_v<_Tp>)
+	      {
+		constexpr bool __lp64 = sizeof(long) == sizeof(long long);
+		using _Ip = std::make_signed_t<_Tp>;
+		using _Up = std::conditional_t<
+			      std::is_same_v<_Ip, long>,
+			      std::conditional_t<__lp64, long long, int>,
+			      std::conditional_t<
+				std::is_same_v<_Ip, signed char>, char, _Ip>>;
+		const auto __value = __vector_bitcast<_Up>(__v._M_data);
+#define _GLIBCXX_SIMD_MASK_SUB(_Sizeof, _Width, _Instr)                        \
+  if constexpr (sizeof(_Tp) == _Sizeof && sizeof(__v) == _Width)               \
+    return __vector_bitcast<_Tp>(__builtin_ia32_##_Instr##_mask(__value,       \
+	     __vector_broadcast<_Np>(_Up(__pm_one)), __value, __k._M_data))
+		_GLIBCXX_SIMD_MASK_SUB(1, 64, psubb512);
+		_GLIBCXX_SIMD_MASK_SUB(1, 32, psubb256);
+		_GLIBCXX_SIMD_MASK_SUB(1, 16, psubb128);
+		_GLIBCXX_SIMD_MASK_SUB(2, 64, psubw512);
+		_GLIBCXX_SIMD_MASK_SUB(2, 32, psubw256);
+		_GLIBCXX_SIMD_MASK_SUB(2, 16, psubw128);
+		_GLIBCXX_SIMD_MASK_SUB(4, 64, psubd512);
+		_GLIBCXX_SIMD_MASK_SUB(4, 32, psubd256);
+		_GLIBCXX_SIMD_MASK_SUB(4, 16, psubd128);
+		_GLIBCXX_SIMD_MASK_SUB(8, 64, psubq512);
+		_GLIBCXX_SIMD_MASK_SUB(8, 32, psubq256);
+		_GLIBCXX_SIMD_MASK_SUB(8, 16, psubq128);
+#undef _GLIBCXX_SIMD_MASK_SUB
+	      }
+	    else
+	      {
+#define _GLIBCXX_SIMD_MASK_SUB(_Sizeof, _Width, _Instr)                        \
+  if constexpr (sizeof(_Tp) == _Sizeof && sizeof(__v) == _Width)               \
+    return __builtin_ia32_##_Instr##_mask(                                     \
+	     __v._M_data, __vector_broadcast<_Np>(_Tp(__pm_one)), __v._M_data, \
+	     __k._M_data, _MM_FROUND_CUR_DIRECTION)
+		_GLIBCXX_SIMD_MASK_SUB(4, 64, subps512);
+		_GLIBCXX_SIMD_MASK_SUB(4, 32, subps256);
+		_GLIBCXX_SIMD_MASK_SUB(4, 16, subps128);
+		_GLIBCXX_SIMD_MASK_SUB(8, 64, subpd512);
+		_GLIBCXX_SIMD_MASK_SUB(8, 32, subpd256);
+		_GLIBCXX_SIMD_MASK_SUB(8, 16, subpd128);
+#undef _GLIBCXX_SIMD_MASK_SUB
+	      }
+	  }
+	else
+	  return _Base::template _S_masked_unary<_Op>(__k, __v);
+      }
   };
 
 // }}}
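
The generic shortcuts in simd_builtin.h rely on vector masks whose lanes
are either -1 (all bits set) or 0. The following is a scalar model of that
arithmetic, not the library code: std::array replaces the vector types,
the function name masked_increment is hypothetical, and memcpy stands in
for __builtin_bit_cast.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using ivec = std::array<std::int32_t, 4>;
using fvec = std::array<float, 4>;

// Integer lanes: subtracting a mask lane of -1 adds 1, subtracting 0 is a
// no-op, so no blend is needed.
inline ivec
masked_increment(const ivec& v, const ivec& k)
{
  ivec r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = v[i] - k[i];
  return r;
}

// Float lanes: AND the mask with the bit pattern of 1.0f. The result is the
// bit pattern of 1.0f where the mask is set and of 0.0f elsewhere.
inline fvec
masked_increment(const fvec& v, const ivec& k)
{
  static_assert(sizeof(float) == sizeof(std::int32_t), "32-bit float assumed");
  const float one = 1.0f;
  std::int32_t one_bits;
  std::memcpy(&one_bits, &one, sizeof one);
  fvec r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    {
      const std::int32_t addend_bits = k[i] & one_bits;
      float addend;
      std::memcpy(&addend, &addend_bits, sizeof addend);
      r[i] = v[i] + addend;
    }
  return r;
}
```

The decrement case is symmetric: add the mask for integers, subtract the
masked 1.0f for floats.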


* [PATCH 4/8] libstdc++: Add missing constexpr on simd shift implementation
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
                   ` (2 preceding siblings ...)
  2023-02-23  8:49 ` [PATCH 3/8] libstdc++: More efficient masked inc-/decrement implementation Matthias Kretz
@ 2023-02-23  8:49 ` Matthias Kretz
  2023-02-23 11:07   ` Jonathan Wakely
  2023-02-23  8:49 ` [PATCH 5/8] libstdc++: Always-inline most of non-cmath fixed_size implementation Matthias Kretz
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:49 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1102 bytes --]



Resolves -Wtautological-compare warnings about `if
(__builtin_is_constant_evaluated())` in the implementations of these
functions.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd_x86.h (_S_bit_shift_left)
	(_S_bit_shift_right): Declare constexpr. The implementation was
	already expecting constexpr evaluation.
---
 libstdc++-v3/include/experimental/bits/simd_x86.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)



[-- Attachment #2: 0004-libstdc-Add-missing-constexpr-on-simd-shift-implemen.patch --]
[-- Type: text/x-patch, Size: 1763 bytes --]

diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
index 897a67829d1..8872ca301b9 100644
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -1526,7 +1526,7 @@ _S_modulus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
     // values.
   #ifndef _GLIBCXX_SIMD_NO_SHIFT_OPT
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      inline _GLIBCXX_CONST static typename _TVT::type
+      constexpr inline _GLIBCXX_CONST static typename _TVT::type
       _S_bit_shift_left(_Tp __xx, int __y)
       {
 	using _V = typename _TVT::type;
@@ -1631,7 +1631,7 @@ unsigned(
       }
 
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      inline _GLIBCXX_CONST static typename _TVT::type
+      constexpr inline _GLIBCXX_CONST static typename _TVT::type
       _S_bit_shift_left(_Tp __xx, typename _TVT::type __y)
       {
 	using _V = typename _TVT::type;
@@ -1800,7 +1800,7 @@ _mm512_cvtepi16_epi8(
     // _S_bit_shift_right {{{
 #ifndef _GLIBCXX_SIMD_NO_SHIFT_OPT
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      inline _GLIBCXX_CONST static typename _TVT::type
+      constexpr inline _GLIBCXX_CONST static typename _TVT::type
       _S_bit_shift_right(_Tp __xx, int __y)
       {
 	using _V = typename _TVT::type;
@@ -1850,7 +1850,7 @@ _S_bit_shift_right(_Tp __xx, int __y)
       }
 
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      inline _GLIBCXX_CONST static typename _TVT::type
+      constexpr inline _GLIBCXX_CONST static typename _TVT::type
       _S_bit_shift_right(_Tp __xx, typename _TVT::type __y)
       {
 	using _V = typename _TVT::type;
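
Why the missing constexpr matters can be shown with a reduced example.
This is a sketch, assuming a compiler that provides
__builtin_is_constant_evaluated (GCC 9 or later, recent Clang);
shift_left and shift_left_runtime are hypothetical names, and the runtime
helper merely stands in for an intrinsics-based path.

```cpp
#include <cassert>

// Hypothetical runtime-only helper (not constexpr on purpose).
inline unsigned
shift_left_runtime(unsigned x, int y)
{ return x << y; }

// Declared constexpr, so the first branch is reachable during constant
// evaluation. In a function that is not constexpr the condition could
// never be true, which is what a tautological-compare warning points at.
constexpr unsigned
shift_left(unsigned x, int y)
{
  if (__builtin_is_constant_evaluated())
    return x << y;
  else
    return shift_left_runtime(x, y);
}

static_assert(shift_left(1u, 4) == 16u, "usable in constant expressions");
```

At run time the same function takes the second branch, so callers see one
interface for both contexts.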


* [PATCH 5/8] libstdc++: Always-inline most of non-cmath fixed_size implementation
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
                   ` (3 preceding siblings ...)
  2023-02-23  8:49 ` [PATCH 4/8] libstdc++: Add missing constexpr on simd shift implementation Matthias Kretz
@ 2023-02-23  8:49 ` Matthias Kretz
  2023-02-24 17:10   ` Jonathan Wakely
  2023-02-23  8:50 ` [PATCH 6/8] libstdc++: Fix formatting Matthias Kretz
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:49 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 2375 bytes --]



For simd, the inlining behavior should be similar to builtin types. (No
operator on builtin types is ever translated into a function call.)
Therefore, always_inline is the right choice (i.e. inline on -O0 as
well).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	PR libstdc++/108030
	* include/experimental/bits/simd_fixed_size.h
	(_SimdImplFixedSize::_S_broadcast): Replace inline with
	_GLIBCXX_SIMD_INTRINSIC.
	(_SimdImplFixedSize::_S_generate): Likewise.
	(_SimdImplFixedSize::_S_load): Likewise.
	(_SimdImplFixedSize::_S_masked_load): Likewise.
	(_SimdImplFixedSize::_S_store): Likewise.
	(_SimdImplFixedSize::_S_masked_store): Likewise.
	(_SimdImplFixedSize::_S_min): Likewise.
	(_SimdImplFixedSize::_S_max): Likewise.
	(_SimdImplFixedSize::_S_complement): Likewise.
	(_SimdImplFixedSize::_S_unary_minus): Likewise.
	(_SimdImplFixedSize::_S_plus): Likewise.
	(_SimdImplFixedSize::_S_minus): Likewise.
	(_SimdImplFixedSize::_S_multiplies): Likewise.
	(_SimdImplFixedSize::_S_divides): Likewise.
	(_SimdImplFixedSize::_S_modulus): Likewise.
	(_SimdImplFixedSize::_S_bit_and): Likewise.
	(_SimdImplFixedSize::_S_bit_or): Likewise.
	(_SimdImplFixedSize::_S_bit_xor): Likewise.
	(_SimdImplFixedSize::_S_bit_shift_left): Likewise.
	(_SimdImplFixedSize::_S_bit_shift_right): Likewise.
	(_SimdImplFixedSize::_S_remquo): Add inline keyword (to be
	explicit about not always-inline, yet).
	(_SimdImplFixedSize::_S_isinf): Likewise.
	(_SimdImplFixedSize::_S_isfinite): Likewise.
	(_SimdImplFixedSize::_S_isnan): Likewise.
	(_SimdImplFixedSize::_S_isnormal): Likewise.
	(_SimdImplFixedSize::_S_signbit): Likewise.
---
 .../experimental/bits/simd_fixed_size.h       | 60 +++++++++----------
 1 file changed, 30 insertions(+), 30 deletions(-)



[-- Attachment #2: 0005-libstdc-Always-inline-most-of-non-cmath-fixed_size-i.patch --]
[-- Type: text/x-patch, Size: 8252 bytes --]

diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
index 3ac6eaa3f6b..88a9b27e359 100644
--- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -1284,7 +1284,8 @@ struct _SimdImplFixedSize
 
     // broadcast {{{2
     template <typename _Tp>
-      static constexpr inline _SimdMember<_Tp> _S_broadcast(_Tp __x) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdMember<_Tp>
+      _S_broadcast(_Tp __x) noexcept
       {
 	return _SimdMember<_Tp>::_S_generate(
 		 [&](auto __meta) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1294,8 +1295,8 @@ struct _SimdImplFixedSize
 
     // _S_generator {{{2
     template <typename _Fp, typename _Tp>
-      static constexpr inline _SimdMember<_Tp> _S_generator(_Fp&& __gen,
-							    _TypeTag<_Tp>)
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdMember<_Tp>
+      _S_generator(_Fp&& __gen, _TypeTag<_Tp>)
       {
 	return _SimdMember<_Tp>::_S_generate(
 		 [&__gen](auto __meta) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1310,8 +1311,8 @@ struct _SimdImplFixedSize
 
     // _S_load {{{2
     template <typename _Tp, typename _Up>
-      static inline _SimdMember<_Tp> _S_load(const _Up* __mem,
-					     _TypeTag<_Tp>) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp>
+      _S_load(const _Up* __mem, _TypeTag<_Tp>) noexcept
       {
 	return _SimdMember<_Tp>::_S_generate(
 		 [&](auto __meta) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1321,7 +1322,7 @@ struct _SimdImplFixedSize
 
     // _S_masked_load {{{2
     template <typename _Tp, typename... _As, typename _Up>
-      static inline _SimdTuple<_Tp, _As...>
+      _GLIBCXX_SIMD_INTRINSIC static _SimdTuple<_Tp, _As...>
       _S_masked_load(const _SimdTuple<_Tp, _As...>& __old,
 		     const _MaskMember __bits, const _Up* __mem) noexcept
       {
@@ -1344,8 +1345,8 @@ _S_masked_load(const _SimdTuple<_Tp, _As...>& __old,
 
     // _S_store {{{2
     template <typename _Tp, typename _Up>
-      static inline void _S_store(const _SimdMember<_Tp>& __v, _Up* __mem,
-				  _TypeTag<_Tp>) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static void
+      _S_store(const _SimdMember<_Tp>& __v, _Up* __mem, _TypeTag<_Tp>) noexcept
       {
 	__for_each(__v, [&](auto __meta, auto __native) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 	  __meta._S_store(__native, &__mem[__meta._S_offset], _TypeTag<_Tp>());
@@ -1354,9 +1355,9 @@ _S_masked_load(const _SimdTuple<_Tp, _As...>& __old,
 
     // _S_masked_store {{{2
     template <typename _Tp, typename... _As, typename _Up>
-      static inline void _S_masked_store(const _SimdTuple<_Tp, _As...>& __v,
-					 _Up* __mem,
-					 const _MaskMember __bits) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static void
+      _S_masked_store(const _SimdTuple<_Tp, _As...>& __v, _Up* __mem,
+		      const _MaskMember __bits) noexcept
       {
 	__for_each(__v, [&](auto __meta, auto __native) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 	  if (__meta._S_submask(__bits).any())
@@ -1464,7 +1465,7 @@ __for_each(
 
     // _S_min, _S_max {{{2
     template <typename _Tp, typename... _As>
-      static inline constexpr _SimdTuple<_Tp, _As...>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
       _S_min(const _SimdTuple<_Tp, _As...>& __a,
 	     const _SimdTuple<_Tp, _As...>& __b)
       {
@@ -1476,7 +1477,7 @@ _S_min(const _SimdTuple<_Tp, _As...>& __a,
       }
 
     template <typename _Tp, typename... _As>
-      static inline constexpr _SimdTuple<_Tp, _As...>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
       _S_max(const _SimdTuple<_Tp, _As...>& __a,
 	     const _SimdTuple<_Tp, _As...>& __b)
       {
@@ -1489,7 +1490,7 @@ _S_max(const _SimdTuple<_Tp, _As...>& __a,
 
     // _S_complement {{{2
     template <typename _Tp, typename... _As>
-      static inline constexpr _SimdTuple<_Tp, _As...>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
       _S_complement(const _SimdTuple<_Tp, _As...>& __x) noexcept
       {
 	return __x._M_apply_per_chunk(
@@ -1500,7 +1501,7 @@ _S_complement(const _SimdTuple<_Tp, _As...>& __x) noexcept
 
     // _S_unary_minus {{{2
     template <typename _Tp, typename... _As>
-      static inline constexpr _SimdTuple<_Tp, _As...>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
       _S_unary_minus(const _SimdTuple<_Tp, _As...>& __x) noexcept
       {
 	return __x._M_apply_per_chunk(
@@ -1513,7 +1514,7 @@ _S_unary_minus(const _SimdTuple<_Tp, _As...>& __x) noexcept
 
 #define _GLIBCXX_SIMD_FIXED_OP(name_, op_)                                                     \
     template <typename _Tp, typename... _As>                                                   \
-      static inline constexpr _SimdTuple<_Tp, _As...> name_(                                   \
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...> name_(                  \
 	const _SimdTuple<_Tp, _As...>& __x, const _SimdTuple<_Tp, _As...>& __y)                \
       {                                                                                        \
 	return __x._M_apply_per_chunk(                                                         \
@@ -1536,7 +1537,7 @@ _S_unary_minus(const _SimdTuple<_Tp, _As...>& __x) noexcept
 #undef _GLIBCXX_SIMD_FIXED_OP
 
     template <typename _Tp, typename... _As>
-      static inline constexpr _SimdTuple<_Tp, _As...>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
       _S_bit_shift_left(const _SimdTuple<_Tp, _As...>& __x, int __y)
       {
 	return __x._M_apply_per_chunk(
@@ -1546,7 +1547,7 @@ _S_bit_shift_left(const _SimdTuple<_Tp, _As...>& __x, int __y)
       }
 
     template <typename _Tp, typename... _As>
-      static inline constexpr _SimdTuple<_Tp, _As...>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
       _S_bit_shift_right(const _SimdTuple<_Tp, _As...>& __x, int __y)
       {
 	return __x._M_apply_per_chunk(
@@ -1665,10 +1666,9 @@ _S_bit_shift_right(const _SimdTuple<_Tp, _As...>& __x, int __y)
 #undef _GLIBCXX_SIMD_APPLY_ON_TUPLE
 
     template <typename _Tp, typename... _Abis>
-      static _SimdTuple<_Tp, _Abis...> _S_remquo(
-	const _SimdTuple<_Tp, _Abis...>& __x,
-	const _SimdTuple<_Tp, _Abis...>& __y,
-	__fixed_size_storage_t<int, _SimdTuple<_Tp, _Abis...>::_S_size()>* __z)
+      static inline _SimdTuple<_Tp, _Abis...>
+      _S_remquo(const _SimdTuple<_Tp, _Abis...>& __x, const _SimdTuple<_Tp, _Abis...>& __y,
+		__fixed_size_storage_t<int, _SimdTuple<_Tp, _Abis...>::_S_size()>* __z)
       {
 	return __x._M_apply_per_chunk(
 		 [](auto __impl, const auto __xx, const auto __yy, auto& __zz)
@@ -1689,14 +1689,14 @@ _S_frexp(const _SimdTuple<_Tp, _As...>& __x,
 		 }, __exp);
       }
 
-#define _GLIBCXX_SIMD_TEST_ON_TUPLE_(name_)                                    \
-    template <typename _Tp, typename... _As>                                   \
-      static inline _MaskMember                                                \
-	_S_##name_(const _SimdTuple<_Tp, _As...>& __x) noexcept                \
-      {                                                                        \
-	return _M_test([](auto __impl,                                         \
-			  auto __xx) { return __impl._S_##name_(__xx); },      \
-		       __x);                                                   \
+#define _GLIBCXX_SIMD_TEST_ON_TUPLE_(name_)                                              \
+    template <typename _Tp, typename... _As>                                             \
+      static inline _MaskMember                                                          \
+	_S_##name_(const _SimdTuple<_Tp, _As...>& __x) noexcept                          \
+      {                                                                                  \
+	return _M_test([] (auto __impl, auto __xx) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA  { \
+		 return __impl._S_##name_(__xx);                                         \
+	       }, __x);                                                                  \
       }
 
     _GLIBCXX_SIMD_TEST_ON_TUPLE_(isinf)
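
The difference between plain inline and the always_inline attribute can
be illustrated on a small user-defined type. This sketch assumes GCC or
Clang; MY_INTRINSIC and Vec2 are hypothetical names, with the macro
modeled on _GLIBCXX_SIMD_INTRINSIC.

```cpp
#include <cassert>

// Hypothetical macro modeled on _GLIBCXX_SIMD_INTRINSIC.
#define MY_INTRINSIC [[gnu::always_inline, gnu::artificial]] inline

struct Vec2
{
  int x, y;
};

// Plain `inline` is only a hint, and compilers typically ignore it at -O0.
// The attribute requests inlining at every optimization level, so the
// operator behaves like an operator on a builtin type.
MY_INTRINSIC constexpr Vec2
operator+(Vec2 a, Vec2 b)
{ return {a.x + b.x, a.y + b.y}; }

static_assert((Vec2{1, 2} + Vec2{3, 4}).x == 4, "still usable at compile time");
```

Comparing the output of `g++ -O0 -S` with and without the attribute is
one way to check whether a call to operator+ remains.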


* [PATCH 6/8] libstdc++: Fix formatting
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
                   ` (4 preceding siblings ...)
  2023-02-23  8:49 ` [PATCH 5/8] libstdc++: Always-inline most of non-cmath fixed_size implementation Matthias Kretz
@ 2023-02-23  8:50 ` Matthias Kretz
  2023-02-24 17:14   ` Jonathan Wakely
  2023-02-23  8:50 ` [PATCH 7/8] libstdc++: Fix -Wsign-compare issue Matthias Kretz
  2023-02-23  8:50 ` [PATCH 8/8] libstdc++: Test that integral simd reductions are precise Matthias Kretz
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:50 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1626 bytes --]



Whitespace changes only.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd.h: Line breaks and indenting
	fixed to follow the libstdc++ standard.
	* include/experimental/bits/simd_builtin.h: Likewise.
	* include/experimental/bits/simd_fixed_size.h: Likewise.
	* include/experimental/bits/simd_neon.h: Likewise.
	* include/experimental/bits/simd_ppc.h: Likewise.
	* include/experimental/bits/simd_scalar.h: Likewise.
	* include/experimental/bits/simd_x86.h: Likewise.
---
 libstdc++-v3/include/experimental/bits/simd.h | 473 ++++++------
 .../include/experimental/bits/simd_builtin.h  | 692 +++++++++---------
 .../experimental/bits/simd_fixed_size.h       | 228 +++---
 .../include/experimental/bits/simd_neon.h     |  24 +-
 .../include/experimental/bits/simd_ppc.h      |   3 +-
 .../include/experimental/bits/simd_scalar.h   | 362 +++++----
 .../include/experimental/bits/simd_x86.h      |  90 ++-
 7 files changed, 942 insertions(+), 930 deletions(-)



[-- Attachment #2: 0006-libstdc-Fix-formatting.patch --]
[-- Type: text/x-patch, Size: 143058 bytes --]

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 7482d109291..fb661c9657f 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -180,10 +180,7 @@ struct vector_aligned_tag
   template <typename _Tp, typename _Up>
     _GLIBCXX_SIMD_INTRINSIC static constexpr _Up*
     _S_apply(_Up* __ptr)
-    {
-      return static_cast<_Up*>(
-	__builtin_assume_aligned(__ptr, _S_alignment<_Tp, _Up>));
-    }
+    { return static_cast<_Up*>( __builtin_assume_aligned(__ptr, _S_alignment<_Tp, _Up>)); }
 };
 
 template <size_t _Np> struct overaligned_tag
@@ -288,13 +285,15 @@ namespace __detail
   // expression. math_errhandling may expand to an extern symbol, in which case a constexpr value
   // must be guessed.
   template <int = math_errhandling>
-    constexpr bool __handle_fpexcept_impl(int)
+    constexpr bool
+    __handle_fpexcept_impl(int)
     { return math_errhandling & MATH_ERREXCEPT; }
 #endif
 
   // Fallback if math_errhandling doesn't work: with fast-math assume floating-point exceptions are
   // ignored, otherwise implement correct exception behavior.
-  constexpr bool __handle_fpexcept_impl(float)
+  constexpr bool
+  __handle_fpexcept_impl(float)
   {
 #if defined __FAST_MATH__
     return false;
@@ -749,8 +748,7 @@ struct __make_dependent
 // __invoke_ub{{{
 template <typename... _Args>
   [[noreturn]] _GLIBCXX_SIMD_ALWAYS_INLINE void
-  __invoke_ub([[maybe_unused]] const char* __msg,
-	      [[maybe_unused]] const _Args&... __args)
+  __invoke_ub([[maybe_unused]] const char* __msg, [[maybe_unused]] const _Args&... __args)
   {
 #ifdef _GLIBCXX_DEBUG_UB
     __builtin_fprintf(stderr, __msg, __args...);
@@ -795,11 +793,14 @@ class _ExactBool
   const bool _M_data;
 
 public:
-  _GLIBCXX_SIMD_INTRINSIC constexpr _ExactBool(bool __b) : _M_data(__b) {}
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _ExactBool(bool __b) : _M_data(__b) {}
 
   _ExactBool(int) = delete;
 
-  _GLIBCXX_SIMD_INTRINSIC constexpr operator bool() const { return _M_data; }
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  operator bool() const
+  { return _M_data; }
 };
 
 // }}}
@@ -1488,8 +1489,7 @@ struct __vector_type_n<_Tp, 1, enable_if_t<__is_vectorizable_v<_Tp>>>
 
 // else, use GNU-style builtin vector types
 template <typename _Tp, size_t _Np>
-  struct __vector_type_n<_Tp, _Np,
-			 enable_if_t<__is_vectorizable_v<_Tp> && _Np >= 2>>
+  struct __vector_type_n<_Tp, _Np, enable_if_t<__is_vectorizable_v<_Tp> && _Np >= 2>>
   {
     static constexpr size_t _S_Np2 = std::__bit_ceil(_Np * sizeof(_Tp));
 
@@ -1770,8 +1770,7 @@ __bit_cast(const _From __x)
 // }}}
 // __to_intrin {{{
 template <typename _Tp, typename _TVT = _VectorTraits<_Tp>,
-	  typename _R
-	  = __intrinsic_type_t<typename _TVT::value_type, _TVT::_S_full_size>>
+	  typename _R = __intrinsic_type_t<typename _TVT::value_type, _TVT::_S_full_size>>
   _GLIBCXX_SIMD_INTRINSIC constexpr _R
   __to_intrin(_Tp __x)
   {
@@ -1792,9 +1791,7 @@ __to_intrin(_Tp __x)
 template <typename _Tp, typename... _Args>
   _GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, sizeof...(_Args)>
   __make_vector(const _Args&... __args)
-  {
-    return __vector_type_t<_Tp, sizeof...(_Args)>{static_cast<_Tp>(__args)...};
-  }
+  { return __vector_type_t<_Tp, sizeof...(_Args)>{static_cast<_Tp>(__args)...}; }
 
 // }}}
 // __vector_broadcast{{{
@@ -1813,10 +1810,7 @@ __vector_broadcast(_Tp __x)
   template <typename _Tp, size_t _Np, typename _Gp, size_t... _I>
   _GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
   __generate_vector_impl(_Gp&& __gen, index_sequence<_I...>)
-  {
-    return __vector_type_t<_Tp, _Np>{
-      static_cast<_Tp>(__gen(_SizeConstant<_I>()))...};
-  }
+  { return __vector_type_t<_Tp, _Np>{ static_cast<_Tp>(__gen(_SizeConstant<_I>()))...}; }
 
 template <typename _V, typename _VVT = _VectorTraits<_V>, typename _Gp>
   _GLIBCXX_SIMD_INTRINSIC constexpr _V
@@ -2029,8 +2023,7 @@ __not(_Tp __a) noexcept
 // }}}
 // __concat{{{
 template <typename _Tp, typename _TVT = _VectorTraits<_Tp>,
-	  typename _R = __vector_type_t<typename _TVT::value_type,
-					_TVT::_S_full_size * 2>>
+	  typename _R = __vector_type_t<typename _TVT::value_type, _TVT::_S_full_size * 2>>
   constexpr _R
   __concat(_Tp a_, _Tp b_)
   {
@@ -2174,8 +2167,7 @@ __zero_extend(_Tp __x)
 	  int _SplitBy,
 	  typename _Tp,
 	  typename _TVT = _VectorTraits<_Tp>,
-	  typename _R = __vector_type_t<typename _TVT::value_type,
-			  _TVT::_S_full_size / _SplitBy>>
+	  typename _R = __vector_type_t<typename _TVT::value_type, _TVT::_S_full_size / _SplitBy>>
   _GLIBCXX_SIMD_INTRINSIC constexpr _R
   __extract(_Tp __in)
   {
@@ -2221,8 +2213,7 @@ __extract(_Tp __in)
 // }}}
 // __lo/__hi64[z]{{{
 template <typename _Tp,
-	  typename _R
-	  = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+	  typename _R = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
   _GLIBCXX_SIMD_INTRINSIC constexpr _R
   __lo64(_Tp __x)
   {
@@ -2232,8 +2223,7 @@ __lo64(_Tp __x)
   }
 
 template <typename _Tp,
-	  typename _R
-	  = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+	  typename _R = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
   _GLIBCXX_SIMD_INTRINSIC constexpr _R
   __hi64(_Tp __x)
   {
@@ -2244,8 +2234,7 @@ __hi64(_Tp __x)
   }
 
 template <typename _Tp,
-	  typename _R
-	  = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+	  typename _R = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
   _GLIBCXX_SIMD_INTRINSIC constexpr _R
   __hi64z([[maybe_unused]] _Tp __x)
   {
@@ -2356,18 +2345,15 @@ struct __bool_storage_member_type<64>
 // the following excludes bool via __is_vectorizable
 #if _GLIBCXX_SIMD_HAVE_SSE
 template <typename _Tp, size_t _Bytes>
-  struct __intrinsic_type<_Tp, _Bytes,
-			  enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 64>>
+  struct __intrinsic_type<_Tp, _Bytes, enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 64>>
   {
     static_assert(!is_same_v<_Tp, long double>,
 		  "no __intrinsic_type support for long double on x86");
 
-    static constexpr size_t _S_VBytes = _Bytes <= 16   ? 16
-					: _Bytes <= 32 ? 32
-						       : 64;
+    static constexpr size_t _S_VBytes = _Bytes <= 16 ? 16 : _Bytes <= 32 ? 32 : 64;
 
     using type [[__gnu__::__vector_size__(_S_VBytes)]]
-    = conditional_t<is_integral_v<_Tp>, long long int, _Tp>;
+      = conditional_t<is_integral_v<_Tp>, long long int, _Tp>;
   };
 #endif // _GLIBCXX_SIMD_HAVE_SSE
 
@@ -2413,16 +2399,19 @@ struct __intrinsic_type<make_unsigned_t<__int_with_sizeof_t<_Bits / 8>>
 #undef _GLIBCXX_SIMD_ARM_INTRIN
 
 template <typename _Tp, size_t _Bytes>
-  struct __intrinsic_type<_Tp, _Bytes,
-			  enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
+  struct __intrinsic_type<_Tp, _Bytes, enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
   {
     static constexpr int _SVecBytes = _Bytes <= 8 ? 8 : 16;
+
     using _Ip = __int_for_sizeof_t<_Tp>;
+
     using _Up = conditional_t<
       is_floating_point_v<_Tp>, _Tp,
       conditional_t<is_unsigned_v<_Tp>, make_unsigned_t<_Ip>, _Ip>>;
+
     static_assert(!is_same_v<_Tp, _Up> || _SVecBytes != _Bytes,
 		  "should use explicit specialization above");
+
     using type = typename __intrinsic_type<_Up, _SVecBytes>::type;
   };
 #endif // _GLIBCXX_SIMD_HAVE_NEON
@@ -2457,18 +2446,20 @@ struct __intrinsic_type_impl<_Tp>
 #undef _GLIBCXX_SIMD_PPC_INTRIN
 
 template <typename _Tp, size_t _Bytes>
-  struct __intrinsic_type<_Tp, _Bytes,
-			  enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
+  struct __intrinsic_type<_Tp, _Bytes, enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
   {
     static constexpr bool _S_is_ldouble = is_same_v<_Tp, long double>;
+
     // allow _Tp == long double with -mlong-double-64
     static_assert(!(_S_is_ldouble && sizeof(long double) > sizeof(double)),
 		  "no __intrinsic_type support for 128-bit floating point on PowerPC");
+
 #ifndef __VSX__
     static_assert(!(is_same_v<_Tp, double>
 		    || (_S_is_ldouble && sizeof(long double) == sizeof(double))),
 		  "no __intrinsic_type support for 64-bit floating point on PowerPC w/o VSX");
 #endif
+
     using type =
       typename __intrinsic_type_impl<
 		 conditional_t<is_floating_point_v<_Tp>,
@@ -2489,22 +2480,29 @@ struct _SimdWrapper
     static constexpr size_t _S_full_size = sizeof(_BuiltinType) * __CHAR_BIT__;
 
     _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<bool, _S_full_size>
-    __as_full_vector() const { return _M_data; }
+    __as_full_vector() const
+    { return _M_data; }
+
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapper() = default;
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper() = default;
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_BuiltinType __k)
-      : _M_data(__k) {};
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapper(_BuiltinType __k) : _M_data(__k) {};
 
-    _GLIBCXX_SIMD_INTRINSIC operator const _BuiltinType&() const
+    _GLIBCXX_SIMD_INTRINSIC
+    operator const _BuiltinType&() const
     { return _M_data; }
 
-    _GLIBCXX_SIMD_INTRINSIC operator _BuiltinType&()
+    _GLIBCXX_SIMD_INTRINSIC
+    operator _BuiltinType&()
     { return _M_data; }
 
-    _GLIBCXX_SIMD_INTRINSIC _BuiltinType __intrin() const
+    _GLIBCXX_SIMD_INTRINSIC _BuiltinType
+    __intrin() const
     { return _M_data; }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator[](size_t __i) const
+    _GLIBCXX_SIMD_INTRINSIC constexpr value_type
+    operator[](size_t __i) const
     { return _M_data & (_BuiltinType(1) << __i); }
 
     template <size_t __i>
@@ -2512,7 +2510,8 @@ struct _SimdWrapper
       operator[](_SizeConstant<__i>) const
       { return _M_data & (_BuiltinType(1) << __i); }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr void _M_set(size_t __i, value_type __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr void
+    _M_set(size_t __i, value_type __x)
     {
       if (__x)
 	_M_data |= (_BuiltinType(1) << __i);
@@ -2520,11 +2519,12 @@ struct _SimdWrapper
 	_M_data &= ~(_BuiltinType(1) << __i);
     }
 
-    _GLIBCXX_SIMD_INTRINSIC
-    constexpr bool _M_is_constprop() const
+    _GLIBCXX_SIMD_INTRINSIC constexpr bool
+    _M_is_constprop() const
     { return __builtin_constant_p(_M_data); }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr bool _M_is_constprop_none_of() const
+    _GLIBCXX_SIMD_INTRINSIC constexpr bool
+    _M_is_constprop_none_of() const
     {
       if (__builtin_constant_p(_M_data))
 	{
@@ -2536,7 +2536,8 @@ struct _SimdWrapper
       return false;
     }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr bool _M_is_constprop_all_of() const
+    _GLIBCXX_SIMD_INTRINSIC constexpr bool
+    _M_is_constprop_all_of() const
     {
       if (__builtin_constant_p(_M_data))
 	{
@@ -2558,10 +2559,11 @@ struct _SimdWrapperBase
 template <typename _BuiltinType>
   struct _SimdWrapperBase<false, _BuiltinType> // no padding or no SNaNs
   {
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapperBase() = default;
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapperBase(_BuiltinType __init)
-      : _M_data(__init)
-    {}
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapperBase() = default;
+
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapperBase(_BuiltinType __init) : _M_data(__init) {}
 
     _BuiltinType _M_data;
   };
@@ -2570,10 +2572,11 @@ struct _SimdWrapperBase<false, _BuiltinType>
   struct _SimdWrapperBase<true, _BuiltinType> // with padding that needs to
 					      // never become SNaN
   {
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapperBase() : _M_data() {}
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapperBase(_BuiltinType __init)
-      : _M_data(__init)
-    {}
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapperBase() : _M_data() {}
+
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapperBase(_BuiltinType __init) : _M_data(__init) {}
 
     _BuiltinType _M_data;
   };
@@ -2612,24 +2615,33 @@ struct _SimdWrapper
     __as_full_vector() const
     { return _M_data; }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(initializer_list<_Tp> __init)
-      : _Base(__generate_from_n_evaluations<_Width, _BuiltinType>(
-	[&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { return __init.begin()[__i.value]; })) {}
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapper(initializer_list<_Tp> __init)
+    : _Base(__generate_from_n_evaluations<_Width, _BuiltinType>(
+	      [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
+		return __init.begin()[__i.value];
+	      })) {}
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper() = default;
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(const _SimdWrapper&)
-      = default;
-    _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_SimdWrapper&&) = default;
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapper() = default;
+
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapper(const _SimdWrapper&) = default;
+
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    _SimdWrapper(_SimdWrapper&&) = default;
 
     _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper&
     operator=(const _SimdWrapper&) = default;
+
     _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper&
     operator=(_SimdWrapper&&) = default;
 
     template <typename _V, typename = enable_if_t<disjunction_v<
 			     is_same<_V, __vector_type_t<_Tp, _Width>>,
 			     is_same<_V, __intrinsic_type_t<_Tp, _Width>>>>>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_V __x)
+      _GLIBCXX_SIMD_INTRINSIC constexpr
+      _SimdWrapper(_V __x)
       // __vector_bitcast can convert e.g. __m128 to __vector(2) float
       : _Base(__vector_bitcast<_Tp, _Width>(__x)) {}
 
@@ -2644,27 +2656,34 @@ __as_full_vector() const
 		 { return _M_data[int(__i)]; });
       }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr operator const _BuiltinType&() const
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    operator const _BuiltinType&() const
     { return _M_data; }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr operator _BuiltinType&()
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    operator _BuiltinType&()
     { return _M_data; }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr _Tp operator[](size_t __i) const
+    _GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+    operator[](size_t __i) const
     { return _M_data[__i]; }
 
     template <size_t __i>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _Tp operator[](_SizeConstant<__i>) const
+      _GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+      operator[](_SizeConstant<__i>) const
       { return _M_data[__i]; }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr void _M_set(size_t __i, _Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr void
+    _M_set(size_t __i, _Tp __x)
     { _M_data[__i] = __x; }
 
     _GLIBCXX_SIMD_INTRINSIC
-    constexpr bool _M_is_constprop() const
+    constexpr bool
+    _M_is_constprop() const
     { return __builtin_constant_p(_M_data); }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr bool _M_is_constprop_none_of() const
+    _GLIBCXX_SIMD_INTRINSIC constexpr bool
+    _M_is_constprop_none_of() const
     {
       if (__builtin_constant_p(_M_data))
 	{
@@ -2685,7 +2704,8 @@ __as_full_vector() const
       return false;
     }
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr bool _M_is_constprop_all_of() const
+    _GLIBCXX_SIMD_INTRINSIC constexpr bool
+    _M_is_constprop_all_of() const
     {
       if (__builtin_constant_p(_M_data))
 	{
@@ -2883,22 +2903,14 @@ struct deduce
   struct rebind_simd;
 
 template <typename _Tp, typename _Up, typename _Abi>
-  struct rebind_simd<
-    _Tp, simd<_Up, _Abi>,
-    void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
-  {
-    using type
-      = simd<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>;
-  };
+  struct rebind_simd<_Tp, simd<_Up, _Abi>,
+		     void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
+  { using type = simd<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>; };
 
 template <typename _Tp, typename _Up, typename _Abi>
-  struct rebind_simd<
-    _Tp, simd_mask<_Up, _Abi>,
-    void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
-  {
-    using type
-      = simd_mask<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>;
-  };
+  struct rebind_simd<_Tp, simd_mask<_Up, _Abi>,
+		     void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
+  { using type = simd_mask<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>; };
 
 template <typename _Tp, typename _V>
   using rebind_simd_t = typename rebind_simd<_Tp, _V>::type;
@@ -2908,13 +2920,11 @@ struct rebind_simd
   struct resize_simd;
 
 template <int _Np, typename _Tp, typename _Abi>
-  struct resize_simd<_Np, simd<_Tp, _Abi>,
-		     void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
+  struct resize_simd<_Np, simd<_Tp, _Abi>, void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
   { using type = simd<_Tp, simd_abi::deduce_t<_Tp, _Np, _Abi>>; };
 
 template <int _Np, typename _Tp, typename _Abi>
-  struct resize_simd<_Np, simd_mask<_Tp, _Abi>,
-		     void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
+  struct resize_simd<_Np, simd_mask<_Tp, _Abi>, void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
   { using type = simd_mask<_Tp, simd_abi::deduce_t<_Tp, _Np, _Abi>>; };
 
 template <int _Np, typename _V>
@@ -2963,13 +2973,11 @@ struct is_simd_mask<simd_mask<_Tp, _Abi>>
 
 // casts [simd.casts] {{{1
 // static_simd_cast {{{2
-template <typename _Tp, typename _Up, typename _Ap, bool = is_simd_v<_Tp>,
-	  typename = void>
+template <typename _Tp, typename _Up, typename _Ap, bool = is_simd_v<_Tp>, typename = void>
   struct __static_simd_cast_return_type;
 
 template <typename _Tp, typename _A0, typename _Up, typename _Ap>
-  struct __static_simd_cast_return_type<simd_mask<_Tp, _A0>, _Up, _Ap, false,
-					void>
+  struct __static_simd_cast_return_type<simd_mask<_Tp, _A0>, _Up, _Ap, false, void>
   : __static_simd_cast_return_type<simd<_Tp, _A0>, _Up, _Ap> {};
 
 template <typename _Tp, typename _Up, typename _Ap>
@@ -3284,6 +3292,7 @@ __get_lvalue(const const_where_expression& __x)
 
   public:
     const_where_expression(const const_where_expression&) = delete;
+
     const_where_expression& operator=(const const_where_expression&) = delete;
 
     _GLIBCXX_SIMD_INTRINSIC const_where_expression(const _M& __kk, const _Tp& dd)
@@ -3328,8 +3337,8 @@ class const_where_expression<bool, _Tp>
     struct _Wrapper { using value_type = _V; };
 
   protected:
-    using value_type =
-      typename conditional_t<is_arithmetic_v<_V>, _Wrapper, _V>::value_type;
+    using value_type
+      = typename conditional_t<is_arithmetic_v<_V>, _Wrapper, _V>::value_type;
 
     _GLIBCXX_SIMD_INTRINSIC friend const _M&
     __get_mask(const const_where_expression& __x)
@@ -3426,32 +3435,32 @@ static_assert(
     _GLIBCXX_SIMD_OP_(>>, _S_shift_right);
 #undef _GLIBCXX_SIMD_OP_
 
-    _GLIBCXX_SIMD_INTRINSIC void operator++() &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator++() &&
     {
       __data(_M_value)
-	= _Impl::template _S_masked_unary<__increment>(__data(_M_k),
-						       __data(_M_value));
+	= _Impl::template _S_masked_unary<__increment>(__data(_M_k), __data(_M_value));
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void operator++(int) &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator++(int) &&
     {
       __data(_M_value)
-	= _Impl::template _S_masked_unary<__increment>(__data(_M_k),
-						       __data(_M_value));
+	= _Impl::template _S_masked_unary<__increment>(__data(_M_k), __data(_M_value));
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void operator--() &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator--() &&
     {
       __data(_M_value)
-	= _Impl::template _S_masked_unary<__decrement>(__data(_M_k),
-						       __data(_M_value));
+	= _Impl::template _S_masked_unary<__decrement>(__data(_M_k), __data(_M_value));
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void operator--(int) &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator--(int) &&
     {
       __data(_M_value)
-	= _Impl::template _S_masked_unary<__decrement>(__data(_M_k),
-						       __data(_M_value));
+	= _Impl::template _S_masked_unary<__decrement>(__data(_M_k), __data(_M_value));
     }
 
     // intentionally hides const_where_expression::copy_from
@@ -3459,15 +3468,15 @@ static_assert(
       _GLIBCXX_SIMD_INTRINSIC void
       copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags) &&
       {
-	__data(_M_value)
-	  = _Impl::_S_masked_load(__data(_M_value), __data(_M_k),
-				  _Flags::template _S_apply<_Tp>(__mem));
+	__data(_M_value) = _Impl::_S_masked_load(__data(_M_value), __data(_M_k),
+						 _Flags::template _S_apply<_Tp>(__mem));
       }
   };
 
 // where_expression<bool, T> {{{2
 template <typename _Tp>
-  class where_expression<bool, _Tp> : public const_where_expression<bool, _Tp>
+  class where_expression<bool, _Tp>
+  : public const_where_expression<bool, _Tp>
   {
     using _M = bool;
     using typename const_where_expression<_M, _Tp>::value_type;
@@ -3478,12 +3487,14 @@ class where_expression<bool, _Tp> : public const_where_expression<bool, _Tp>
     where_expression(const where_expression&) = delete;
     where_expression& operator=(const where_expression&) = delete;
 
-    _GLIBCXX_SIMD_INTRINSIC where_expression(const _M& __kk, _Tp& dd)
-      : const_where_expression<_M, _Tp>(__kk, dd) {}
+    _GLIBCXX_SIMD_INTRINSIC
+    where_expression(const _M& __kk, _Tp& dd)
+    : const_where_expression<_M, _Tp>(__kk, dd) {}
 
 #define _GLIBCXX_SIMD_OP_(__op)                                                \
     template <typename _Up>                                                    \
-      _GLIBCXX_SIMD_INTRINSIC void operator __op(_Up&& __x)&&                  \
+      _GLIBCXX_SIMD_INTRINSIC void                                             \
+      operator __op(_Up&& __x)&&                                               \
       { if (_M_k) _M_value __op static_cast<_Up&&>(__x); }
 
     _GLIBCXX_SIMD_OP_(=)
@@ -3499,16 +3510,20 @@ class where_expression<bool, _Tp> : public const_where_expression<bool, _Tp>
     _GLIBCXX_SIMD_OP_(>>=)
   #undef _GLIBCXX_SIMD_OP_
 
-    _GLIBCXX_SIMD_INTRINSIC void operator++() &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator++() &&
     { if (_M_k) ++_M_value; }
 
-    _GLIBCXX_SIMD_INTRINSIC void operator++(int) &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator++(int) &&
     { if (_M_k) ++_M_value; }
 
-    _GLIBCXX_SIMD_INTRINSIC void operator--() &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator--() &&
     { if (_M_k) --_M_value; }
 
-    _GLIBCXX_SIMD_INTRINSIC void operator--(int) &&
+    _GLIBCXX_SIMD_INTRINSIC void
+    operator--(int) &&
     { if (_M_k) --_M_value; }
 
     // intentionally hides const_where_expression::copy_from
@@ -3526,23 +3541,20 @@ class where_expression<bool, _Tp> : public const_where_expression<bool, _Tp>
 
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC
-    const_where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
-    where(const typename simd<_Tp, _Ap>::mask_type& __k,
-	  const simd<_Tp, _Ap>& __value)
+  const_where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
+  where(const typename simd<_Tp, _Ap>::mask_type& __k, const simd<_Tp, _Ap>& __value)
   { return {__k, __value}; }
 
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC
-    where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
-    where(const remove_const_t<simd_mask<_Tp, _Ap>>& __k,
-	  simd_mask<_Tp, _Ap>& __value)
+  where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
+  where(const remove_const_t<simd_mask<_Tp, _Ap>>& __k, simd_mask<_Tp, _Ap>& __value)
   { return {__k, __value}; }
 
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC
-    const_where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
-    where(const remove_const_t<simd_mask<_Tp, _Ap>>& __k,
-	  const simd_mask<_Tp, _Ap>& __value)
+  const_where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
+  where(const remove_const_t<simd_mask<_Tp, _Ap>>& __k, const simd_mask<_Tp, _Ap>& __value)
   { return {__k, __value}; }
 
 template <typename _Tp>
@@ -3555,11 +3567,11 @@ where(_ExactBool __k, _Tp& __value)
   where(_ExactBool __k, const _Tp& __value)
   { return {__k, __value}; }
 
-  template <typename _Tp, typename _Ap>
-    void where(bool __k, simd<_Tp, _Ap>& __value) = delete;
+template <typename _Tp, typename _Ap>
+  void where(bool __k, simd<_Tp, _Ap>& __value) = delete;
 
-  template <typename _Tp, typename _Ap>
-    void where(bool __k, const simd<_Tp, _Ap>& __value) = delete;
+template <typename _Tp, typename _Ap>
+  void where(bool __k, const simd<_Tp, _Ap>& __value) = delete;
 
 // proposed mask iterations {{{1
 namespace __proposed {
@@ -3576,10 +3588,12 @@ class iterator
       size_t __mask;
       size_t __bit;
 
-      _GLIBCXX_SIMD_INTRINSIC void __next_bit()
+      _GLIBCXX_SIMD_INTRINSIC void
+      __next_bit()
       { __bit = __builtin_ctzl(__mask); }
 
-      _GLIBCXX_SIMD_INTRINSIC void __reset_lsb()
+      _GLIBCXX_SIMD_INTRINSIC void
+      __reset_lsb()
       {
 	// 01100100 - 1 = 01100011
 	__mask &= (__mask - 1);
@@ -3591,20 +3605,24 @@ class iterator
       iterator(const iterator&) = default;
       iterator(iterator&&) = default;
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE size_t operator->() const
+      _GLIBCXX_SIMD_ALWAYS_INLINE size_t
+      operator->() const
       { return __bit; }
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE size_t operator*() const
+      _GLIBCXX_SIMD_ALWAYS_INLINE size_t
+      operator*() const
       { return __bit; }
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE iterator& operator++()
+      _GLIBCXX_SIMD_ALWAYS_INLINE iterator&
+      operator++()
       {
 	__reset_lsb();
 	__next_bit();
 	return *this;
       }
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE iterator operator++(int)
+      _GLIBCXX_SIMD_ALWAYS_INLINE iterator
+      operator++(int)
       {
 	iterator __tmp = *this;
 	__reset_lsb();
@@ -3612,17 +3630,21 @@ class iterator
 	return __tmp;
       }
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE bool operator==(const iterator& __rhs) const
+      _GLIBCXX_SIMD_ALWAYS_INLINE bool
+      operator==(const iterator& __rhs) const
       { return __mask == __rhs.__mask; }
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE bool operator!=(const iterator& __rhs) const
+      _GLIBCXX_SIMD_ALWAYS_INLINE bool
+      operator!=(const iterator& __rhs) const
       { return __mask != __rhs.__mask; }
     };
 
-    iterator begin() const
+    iterator
+    begin() const
     { return __bits.to_ullong(); }
 
-    iterator end() const
+    iterator
+    end() const
     { return 0; }
   };
 
@@ -3637,15 +3659,13 @@ where(const simd_mask<_Tp, _Ap>& __k)
 // reductions [simd.reductions] {{{1
 template <typename _Tp, typename _Abi, typename _BinaryOperation = plus<>>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
-  reduce(const simd<_Tp, _Abi>& __v,
-	 _BinaryOperation __binary_op = _BinaryOperation())
+  reduce(const simd<_Tp, _Abi>& __v, _BinaryOperation __binary_op = _BinaryOperation())
   { return _Abi::_SimdImpl::_S_reduce(__v, __binary_op); }
 
 template <typename _M, typename _V, typename _BinaryOperation = plus<>>
   _GLIBCXX_SIMD_INTRINSIC typename _V::value_type
   reduce(const const_where_expression<_M, _V>& __x,
-	 typename _V::value_type __identity_element,
-	 _BinaryOperation __binary_op)
+	 typename _V::value_type __identity_element, _BinaryOperation __binary_op)
   {
     if (__builtin_expect(none_of(__get_mask(__x)), false))
       return __identity_element;
@@ -3684,16 +3704,12 @@ reduce(const const_where_expression<_M, _V>& __x, bit_xor<> __binary_op)
 template <typename _Tp, typename _Abi>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
   hmin(const simd<_Tp, _Abi>& __v) noexcept
-  {
-    return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum());
-  }
+  { return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum()); }
 
 template <typename _Tp, typename _Abi>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
   hmax(const simd<_Tp, _Abi>& __v) noexcept
-  {
-    return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum());
-  }
+  { return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum()); }
 
 template <typename _M, typename _V>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
@@ -3761,8 +3777,7 @@ minmax(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
 
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
-  clamp(const simd<_Tp, _Ap>& __v, const simd<_Tp, _Ap>& __lo,
-	const simd<_Tp, _Ap>& __hi)
+  clamp(const simd<_Tp, _Ap>& __v, const simd<_Tp, _Ap>& __lo, const simd<_Tp, _Ap>& __hi)
   {
     using _Impl = typename _Ap::_SimdImpl;
     return {__private_init,
@@ -3783,8 +3798,7 @@ clamp(const simd<_Tp, _Ap>& __v, const simd<_Tp, _Ap>& __lo,
   _SimdWrapper<_Tp, _Np / _Total * _Combine>
   __extract_part(const _SimdWrapper<_Tp, _Np> __x);
 
-template <int _Index, int _Parts, int _Combine = 1, typename _Tp, typename _A0,
-	  typename... _As>
+template <int _Index, int _Parts, int _Combine = 1, typename _Tp, typename _A0, typename... _As>
   _GLIBCXX_SIMD_INTRINSIC auto
   __extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x);
 
@@ -3794,7 +3808,8 @@ clamp(const simd<_Tp, _Ap>& __v, const simd<_Tp, _Ap>& __lo,
   struct _SizeList
   {
     template <size_t _I>
-      static constexpr size_t _S_at(_SizeConstant<_I> = {})
+      static constexpr size_t
+      _S_at(_SizeConstant<_I> = {})
       {
 	if constexpr (_I == 0)
 	  return _V0;
@@ -3803,7 +3818,8 @@ struct _SizeList
       }
 
     template <size_t _I>
-      static constexpr auto _S_before(_SizeConstant<_I> = {})
+      static constexpr auto
+      _S_before(_SizeConstant<_I> = {})
       {
 	if constexpr (_I == 0)
 	  return _SizeConstant<0>();
@@ -3813,7 +3829,8 @@ struct _SizeList
       }
 
     template <size_t _Np>
-      static constexpr auto _S_pop_front(_SizeConstant<_Np> = {})
+      static constexpr auto
+      _S_pop_front(_SizeConstant<_Np> = {})
       {
 	if constexpr (_Np == 0)
 	  return _SizeList();
@@ -3965,8 +3982,7 @@ split(const simd<typename _V::value_type, _Ap>& __x)
 // }}}
 // split<simd_mask>(simd_mask) {{{
 template <typename _V, typename _Ap,
-	  size_t _Parts
-	  = simd_size_v<typename _V::simd_type::value_type, _Ap> / _V::size()>
+	  size_t _Parts = simd_size_v<typename _V::simd_type::value_type, _Ap> / _V::size()>
   enable_if_t<is_simd_mask_v<_V> && simd_size_v<typename
     _V::simd_type::value_type, _Ap> == _Parts * _V::size(), array<_V, _Parts>>
   split(const simd_mask<typename _V::simd_type::value_type, _Ap>& __x)
@@ -4131,8 +4147,7 @@ static_assert(
 // __store_pack_of_simd {{{
 template <typename _Tp, typename _A0, typename... _As>
   _GLIBCXX_SIMD_INTRINSIC void
-  __store_pack_of_simd(char* __mem, const simd<_Tp, _A0>& __x0,
-		       const simd<_Tp, _As>&... __xs)
+  __store_pack_of_simd(char* __mem, const simd<_Tp, _A0>& __x0, const simd<_Tp, _As>&... __xs)
   {
     constexpr size_t __n_bytes = sizeof(_Tp) * simd_size_v<_Tp, _A0>;
     __builtin_memcpy(__mem, &__data(__x0), __n_bytes);
@@ -4188,7 +4203,8 @@ class _SmartReference
     int _M_index;
     _Up& _M_obj;
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr _ValueType _M_read() const noexcept
+    _GLIBCXX_SIMD_INTRINSIC constexpr _ValueType
+    _M_read() const noexcept
     {
       if constexpr (is_arithmetic_v<_Up>)
 	return _M_obj;
@@ -4197,7 +4213,8 @@ class _SmartReference
     }
 
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC constexpr void _M_write(_Tp&& __x) const
+      _GLIBCXX_SIMD_INTRINSIC constexpr void
+      _M_write(_Tp&& __x) const
       { _Accessor::_S_set(_M_obj, _M_index, static_cast<_Tp&&>(__x)); }
 
   public:
@@ -4207,32 +4224,32 @@ _SmartReference(_Up& __o, int __i) noexcept
 
     using value_type = _ValueType;
 
-    _GLIBCXX_SIMD_INTRINSIC _SmartReference(const _SmartReference&) = delete;
+    _GLIBCXX_SIMD_INTRINSIC
+    _SmartReference(const _SmartReference&) = delete;
 
-    _GLIBCXX_SIMD_INTRINSIC constexpr operator value_type() const noexcept
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    operator value_type() const noexcept
     { return _M_read(); }
 
-    template <typename _Tp,
-	      typename
-	      = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, value_type>>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator=(_Tp&& __x) &&
+    template <typename _Tp, typename = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, value_type>>
+      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference
+      operator=(_Tp&& __x) &&
       {
 	_M_write(static_cast<_Tp&&>(__x));
 	return {_M_obj, _M_index};
       }
 
-#define _GLIBCXX_SIMD_OP_(__op)                                                \
-    template <typename _Tp,                                                    \
-	      typename _TT                                                     \
-	      = decltype(declval<value_type>() __op declval<_Tp>()),           \
-	      typename = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, _TT>,    \
-	      typename = _ValuePreservingOrInt<_TT, value_type>>               \
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference                        \
-      operator __op##=(_Tp&& __x) &&                                           \
-      {                                                                        \
-	const value_type& __lhs = _M_read();                                   \
-	_M_write(__lhs __op __x);                                              \
-	return {_M_obj, _M_index};                                             \
+#define _GLIBCXX_SIMD_OP_(__op)                                                   \
+    template <typename _Tp,                                                       \
+	      typename _TT = decltype(declval<value_type>() __op declval<_Tp>()), \
+	      typename = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, _TT>,       \
+	      typename = _ValuePreservingOrInt<_TT, value_type>>                  \
+      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference                           \
+      operator __op##=(_Tp&& __x) &&                                              \
+      {                                                                           \
+	const value_type& __lhs = _M_read();                                      \
+	_M_write(__lhs __op __x);                                                 \
+	return {_M_obj, _M_index};                                                \
       }
     _GLIBCXX_SIMD_ALL_ARITHMETICS(_GLIBCXX_SIMD_OP_);
     _GLIBCXX_SIMD_ALL_SHIFTS(_GLIBCXX_SIMD_OP_);
@@ -4240,9 +4257,9 @@ _SmartReference(_Up& __o, int __i) noexcept
 #undef _GLIBCXX_SIMD_OP_
 
     template <typename _Tp = void,
-	      typename
-	      = decltype(++declval<conditional_t<true, value_type, _Tp>&>())>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator++() &&
+	      typename = decltype(++declval<conditional_t<true, value_type, _Tp>&>())>
+      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference
+      operator++() &&
       {
 	value_type __x = _M_read();
 	_M_write(++__x);
@@ -4250,9 +4267,9 @@ _SmartReference(_Up& __o, int __i) noexcept
       }
 
     template <typename _Tp = void,
-	      typename
-	      = decltype(declval<conditional_t<true, value_type, _Tp>&>()++)>
-      _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator++(int) &&
+	      typename = decltype(declval<conditional_t<true, value_type, _Tp>&>()++)>
+      _GLIBCXX_SIMD_INTRINSIC constexpr value_type
+      operator++(int) &&
       {
 	const value_type __r = _M_read();
 	value_type __x = __r;
@@ -4261,9 +4278,9 @@ _SmartReference(_Up& __o, int __i) noexcept
       }
 
     template <typename _Tp = void,
-	      typename
-	      = decltype(--declval<conditional_t<true, value_type, _Tp>&>())>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator--() &&
+	      typename = decltype(--declval<conditional_t<true, value_type, _Tp>&>())>
+      _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference
+      operator--() &&
       {
 	value_type __x = _M_read();
 	_M_write(--__x);
@@ -4271,9 +4288,9 @@ _SmartReference(_Up& __o, int __i) noexcept
       }
 
     template <typename _Tp = void,
-	      typename
-	      = decltype(declval<conditional_t<true, value_type, _Tp>&>()--)>
-      _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator--(int) &&
+	      typename = decltype(declval<conditional_t<true, value_type, _Tp>&>()--)>
+      _GLIBCXX_SIMD_INTRINSIC constexpr value_type
+      operator--(int) &&
       {
 	const value_type __r = _M_read();
 	value_type __x = __r;
@@ -4349,7 +4366,8 @@ struct __decay_abi<__scalar_abi_wrapper<_Bytes>>
 template <template <int> class _Abi, int _Bytes, typename _Tp>
   struct __find_next_valid_abi
   {
-    static constexpr auto _S_choose()
+    static constexpr auto
+    _S_choose()
     {
       constexpr int _NextBytes = std::__bit_ceil(_Bytes) / 2;
       using _NextAbi = _Abi<_NextBytes>;
@@ -4393,7 +4411,8 @@ struct _AbiList<_A0, _Rest...>
 	typename _AbiList<_Rest...>::template _FirstValidAbi<_Tp, _Np>>;
 
     template <typename _Tp, int _Np>
-      static constexpr auto _S_determine_best_abi()
+      static constexpr auto
+      _S_determine_best_abi()
       {
 	static_assert(_Np >= 1);
 	constexpr int _Bytes = sizeof(_Tp) * _Np;
@@ -4556,17 +4575,15 @@ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
     template <typename _Flags>
       _GLIBCXX_SIMD_ALWAYS_INLINE
       simd_mask(const value_type* __mem, _Flags)
-      : _M_data(_Impl::template _S_load<_Ip>(
-	_Flags::template _S_apply<simd_mask>(__mem))) {}
+      : _M_data(_Impl::template _S_load<_Ip>(_Flags::template _S_apply<simd_mask>(__mem))) {}
 
     template <typename _Flags>
       _GLIBCXX_SIMD_ALWAYS_INLINE
       simd_mask(const value_type* __mem, simd_mask __k, _Flags)
       : _M_data{}
       {
-	_M_data
-	  = _Impl::_S_masked_load(_M_data, __k._M_data,
-				  _Flags::template _S_apply<simd_mask>(__mem));
+	_M_data = _Impl::_S_masked_load(_M_data, __k._M_data,
+					_Flags::template _S_apply<simd_mask>(__mem));
       }
 
     // }}}
@@ -4574,10 +4591,7 @@ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
     template <typename _Flags>
       _GLIBCXX_SIMD_ALWAYS_INLINE void
       copy_from(const value_type* __mem, _Flags)
-      {
-	_M_data = _Impl::template _S_load<_Ip>(
-	  _Flags::template _S_apply<simd_mask>(__mem));
-      }
+      { _M_data = _Impl::template _S_load<_Ip>(_Flags::template _S_apply<simd_mask>(__mem)); }
 
     // }}}
     // stores [simd_mask.store] {{{
@@ -4618,8 +4632,7 @@ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
   #ifdef _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
     // simd_mask<int> && simd_mask<uint> needs disambiguation
     template <typename _Up, typename _A2,
-	      typename
-	      = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
+	      typename = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
       _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
       operator&&(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
       {
@@ -4628,8 +4641,7 @@ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
       }
 
     template <typename _Up, typename _A2,
-	      typename
-	      = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
+	      typename = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
       _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
       operator||(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
       {
@@ -4640,15 +4652,11 @@ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
 
     _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
     operator&&(const simd_mask& __x, const simd_mask& __y)
-    {
-      return {__private_init, _Impl::_S_logical_and(__x._M_data, __y._M_data)};
-    }
+    { return {__private_init, _Impl::_S_logical_and(__x._M_data, __y._M_data)}; }
 
     _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
     operator||(const simd_mask& __x, const simd_mask& __y)
-    {
-      return {__private_init, _Impl::_S_logical_or(__x._M_data, __y._M_data)};
-    }
+    { return {__private_init, _Impl::_S_logical_or(__x._M_data, __y._M_data)}; }
 
     _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
     operator&(const simd_mask& __x, const simd_mask& __y)
@@ -4714,8 +4722,7 @@ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
     // }}}
     // bitset_init ctor {{{
     _GLIBCXX_SIMD_INTRINSIC simd_mask(_BitsetInit, bitset<size()> __init)
-    : _M_data(
-	_Impl::_S_from_bitmask(_SanitizedBitMask<size()>(__init), _S_type_tag))
+    : _M_data(_Impl::_S_from_bitmask(_SanitizedBitMask<size()>(__init), _S_type_tag))
     {}
 
     // }}}
@@ -4727,8 +4734,7 @@ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
     struct _CvtProxy
     {
       template <typename _Up, typename _A2,
-		typename
-		= enable_if_t<simd_size_v<_Up, _A2> == simd_size_v<_Tp, _Abi>>>
+		typename = enable_if_t<simd_size_v<_Up, _A2> == simd_size_v<_Tp, _Abi>>>
 	_GLIBCXX_SIMD_ALWAYS_INLINE
 	operator simd_mask<_Up, _A2>() &&
 	{
@@ -5419,26 +5425,17 @@ namespace __float_bitwise_operators
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
   operator^(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
-  {
-    return {__private_init,
-	    _Ap::_SimdImpl::_S_bit_xor(__data(__a), __data(__b))};
-  }
+  { return {__private_init, _Ap::_SimdImpl::_S_bit_xor(__data(__a), __data(__b))}; }
 
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
   operator|(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
-  {
-    return {__private_init,
-	    _Ap::_SimdImpl::_S_bit_or(__data(__a), __data(__b))};
-  }
+  { return {__private_init, _Ap::_SimdImpl::_S_bit_or(__data(__a), __data(__b))}; }
 
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
   operator&(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
-  {
-    return {__private_init,
-	    _Ap::_SimdImpl::_S_bit_and(__data(__a), __data(__b))};
-  }
+  { return {__private_init, _Ap::_SimdImpl::_S_bit_and(__data(__a), __data(__b))}; }
 
 template <typename _Tp, typename _Ap>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
index 4a4de4534f3..0e75f941288 100644
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -836,22 +836,19 @@ struct _GnuTraits
     // _SimdBase / base class for simd, providing extra conversions {{{
     struct _SimdBase2
     {
-      _GLIBCXX_SIMD_ALWAYS_INLINE
-      explicit operator __intrinsic_type_t<_Tp, _Np>() const
-      {
-	return __to_intrin(static_cast<const simd<_Tp, _Abi>*>(this)->_M_data);
-      }
-      _GLIBCXX_SIMD_ALWAYS_INLINE
-      explicit operator __vector_type_t<_Tp, _Np>() const
-      {
-	return static_cast<const simd<_Tp, _Abi>*>(this)->_M_data.__builtin();
-      }
+      _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+      operator __intrinsic_type_t<_Tp, _Np>() const
+      { return __to_intrin(static_cast<const simd<_Tp, _Abi>*>(this)->_M_data); }
+
+      _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+      operator __vector_type_t<_Tp, _Np>() const
+      { return static_cast<const simd<_Tp, _Abi>*>(this)->_M_data.__builtin(); }
     };
 
     struct _SimdBase1
     {
-      _GLIBCXX_SIMD_ALWAYS_INLINE
-      explicit operator __intrinsic_type_t<_Tp, _Np>() const
+      _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+      operator __intrinsic_type_t<_Tp, _Np>() const
       { return __data(*static_cast<const simd<_Tp, _Abi>*>(this)); }
     };
 
@@ -863,23 +860,19 @@ struct _SimdBase1
     // _MaskBase {{{
     struct _MaskBase2
     {
-      _GLIBCXX_SIMD_ALWAYS_INLINE
-      explicit operator __intrinsic_type_t<_Tp, _Np>() const
-      {
-	return static_cast<const simd_mask<_Tp, _Abi>*>(this)
-	  ->_M_data.__intrin();
-      }
-      _GLIBCXX_SIMD_ALWAYS_INLINE
-      explicit operator __vector_type_t<_Tp, _Np>() const
-      {
-	return static_cast<const simd_mask<_Tp, _Abi>*>(this)->_M_data._M_data;
-      }
+      _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+      operator __intrinsic_type_t<_Tp, _Np>() const
+      { return static_cast<const simd_mask<_Tp, _Abi>*>(this)->_M_data.__intrin(); }
+
+      _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+      operator __vector_type_t<_Tp, _Np>() const
+      { return static_cast<const simd_mask<_Tp, _Abi>*>(this)->_M_data._M_data; }
     };
 
     struct _MaskBase1
     {
-      _GLIBCXX_SIMD_ALWAYS_INLINE
-      explicit operator __intrinsic_type_t<_Tp, _Np>() const
+      _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+      operator __intrinsic_type_t<_Tp, _Np>() const
       { return __data(*static_cast<const simd_mask<_Tp, _Abi>*>(this)); }
     };
 
@@ -898,6 +891,7 @@ class _MaskCastType
     public:
       _GLIBCXX_SIMD_ALWAYS_INLINE
       _MaskCastType(_Up __x) : _M_data(__x) {}
+
       _GLIBCXX_SIMD_ALWAYS_INLINE
       operator _MaskMember() const { return _M_data; }
     };
@@ -913,6 +907,7 @@ class _SimdCastType1
     public:
       _GLIBCXX_SIMD_ALWAYS_INLINE
       _SimdCastType1(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
+
       _GLIBCXX_SIMD_ALWAYS_INLINE
       operator _SimdMember() const { return _M_data; }
     };
@@ -926,8 +921,10 @@ class _SimdCastType2
     public:
       _GLIBCXX_SIMD_ALWAYS_INLINE
       _SimdCastType2(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
+
       _GLIBCXX_SIMD_ALWAYS_INLINE
       _SimdCastType2(_Bp __b) : _M_data(__b) {}
+
       _GLIBCXX_SIMD_ALWAYS_INLINE
       operator _SimdMember() const { return _M_data; }
     };
@@ -1039,16 +1036,13 @@ _S_implicit_mask()
       }
 
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static constexpr __intrinsic_type_t<_Tp,
-								  _S_size<_Tp>>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr __intrinsic_type_t<_Tp, _S_size<_Tp>>
       _S_implicit_mask_intrin()
-      {
-	return __to_intrin(
-	  __vector_bitcast<_Tp>(_S_implicit_mask<_Tp>()._M_data));
-      }
+      { return __to_intrin(__vector_bitcast<_Tp>(_S_implicit_mask<_Tp>()._M_data)); }
 
     template <typename _TW, typename _TVT = _VectorTraits<_TW>>
-      _GLIBCXX_SIMD_INTRINSIC static constexpr _TW _S_masked(_TW __x)
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _TW
+      _S_masked(_TW __x)
       {
 	using _Tp = typename _TVT::value_type;
 	if constexpr (!_MaskMember<_Tp>::_S_is_partial)
@@ -1170,8 +1164,7 @@ _S_implicit_mask()
       { return __implicit_mask_n<_S_size<_Tp>>(); }
 
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static constexpr __bool_storage_member_type_t<
-	_S_size<_Tp>>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr __bool_storage_member_type_t<_S_size<_Tp>>
       _S_implicit_mask_intrin()
       { return __implicit_mask_n<_S_size<_Tp>>(); }
 
@@ -1303,7 +1296,8 @@ _S_load(const void* __p)
   // }}}
   // _S_store {{{
   template <size_t _ReqBytes = 0, typename _TV>
-    _GLIBCXX_SIMD_INTRINSIC static void _S_store(_TV __x, void* __addr)
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_store(_TV __x, void* __addr)
     {
       constexpr size_t _Bytes = _ReqBytes == 0 ? sizeof(__x) : _ReqBytes;
       static_assert(sizeof(__x) >= _Bytes);
@@ -1339,8 +1333,8 @@ _S_load(const void* __p)
     }
 
   template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static void _S_store(_SimdWrapper<_Tp, _Np> __x,
-						 void* __addr)
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_store(_SimdWrapper<_Tp, _Np> __x, void* __addr)
     { _S_store<_Np * sizeof(_Tp)>(__x._M_data, __addr); }
 
   // }}}
@@ -1447,8 +1441,8 @@ _S_broadcast(_Tp __x) noexcept
 
     // _S_generator {{{2
     template <typename _Fp, typename _Tp>
-      inline static constexpr _SimdMember<_Tp> _S_generator(_Fp&& __gen,
-							    _TypeTag<_Tp>)
+      inline static constexpr _SimdMember<_Tp>
+      _S_generator(_Fp&& __gen, _TypeTag<_Tp>)
       {
 	return __generate_vector<_Tp, _S_full_size<_Tp>>(
 		 [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1569,8 +1563,7 @@ _S_masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
     // _S_masked_store_nocvt {{{2
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static void
-      _S_masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem,
-			    _MaskMember<_Tp> __k)
+      _S_masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _MaskMember<_Tp> __k)
       {
 	_BitOps::_S_bit_iteration(
 	  _MaskImpl::_S_to_bits(__k),
@@ -1583,8 +1576,7 @@ _S_masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem,
     template <typename _TW, typename _TVT = _VectorTraits<_TW>,
 	      typename _Tp = typename _TVT::value_type, typename _Up>
       static inline void
-      _S_masked_store(const _TW __v, _Up* __mem, const _MaskMember<_Tp> __k)
-	noexcept
+      _S_masked_store(const _TW __v, _Up* __mem, const _MaskMember<_Tp> __k) noexcept
       {
 	constexpr size_t _TV_size = _S_size<_Tp>;
 	[[maybe_unused]] const auto __vi = __to_intrin(__v);
@@ -1946,7 +1938,8 @@ _S_reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
     // frexp, modf and copysign implemented in simd_math.h
 #define _GLIBCXX_SIMD_MATH_FALLBACK(__name)                                    \
     template <typename _Tp, typename... _More>                                 \
-      static _Tp _S_##__name(const _Tp& __x, const _More&... __more)           \
+      static _Tp                                                               \
+      _S_##__name(const _Tp& __x, const _More&... __more)                      \
       {                                                                        \
 	return __generate_vector<_Tp>(                                         \
 		 [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {            \
@@ -1956,8 +1949,8 @@ _S_reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
 
 #define _GLIBCXX_SIMD_MATH_FALLBACK_MASKRET(__name)                            \
     template <typename _Tp, typename... _More>                                 \
-      static typename _Tp::mask_type _S_##__name(const _Tp& __x,               \
-						 const _More&... __more)       \
+      static typename _Tp::mask_type                                           \
+      _S_##__name(const _Tp& __x, const _More&... __more)                      \
       {                                                                        \
 	return __generate_vector<_Tp>(                                         \
 		 [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {            \
@@ -1967,7 +1960,8 @@ _S_reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
 
 #define _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(_RetTp, __name)                          \
     template <typename _Tp, typename... _More>                                        \
-      static auto _S_##__name(const _Tp& __x, const _More&... __more)                 \
+      static auto                                                                     \
+      _S_##__name(const _Tp& __x, const _More&... __more)                             \
       {                                                                               \
 	return __fixed_size_storage_t<_RetTp,                                         \
 				      _VectorTraits<_Tp>::_S_partial_width>::         \
@@ -2115,22 +2109,22 @@ _S_islessgreater(_SimdWrapper<_Tp, _Np> __x,
 #undef _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET
     // _S_abs {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
-    _S_abs(_SimdWrapper<_Tp, _Np> __x) noexcept
-    {
-      // if (__builtin_is_constant_evaluated())
-      //  {
-      //    return __x._M_data < 0 ? -__x._M_data : __x._M_data;
-      //  }
-      if constexpr (is_floating_point_v<_Tp>)
-	// `v < 0 ? -v : v` cannot compile to the efficient implementation of
-	// masking the signbit off because it must consider v == -0
-
-	// ~(-0.) & v would be easy, but breaks with fno-signed-zeros
-	return __and(_S_absmask<__vector_type_t<_Tp, _Np>>, __x._M_data);
-      else
-	return __x._M_data < 0 ? -__x._M_data : __x._M_data;
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+      _S_abs(_SimdWrapper<_Tp, _Np> __x) noexcept
+      {
+	// if (__builtin_is_constant_evaluated())
+	//  {
+	//    return __x._M_data < 0 ? -__x._M_data : __x._M_data;
+	//  }
+	if constexpr (is_floating_point_v<_Tp>)
+	  // `v < 0 ? -v : v` cannot compile to the efficient implementation of
+	  // masking the signbit off because it must consider v == -0
+
+	  // ~(-0.) & v would be easy, but breaks with -fno-signed-zeros
+	  return __and(_S_absmask<__vector_type_t<_Tp, _Np>>, __x._M_data);
+	else
+	  return __x._M_data < 0 ? -__x._M_data : __x._M_data;
+      }
 
     // }}}3
     // _S_plus_minus {{{
@@ -2138,318 +2132,316 @@ _S_abs(_SimdWrapper<_Tp, _Np> __x) noexcept
     // - _TV must be __vector_type_t<floating-point type, N>.
     // - _UV must be _TV or floating-point type.
     template <typename _TV, typename _UV>
-    _GLIBCXX_SIMD_INTRINSIC static constexpr _TV _S_plus_minus(_TV __x,
-							       _UV __y) noexcept
-    {
-  #if defined __i386__ && !defined __SSE_MATH__
-      if constexpr (sizeof(__x) == 8)
-	{ // operations on __x would use the FPU
-	  static_assert(is_same_v<_TV, __vector_type_t<float, 2>>);
-	  const auto __x4 = __vector_bitcast<float, 4>(__x);
-	  if constexpr (is_same_v<_TV, _UV>)
-	    return __vector_bitcast<float, 2>(
-	      _S_plus_minus(__x4, __vector_bitcast<float, 4>(__y)));
-	  else
-	    return __vector_bitcast<float, 2>(_S_plus_minus(__x4, __y));
-	}
-  #endif
-  #if !defined __clang__ && __GCC_IEC_559 == 0
-      if (__builtin_is_constant_evaluated()
-	  || (__builtin_constant_p(__x) && __builtin_constant_p(__y)))
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _TV
+      _S_plus_minus(_TV __x, _UV __y) noexcept
+      {
+#if defined __i386__ && !defined __SSE_MATH__
+	if constexpr (sizeof(__x) == 8)
+	  { // operations on __x would use the FPU
+	    static_assert(is_same_v<_TV, __vector_type_t<float, 2>>);
+	    const auto __x4 = __vector_bitcast<float, 4>(__x);
+	    if constexpr (is_same_v<_TV, _UV>)
+	      return __vector_bitcast<float, 2>(
+		       _S_plus_minus(__x4, __vector_bitcast<float, 4>(__y)));
+	    else
+	      return __vector_bitcast<float, 2>(_S_plus_minus(__x4, __y));
+	  }
+#endif
+#if !defined __clang__ && __GCC_IEC_559 == 0
+	if (__builtin_is_constant_evaluated()
+	      || (__builtin_constant_p(__x) && __builtin_constant_p(__y)))
+	  return (__x + __y) - __y;
+	else
+	  return [&] {
+	    __x += __y;
+	    if constexpr(__have_sse)
+	      {
+		if constexpr (sizeof(__x) >= 16)
+		  asm("" : "+x"(__x));
+		else if constexpr (is_same_v<__vector_type_t<float, 2>, _TV>)
+		  asm("" : "+x"(__x[0]), "+x"(__x[1]));
+		else
+		  __assert_unreachable<_TV>();
+	      }
+	    else if constexpr(__have_neon)
+	      asm("" : "+w"(__x));
+	    else if constexpr (__have_power_vmx)
+	      {
+		if constexpr (is_same_v<__vector_type_t<float, 2>, _TV>)
+		  asm("" : "+fgr"(__x[0]), "+fgr"(__x[1]));
+		else
+		  asm("" : "+v"(__x));
+	      }
+	    else
+	      asm("" : "+g"(__x));
+	    return __x - __y;
+	  }();
+#else
 	return (__x + __y) - __y;
-      else
-	return [&] {
-	  __x += __y;
-	  if constexpr(__have_sse)
-	    {
-	      if constexpr (sizeof(__x) >= 16)
-		asm("" : "+x"(__x));
-	      else if constexpr (is_same_v<__vector_type_t<float, 2>, _TV>)
-		asm("" : "+x"(__x[0]), "+x"(__x[1]));
-	      else
-		__assert_unreachable<_TV>();
-	    }
-	  else if constexpr(__have_neon)
-	    asm("" : "+w"(__x));
-	  else if constexpr (__have_power_vmx)
-	    {
-	      if constexpr (is_same_v<__vector_type_t<float, 2>, _TV>)
-		asm("" : "+fgr"(__x[0]), "+fgr"(__x[1]));
-	      else
-		asm("" : "+v"(__x));
-	    }
-	  else
-	    asm("" : "+g"(__x));
-	  return __x - __y;
-	}();
-  #else
-      return (__x + __y) - __y;
-  #endif
-    }
+#endif
+      }
 
     // }}}
     // _S_nearbyint {{{3
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_nearbyint(_Tp __x_) noexcept
-    {
-      using value_type = typename _TVT::value_type;
-      using _V = typename _TVT::type;
-      const _V __x = __x_;
-      const _V __absx = __and(__x, _S_absmask<_V>);
-      static_assert(__CHAR_BIT__ * sizeof(1ull) >= __digits_v<value_type>);
-      _GLIBCXX_SIMD_USE_CONSTEXPR _V __shifter_abs
-	= _V() + (1ull << (__digits_v<value_type> - 1));
-      const _V __shifter = __or(__and(_S_signmask<_V>, __x), __shifter_abs);
-      const _V __shifted = _S_plus_minus(__x, __shifter);
-      return __absx < __shifter_abs ? __shifted : __x;
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _S_nearbyint(_Tp __x_) noexcept
+      {
+	using value_type = typename _TVT::value_type;
+	using _V = typename _TVT::type;
+	const _V __x = __x_;
+	const _V __absx = __and(__x, _S_absmask<_V>);
+	static_assert(__CHAR_BIT__ * sizeof(1ull) >= __digits_v<value_type>);
+	_GLIBCXX_SIMD_USE_CONSTEXPR _V __shifter_abs
+	  = _V() + (1ull << (__digits_v<value_type> - 1));
+	const _V __shifter = __or(__and(_S_signmask<_V>, __x), __shifter_abs);
+	const _V __shifted = _S_plus_minus(__x, __shifter);
+	return __absx < __shifter_abs ? __shifted : __x;
+      }
 
     // _S_rint {{{3
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_rint(_Tp __x) noexcept
-    {
-      return _SuperImpl::_S_nearbyint(__x);
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _S_rint(_Tp __x) noexcept
+      { return _SuperImpl::_S_nearbyint(__x); }
 
     // _S_trunc {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
-    _S_trunc(_SimdWrapper<_Tp, _Np> __x)
-    {
-      using _V = __vector_type_t<_Tp, _Np>;
-      const _V __absx = __and(__x._M_data, _S_absmask<_V>);
-      static_assert(__CHAR_BIT__ * sizeof(1ull) >= __digits_v<_Tp>);
-      constexpr _Tp __shifter = 1ull << (__digits_v<_Tp> - 1);
-      _V __truncated = _S_plus_minus(__absx, __shifter);
-      __truncated -= __truncated > __absx ? _V() + 1 : _V();
-      return __absx < __shifter ? __or(__xor(__absx, __x._M_data), __truncated)
-				: __x._M_data;
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+      _S_trunc(_SimdWrapper<_Tp, _Np> __x)
+      {
+	using _V = __vector_type_t<_Tp, _Np>;
+	const _V __absx = __and(__x._M_data, _S_absmask<_V>);
+	static_assert(__CHAR_BIT__ * sizeof(1ull) >= __digits_v<_Tp>);
+	constexpr _Tp __shifter = 1ull << (__digits_v<_Tp> - 1);
+	_V __truncated = _S_plus_minus(__absx, __shifter);
+	__truncated -= __truncated > __absx ? _V() + 1 : _V();
+	return __absx < __shifter ? __or(__xor(__absx, __x._M_data), __truncated)
+				  : __x._M_data;
+      }
 
     // _S_round {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
-    _S_round(_SimdWrapper<_Tp, _Np> __x)
-    {
-      const auto __abs_x = _SuperImpl::_S_abs(__x);
-      const auto __t_abs = _SuperImpl::_S_trunc(__abs_x)._M_data;
-      const auto __r_abs // round(abs(x)) =
-	= __t_abs + (__abs_x._M_data - __t_abs >= _Tp(.5) ? _Tp(1) : 0);
-      return __or(__xor(__abs_x._M_data, __x._M_data), __r_abs);
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+      _S_round(_SimdWrapper<_Tp, _Np> __x)
+      {
+	const auto __abs_x = _SuperImpl::_S_abs(__x);
+	const auto __t_abs = _SuperImpl::_S_trunc(__abs_x)._M_data;
+	const auto __r_abs // round(abs(x)) =
+	  = __t_abs + (__abs_x._M_data - __t_abs >= _Tp(.5) ? _Tp(1) : 0);
+	return __or(__xor(__abs_x._M_data, __x._M_data), __r_abs);
+      }
 
     // _S_floor {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
-    _S_floor(_SimdWrapper<_Tp, _Np> __x)
-    {
-      const auto __y = _SuperImpl::_S_trunc(__x)._M_data;
-      const auto __negative_input
-	= __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
-      const auto __mask
-	= __andnot(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
-      return __or(__andnot(__mask, __y),
-		  __and(__mask, __y - __vector_broadcast<_Np, _Tp>(1)));
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+      _S_floor(_SimdWrapper<_Tp, _Np> __x)
+      {
+	const auto __y = _SuperImpl::_S_trunc(__x)._M_data;
+	const auto __negative_input
+	  = __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
+	const auto __mask
+	  = __andnot(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
+	return __or(__andnot(__mask, __y),
+		    __and(__mask, __y - __vector_broadcast<_Np, _Tp>(1)));
+      }
 
     // _S_ceil {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
-    _S_ceil(_SimdWrapper<_Tp, _Np> __x)
-    {
-      const auto __y = _SuperImpl::_S_trunc(__x)._M_data;
-      const auto __negative_input
-	= __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
-      const auto __inv_mask
-	= __or(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
-      return __or(__and(__inv_mask, __y),
-		  __andnot(__inv_mask, __y + __vector_broadcast<_Np, _Tp>(1)));
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+      _S_ceil(_SimdWrapper<_Tp, _Np> __x)
+      {
+	const auto __y = _SuperImpl::_S_trunc(__x)._M_data;
+	const auto __negative_input
+	  = __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
+	const auto __inv_mask
+	  = __or(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
+	return __or(__and(__inv_mask, __y),
+		    __andnot(__inv_mask, __y + __vector_broadcast<_Np, _Tp>(1)));
+      }
 
     // _S_isnan {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
-    _S_isnan([[maybe_unused]] _SimdWrapper<_Tp, _Np> __x)
-    {
-  #if __FINITE_MATH_ONLY__
-      return {}; // false
-  #elif !defined __SUPPORT_SNAN__
-      return ~(__x._M_data == __x._M_data);
-  #elif defined __STDC_IEC_559__
-      using _Ip = __int_for_sizeof_t<_Tp>;
-      const auto __absn = __vector_bitcast<_Ip>(_SuperImpl::_S_abs(__x));
-      const auto __infn
-	= __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__infinity_v<_Tp>));
-      return __infn < __absn;
-  #else
-  #error "Not implemented: how to support SNaN but non-IEC559 floating-point?"
-  #endif
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+      _S_isnan([[maybe_unused]] _SimdWrapper<_Tp, _Np> __x)
+      {
+#if __FINITE_MATH_ONLY__
+	return {}; // false
+#elif !defined __SUPPORT_SNAN__
+	return ~(__x._M_data == __x._M_data);
+#elif defined __STDC_IEC_559__
+	using _Ip = __int_for_sizeof_t<_Tp>;
+	const auto __absn = __vector_bitcast<_Ip>(_SuperImpl::_S_abs(__x));
+	const auto __infn
+	  = __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__infinity_v<_Tp>));
+	return __infn < __absn;
+#else
+#error "Not implemented: how to support SNaN but non-IEC559 floating-point?"
+#endif
+      }
 
     // _S_isfinite {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
-    _S_isfinite([[maybe_unused]] _SimdWrapper<_Tp, _Np> __x)
-    {
-  #if __FINITE_MATH_ONLY__
-      using _UV = typename _MaskMember<_Tp>::_BuiltinType;
-      _GLIBCXX_SIMD_USE_CONSTEXPR _UV __alltrue = ~_UV();
-      return __alltrue;
-  #else
-      // if all exponent bits are set, __x is either inf or NaN
-      using _Ip = __int_for_sizeof_t<_Tp>;
-      const auto __absn = __vector_bitcast<_Ip>(_SuperImpl::_S_abs(__x));
-      const auto __maxn
-	= __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__finite_max_v<_Tp>));
-      return __absn <= __maxn;
-  #endif
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+      _S_isfinite([[maybe_unused]] _SimdWrapper<_Tp, _Np> __x)
+      {
+#if __FINITE_MATH_ONLY__
+	using _UV = typename _MaskMember<_Tp>::_BuiltinType;
+	_GLIBCXX_SIMD_USE_CONSTEXPR _UV __alltrue = ~_UV();
+	return __alltrue;
+#else
+	// if all exponent bits are set, __x is either inf or NaN
+	using _Ip = __int_for_sizeof_t<_Tp>;
+	const auto __absn = __vector_bitcast<_Ip>(_SuperImpl::_S_abs(__x));
+	const auto __maxn
+	  = __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__finite_max_v<_Tp>));
+	return __absn <= __maxn;
+#endif
+      }
 
     // _S_isunordered {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
-    _S_isunordered(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
-    {
-      return __or(_S_isnan(__x), _S_isnan(__y));
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+      _S_isunordered(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+      { return __or(_S_isnan(__x), _S_isnan(__y)); }
 
     // _S_signbit {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
-    _S_signbit(_SimdWrapper<_Tp, _Np> __x)
-    {
-      using _Ip = __int_for_sizeof_t<_Tp>;
-      return __vector_bitcast<_Ip>(__x) < 0;
-      // Arithmetic right shift (SRA) would also work (instead of compare), but
-      // 64-bit SRA isn't available on x86 before AVX512. And in general,
-      // compares are more likely to be efficient than SRA.
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+      _S_signbit(_SimdWrapper<_Tp, _Np> __x)
+      {
+	using _Ip = __int_for_sizeof_t<_Tp>;
+	return __vector_bitcast<_Ip>(__x) < 0;
+	// Arithmetic right shift (SRA) would also work (instead of compare), but
+	// 64-bit SRA isn't available on x86 before AVX512. And in general,
+	// compares are more likely to be efficient than SRA.
+      }
 
     // _S_isinf {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
-    _S_isinf([[maybe_unused]] _SimdWrapper<_Tp, _Np> __x)
-    {
-  #if __FINITE_MATH_ONLY__
-      return {}; // false
-  #else
-      return _SuperImpl::template _S_equal_to<_Tp, _Np>(_SuperImpl::_S_abs(__x),
-							__vector_broadcast<_Np>(
-							  __infinity_v<_Tp>));
-      // alternative:
-      // compare to inf using the corresponding integer type
-      /*
-	 return
-	 __vector_bitcast<_Tp>(__vector_bitcast<__int_for_sizeof_t<_Tp>>(
-			       _S_abs(__x)._M_data)
-	 ==
-	 __vector_bitcast<__int_for_sizeof_t<_Tp>>(__vector_broadcast<_Np>(
-	 __infinity_v<_Tp>)));
-	 */
-  #endif
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+      _S_isinf([[maybe_unused]] _SimdWrapper<_Tp, _Np> __x)
+      {
+#if __FINITE_MATH_ONLY__
+	return {}; // false
+#else
+	return _SuperImpl::template _S_equal_to<_Tp, _Np>(_SuperImpl::_S_abs(__x),
+							  __vector_broadcast<_Np>(
+							    __infinity_v<_Tp>));
+	// alternative:
+	// compare to inf using the corresponding integer type
+	/*
+	   return
+	   __vector_bitcast<_Tp>(__vector_bitcast<__int_for_sizeof_t<_Tp>>(
+				 _S_abs(__x)._M_data)
+	   ==
+	   __vector_bitcast<__int_for_sizeof_t<_Tp>>(__vector_broadcast<_Np>(
+	   __infinity_v<_Tp>)));
+	   */
+#endif
+      }
 
     // _S_isnormal {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
-    _S_isnormal(_SimdWrapper<_Tp, _Np> __x)
-    {
-      using _Ip = __int_for_sizeof_t<_Tp>;
-      const auto __absn = __vector_bitcast<_Ip>(_SuperImpl::_S_abs(__x));
-      const auto __minn
-	= __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__norm_min_v<_Tp>));
-  #if __FINITE_MATH_ONLY__
-      return __absn >= __minn;
-  #else
-      const auto __maxn
-	= __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__finite_max_v<_Tp>));
-      return __minn <= __absn && __absn <= __maxn;
-  #endif
-    }
+      _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+      _S_isnormal(_SimdWrapper<_Tp, _Np> __x)
+      {
+	using _Ip = __int_for_sizeof_t<_Tp>;
+	const auto __absn = __vector_bitcast<_Ip>(_SuperImpl::_S_abs(__x));
+	const auto __minn
+	  = __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__norm_min_v<_Tp>));
+#if __FINITE_MATH_ONLY__
+	return __absn >= __minn;
+#else
+	const auto __maxn
+	  = __vector_bitcast<_Ip>(__vector_broadcast<_Np>(__finite_max_v<_Tp>));
+	return __minn <= __absn && __absn <= __maxn;
+#endif
+      }
 
     // _S_fpclassify {{{3
     template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static __fixed_size_storage_t<int, _Np>
-    _S_fpclassify(_SimdWrapper<_Tp, _Np> __x)
-    {
-      using _I = __int_for_sizeof_t<_Tp>;
-      const auto __xn
-	= __vector_bitcast<_I>(__to_intrin(_SuperImpl::_S_abs(__x)));
-      constexpr size_t _NI = sizeof(__xn) / sizeof(_I);
-      _GLIBCXX_SIMD_USE_CONSTEXPR auto __minn
-	= __vector_bitcast<_I>(__vector_broadcast<_NI>(__norm_min_v<_Tp>));
-      _GLIBCXX_SIMD_USE_CONSTEXPR auto __infn
-	= __vector_bitcast<_I>(__vector_broadcast<_NI>(__infinity_v<_Tp>));
-
-      _GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_normal
-	= __vector_broadcast<_NI, _I>(FP_NORMAL);
-  #if !__FINITE_MATH_ONLY__
-      _GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_nan
-	= __vector_broadcast<_NI, _I>(FP_NAN);
-      _GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_infinite
-	= __vector_broadcast<_NI, _I>(FP_INFINITE);
-  #endif
-  #ifndef __FAST_MATH__
-      _GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_subnormal
-	= __vector_broadcast<_NI, _I>(FP_SUBNORMAL);
-  #endif
-      _GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_zero
-	= __vector_broadcast<_NI, _I>(FP_ZERO);
+      _GLIBCXX_SIMD_INTRINSIC static __fixed_size_storage_t<int, _Np>
+      _S_fpclassify(_SimdWrapper<_Tp, _Np> __x)
+      {
+	using _I = __int_for_sizeof_t<_Tp>;
+	const auto __xn
+	  = __vector_bitcast<_I>(__to_intrin(_SuperImpl::_S_abs(__x)));
+	constexpr size_t _NI = sizeof(__xn) / sizeof(_I);
+	_GLIBCXX_SIMD_USE_CONSTEXPR auto __minn
+	  = __vector_bitcast<_I>(__vector_broadcast<_NI>(__norm_min_v<_Tp>));
+	_GLIBCXX_SIMD_USE_CONSTEXPR auto __infn
+	  = __vector_bitcast<_I>(__vector_broadcast<_NI>(__infinity_v<_Tp>));
+
+	_GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_normal
+	  = __vector_broadcast<_NI, _I>(FP_NORMAL);
+#if !__FINITE_MATH_ONLY__
+	_GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_nan
+	  = __vector_broadcast<_NI, _I>(FP_NAN);
+	_GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_infinite
+	  = __vector_broadcast<_NI, _I>(FP_INFINITE);
+#endif
+#ifndef __FAST_MATH__
+	_GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_subnormal
+	  = __vector_broadcast<_NI, _I>(FP_SUBNORMAL);
+#endif
+	_GLIBCXX_SIMD_USE_CONSTEXPR auto __fp_zero
+	  = __vector_broadcast<_NI, _I>(FP_ZERO);
 
-      __vector_type_t<_I, _NI>
-	__tmp = __xn < __minn
+	__vector_type_t<_I, _NI>
+	  __tmp = __xn < __minn
   #ifdef __FAST_MATH__
-		  ? __fp_zero
+		    ? __fp_zero
   #else
-		  ? (__xn == 0 ? __fp_zero : __fp_subnormal)
+		    ? (__xn == 0 ? __fp_zero : __fp_subnormal)
   #endif
   #if __FINITE_MATH_ONLY__
-		  : __fp_normal;
+		    : __fp_normal;
   #else
-		  : (__xn < __infn ? __fp_normal
-				   : (__xn == __infn ? __fp_infinite : __fp_nan));
+		    : (__xn < __infn ? __fp_normal
+				     : (__xn == __infn ? __fp_infinite : __fp_nan));
   #endif
 
-      if constexpr (sizeof(_I) == sizeof(int))
-	{
-	  using _FixedInt = __fixed_size_storage_t<int, _Np>;
-	  const auto __as_int = __vector_bitcast<int, _Np>(__tmp);
-	  if constexpr (_FixedInt::_S_tuple_size == 1)
-	    return {__as_int};
-	  else if constexpr (_FixedInt::_S_tuple_size == 2
-			     && is_same_v<
-			       typename _FixedInt::_SecondType::_FirstAbi,
-			       simd_abi::scalar>)
-	    return {__extract<0, 2>(__as_int), __as_int[_Np - 1]};
-	  else if constexpr (_FixedInt::_S_tuple_size == 2)
-	    return {__extract<0, 2>(__as_int),
-		    __auto_bitcast(__extract<1, 2>(__as_int))};
-	  else
-	    __assert_unreachable<_Tp>();
-	}
-      else if constexpr (_Np == 2 && sizeof(_I) == 8
-			 && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 2)
-	{
-	  const auto __aslong = __vector_bitcast<_LLong>(__tmp);
-	  return {int(__aslong[0]), {int(__aslong[1])}};
-	}
-  #if _GLIBCXX_SIMD_X86INTRIN
-      else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 32
-			 && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
-	return {_mm_packs_epi32(__to_intrin(__lo128(__tmp)),
-				__to_intrin(__hi128(__tmp)))};
-      else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 64
-			 && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
-	return {_mm512_cvtepi64_epi32(__to_intrin(__tmp))};
-  #endif // _GLIBCXX_SIMD_X86INTRIN
-      else if constexpr (__fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
-	return {__call_with_subscripts<_Np>(__vector_bitcast<_LLong>(__tmp),
-					    [](auto... __l) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
-					      return __make_wrapper<int>(__l...);
-					    })};
-      else
-	__assert_unreachable<_Tp>();
-    }
+	if constexpr (sizeof(_I) == sizeof(int))
+	  {
+	    using _FixedInt = __fixed_size_storage_t<int, _Np>;
+	    const auto __as_int = __vector_bitcast<int, _Np>(__tmp);
+	    if constexpr (_FixedInt::_S_tuple_size == 1)
+	      return {__as_int};
+	    else if constexpr (_FixedInt::_S_tuple_size == 2
+				 && is_same_v<
+				      typename _FixedInt::_SecondType::_FirstAbi,
+				      simd_abi::scalar>)
+	      return {__extract<0, 2>(__as_int), __as_int[_Np - 1]};
+	    else if constexpr (_FixedInt::_S_tuple_size == 2)
+	      return {__extract<0, 2>(__as_int),
+		      __auto_bitcast(__extract<1, 2>(__as_int))};
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (_Np == 2 && sizeof(_I) == 8
+			     && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 2)
+	  {
+	    const auto __aslong = __vector_bitcast<_LLong>(__tmp);
+	    return {int(__aslong[0]), {int(__aslong[1])}};
+	  }
+#if _GLIBCXX_SIMD_X86INTRIN
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 32
+			     && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+	  return {_mm_packs_epi32(__to_intrin(__lo128(__tmp)),
+				  __to_intrin(__hi128(__tmp)))};
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 64
+			     && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+	  return {_mm512_cvtepi64_epi32(__to_intrin(__tmp))};
+#endif // _GLIBCXX_SIMD_X86INTRIN
+	else if constexpr (__fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+	  return {__call_with_subscripts<_Np>(__vector_bitcast<_LLong>(__tmp),
+					      [](auto... __l) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
+						return __make_wrapper<int>(__l...);
+					      })};
+	else
+	  __assert_unreachable<_Tp>();
+      }
 
     // _S_increment & _S_decrement{{{2
     template <typename _Tp, size_t _Np>
@@ -2703,10 +2695,7 @@ struct _MaskImplBuiltin
     template <typename _Tp>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
       _S_broadcast(bool __x)
-      {
-	return __x ? _Abi::template _S_implicit_mask<_Tp>()
-		   : _MaskMember<_Tp>();
-      }
+      { return __x ? _Abi::template _S_implicit_mask<_Tp>() : _MaskMember<_Tp>(); }
 
     // }}}
     // _S_load {{{
@@ -2808,8 +2797,8 @@ _S_masked_load(_SimdWrapper<_Tp, _Np> __merge,
 
     // _S_store {{{2
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void _S_store(_SimdWrapper<_Tp, _Np> __v,
-						   bool* __mem) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static void
+      _S_store(_SimdWrapper<_Tp, _Np> __v, bool* __mem) noexcept
       {
 	__execute_n_times<_Np>([&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 	  __mem[__i] = __v[__i];
@@ -2832,21 +2821,17 @@ _S_masked_store(const _SimdWrapper<_Tp, _Np> __v, bool* __mem,
     template <size_t _Np, typename _Tp>
       _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
       _S_from_bitmask(_SanitizedBitMask<_Np> __bits, _TypeTag<_Tp>)
-      {
-	return _SuperImpl::template _S_to_maskvector<_Tp, _S_size<_Tp>>(__bits);
-      }
+      { return _SuperImpl::template _S_to_maskvector<_Tp, _S_size<_Tp>>(__bits); }
 
     // logical and bitwise operators {{{2
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_logical_and(const _SimdWrapper<_Tp, _Np>& __x,
-		     const _SimdWrapper<_Tp, _Np>& __y)
+      _S_logical_and(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       { return __and(__x._M_data, __y._M_data); }
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_logical_or(const _SimdWrapper<_Tp, _Np>& __x,
-		    const _SimdWrapper<_Tp, _Np>& __y)
+      _S_logical_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       { return __or(__x._M_data, __y._M_data); }
 
     template <typename _Tp, size_t _Np>
@@ -2862,26 +2847,23 @@ _S_bit_not(const _SimdWrapper<_Tp, _Np>& __x)
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_bit_and(const _SimdWrapper<_Tp, _Np>& __x,
-		 const _SimdWrapper<_Tp, _Np>& __y)
+      _S_bit_and(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       { return __and(__x._M_data, __y._M_data); }
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_bit_or(const _SimdWrapper<_Tp, _Np>& __x,
-		const _SimdWrapper<_Tp, _Np>& __y)
+      _S_bit_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       { return __or(__x._M_data, __y._M_data); }
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
-		 const _SimdWrapper<_Tp, _Np>& __y)
+      _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       { return __xor(__x._M_data, __y._M_data); }
 
     // smart_reference access {{{2
     template <typename _Tp, size_t _Np>
-      static constexpr void _S_set(_SimdWrapper<_Tp, _Np>& __k, int __i,
-				   bool __x) noexcept
+      static constexpr void
+      _S_set(_SimdWrapper<_Tp, _Np>& __k, int __i, bool __x) noexcept
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  __k._M_set(__i, __x);
@@ -2907,15 +2889,13 @@ _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
     // _S_masked_assign{{{2
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static void
-      _S_masked_assign(_SimdWrapper<_Tp, _Np> __k,
-		       _SimdWrapper<_Tp, _Np>& __lhs,
+      _S_masked_assign(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
 		       __type_identity_t<_SimdWrapper<_Tp, _Np>> __rhs)
       { __lhs = _CommonImpl::_S_blend(__k, __lhs, __rhs); }
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static void
-      _S_masked_assign(_SimdWrapper<_Tp, _Np> __k,
-		       _SimdWrapper<_Tp, _Np>& __lhs, bool __rhs)
+      _S_masked_assign(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs, bool __rhs)
       {
 	if (__builtin_constant_p(__rhs))
 	  {
@@ -2995,20 +2975,14 @@ _S_popcount(simd_mask<_Tp, _Abi> __k)
     template <typename _Tp>
       _GLIBCXX_SIMD_INTRINSIC static int
       _S_find_first_set(simd_mask<_Tp, _Abi> __k)
-      {
-	return std::__countr_zero(
-	  _SuperImpl::_S_to_bits(__data(__k))._M_to_bits());
-      }
+      { return std::__countr_zero(_SuperImpl::_S_to_bits(__data(__k))._M_to_bits()); }
 
     // }}}
     // _S_find_last_set {{{
     template <typename _Tp>
       _GLIBCXX_SIMD_INTRINSIC static int
       _S_find_last_set(simd_mask<_Tp, _Abi> __k)
-      {
-	return std::__bit_width(
-	  _SuperImpl::_S_to_bits(__data(__k))._M_to_bits()) - 1;
-      }
+      { return std::__bit_width(_SuperImpl::_S_to_bits(__data(__k))._M_to_bits()) - 1; }
 
     // }}}
   };
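
Note on the _S_find_first_set/_S_find_last_set one-liners above: both first
turn the mask into a bit pattern (bit i set when lane i is true) and then
count bits. The first true lane is the number of trailing zero bits, and the
last true lane is the bit width minus one. A minimal standalone sketch of
that idea follows. The helper names are hypothetical and are not part of
libstdc++, and the loops only stand in for std::__countr_zero and
std::__bit_width to keep the example free of library internals.

```cpp
#include <cstdint>

// Hypothetical stand-in for std::__countr_zero: number of zero bits below
// the lowest set bit (64 if no bit is set).
constexpr int
count_trailing_zeros(std::uint64_t bits)
{
  int n = 0;
  while (n < 64 && ((bits >> n) & 1) == 0)
    ++n;
  return n;
}

// Hypothetical stand-in for std::__bit_width: number of bits needed to
// represent the value (0 for an all-zero pattern).
constexpr int
bit_width(std::uint64_t bits)
{
  int n = 0;
  while (bits != 0)
    {
      ++n;
      bits >>= 1;
    }
  return n;
}

// Index of the first true lane of a mask given as a bit pattern.
constexpr int
find_first_set(std::uint64_t bits)
{ return count_trailing_zeros(bits); }

// Index of the last true lane of a mask given as a bit pattern.
constexpr int
find_last_set(std::uint64_t bits)
{ return bit_width(bits) - 1; }

// Lanes 3, 4 and 6 are true.
static_assert(find_first_set(0b0101'1000u) == 3, "first true lane");
static_assert(find_last_set(0b0101'1000u) == 6, "last true lane");
```

The sketch assumes at least one lane is true. For an all-false mask the
helpers return 64 and -1 respectively, which are not valid lane indices.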
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
index 88a9b27e359..123e714b528 100644
--- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -55,10 +55,7 @@ struct __simd_tuple_element<0, _SimdTuple<_Tp, _A0, _As...>>
 
 template <size_t _I, typename _Tp, typename _A0, typename... _As>
   struct __simd_tuple_element<_I, _SimdTuple<_Tp, _A0, _As...>>
-  {
-    using type =
-      typename __simd_tuple_element<_I - 1, _SimdTuple<_Tp, _As...>>::type;
-  };
+  { using type = typename __simd_tuple_element<_I - 1, _SimdTuple<_Tp, _As...>>::type; };
 
 template <size_t _I, typename _Tp>
   using __simd_tuple_element_t = typename __simd_tuple_element<_I, _Tp>::type;
@@ -80,10 +77,8 @@ __simd_tuple_concat(const _SimdTuple<_Tp, _A0s...>& __left,
   }
 
 template <typename _Tp, typename _A10, typename... _A1s>
-  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, simd_abi::scalar, _A10,
-					       _A1s...>
-  __simd_tuple_concat(const _Tp& __left,
-		      const _SimdTuple<_Tp, _A10, _A1s...>& __right)
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, simd_abi::scalar, _A10, _A1s...>
+  __simd_tuple_concat(const _Tp& __left, const _SimdTuple<_Tp, _A10, _A1s...>& __right)
   { return {__left, __right}; }
 
 // }}}
@@ -112,37 +107,29 @@ struct __as_simd_tuple
 
 template <typename _Tp, typename _A0, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr simd<_Tp, _A0>
-  __simd_tuple_get_impl(__as_simd, const _SimdTuple<_Tp, _A0, _Abis...>& __t,
-			_SizeConstant<0>)
+  __simd_tuple_get_impl(__as_simd, const _SimdTuple<_Tp, _A0, _Abis...>& __t, _SizeConstant<0>)
   { return {__private_init, __t.first}; }
 
 template <typename _Tp, typename _A0, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr const auto&
-  __simd_tuple_get_impl(__as_simd_tuple,
-			const _SimdTuple<_Tp, _A0, _Abis...>& __t,
+  __simd_tuple_get_impl(__as_simd_tuple, const _SimdTuple<_Tp, _A0, _Abis...>& __t,
 			_SizeConstant<0>)
   { return __t.first; }
 
 template <typename _Tp, typename _A0, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr auto&
-  __simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _A0, _Abis...>& __t,
-			_SizeConstant<0>)
+  __simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _A0, _Abis...>& __t, _SizeConstant<0>)
   { return __t.first; }
 
 template <typename _R, size_t _Np, typename _Tp, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr auto
-  __simd_tuple_get_impl(_R, const _SimdTuple<_Tp, _Abis...>& __t,
-			_SizeConstant<_Np>)
+  __simd_tuple_get_impl(_R, const _SimdTuple<_Tp, _Abis...>& __t, _SizeConstant<_Np>)
   { return __simd_tuple_get_impl(_R(), __t.second, _SizeConstant<_Np - 1>()); }
 
 template <size_t _Np, typename _Tp, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr auto&
-  __simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _Abis...>& __t,
-			_SizeConstant<_Np>)
-  {
-    return __simd_tuple_get_impl(__as_simd_tuple(), __t.second,
-				 _SizeConstant<_Np - 1>());
-  }
+  __simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _Abis...>& __t, _SizeConstant<_Np>)
+  { return __simd_tuple_get_impl(__as_simd_tuple(), __t.second, _SizeConstant<_Np - 1>()); }
 
 template <size_t _Np, typename _Tp, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr auto
@@ -154,16 +141,12 @@ __get_simd_at(const _SimdTuple<_Tp, _Abis...>& __t)
 template <size_t _Np, typename _Tp, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr auto
   __get_tuple_at(const _SimdTuple<_Tp, _Abis...>& __t)
-  {
-    return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>());
-  }
+  { return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>()); }
 
 template <size_t _Np, typename _Tp, typename... _Abis>
   _GLIBCXX_SIMD_INTRINSIC constexpr auto&
   __get_tuple_at(_SimdTuple<_Tp, _Abis...>& __t)
-  {
-    return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>());
-  }
+  { return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>()); }
 
 // __tuple_element_meta {{{1
 template <typename _Tp, typename _Abi, size_t _Offset>
@@ -213,17 +196,13 @@ struct _WithOffset
   {
     static inline constexpr size_t _S_offset = _Offset;
 
-    _GLIBCXX_SIMD_INTRINSIC char* _M_as_charptr()
-    {
-      return reinterpret_cast<char*>(this)
-	     + _S_offset * sizeof(typename _Base::value_type);
-    }
+    _GLIBCXX_SIMD_INTRINSIC char*
+    _M_as_charptr()
+    { return reinterpret_cast<char*>(this) + _S_offset * sizeof(typename _Base::value_type); }
 
-    _GLIBCXX_SIMD_INTRINSIC const char* _M_as_charptr() const
-    {
-      return reinterpret_cast<const char*>(this)
-	     + _S_offset * sizeof(typename _Base::value_type);
-    }
+    _GLIBCXX_SIMD_INTRINSIC const char*
+    _M_as_charptr() const
+    { return reinterpret_cast<const char*>(this) + _S_offset * sizeof(typename _Base::value_type); }
   };
 
 // make _WithOffset<_WithOffset> ill-formed to use:
@@ -240,19 +219,13 @@ __add_offset(_Tp& __base)
   _GLIBCXX_SIMD_INTRINSIC
   decltype(auto)
   __add_offset(const _Tp& __base)
-  {
-    return static_cast<const _WithOffset<_Offset, __remove_cvref_t<_Tp>>&>(
-      __base);
-  }
+  { return static_cast<const _WithOffset<_Offset, __remove_cvref_t<_Tp>>&>(__base); }
 
 template <size_t _Offset, size_t _ExistingOffset, typename _Tp>
   _GLIBCXX_SIMD_INTRINSIC
   decltype(auto)
   __add_offset(_WithOffset<_ExistingOffset, _Tp>& __base)
-  {
-    return static_cast<_WithOffset<_Offset + _ExistingOffset, _Tp>&>(
-      static_cast<_Tp&>(__base));
-  }
+  { return static_cast<_WithOffset<_Offset + _ExistingOffset, _Tp>&>(static_cast<_Tp&>(__base)); }
 
 template <size_t _Offset, size_t _ExistingOffset, typename _Tp>
   _GLIBCXX_SIMD_INTRINSIC
@@ -298,7 +271,8 @@ struct _SimdTupleData
     _SecondType second;
 
     _GLIBCXX_SIMD_INTRINSIC
-    constexpr bool _M_is_constprop() const
+    constexpr bool
+    _M_is_constprop() const
     {
       if constexpr (is_class_v<_FirstType>)
 	return first._M_is_constprop() && second._M_is_constprop();
@@ -314,7 +288,8 @@ struct _SimdTupleData<_FirstType, _SimdTuple<_Tp>>
     static constexpr _SimdTuple<_Tp> second = {};
 
     _GLIBCXX_SIMD_INTRINSIC
-    constexpr bool _M_is_constprop() const
+    constexpr bool
+    _M_is_constprop() const
     {
       if constexpr (is_class_v<_FirstType>)
 	return first._M_is_constprop();
@@ -353,25 +328,31 @@ struct _SimdTuple<_Tp, _Abi0, _Abis...>
       = default;
 
     template <typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x)
+      _GLIBCXX_SIMD_INTRINSIC constexpr
+      _SimdTuple(_Up&& __x)
       : _Base{static_cast<_Up&&>(__x)} {}
 
     template <typename _Up, typename _Up2>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x, _Up2&& __y)
+      _GLIBCXX_SIMD_INTRINSIC constexpr
+      _SimdTuple(_Up&& __x, _Up2&& __y)
       : _Base{static_cast<_Up&&>(__x), static_cast<_Up2&&>(__y)} {}
 
     template <typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x, _SimdTuple<_Tp>)
+      _GLIBCXX_SIMD_INTRINSIC constexpr
+      _SimdTuple(_Up&& __x, _SimdTuple<_Tp>)
       : _Base{static_cast<_Up&&>(__x)} {}
 
-    _GLIBCXX_SIMD_INTRINSIC char* _M_as_charptr()
+    _GLIBCXX_SIMD_INTRINSIC char*
+    _M_as_charptr()
     { return reinterpret_cast<char*>(this); }
 
-    _GLIBCXX_SIMD_INTRINSIC const char* _M_as_charptr() const
+    _GLIBCXX_SIMD_INTRINSIC const char*
+    _M_as_charptr() const
     { return reinterpret_cast<const char*>(this); }
 
     template <size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC constexpr auto& _M_at()
+      _GLIBCXX_SIMD_INTRINSIC constexpr auto&
+      _M_at()
       {
 	if constexpr (_Np == 0)
 	  return first;
@@ -380,7 +361,8 @@ struct _SimdTuple<_Tp, _Abi0, _Abis...>
       }
 
     template <size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC constexpr const auto& _M_at() const
+      _GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+      _M_at() const
       {
 	if constexpr (_Np == 0)
 	  return first;
@@ -389,7 +371,8 @@ struct _SimdTuple<_Tp, _Abi0, _Abis...>
       }
 
     template <size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC constexpr auto _M_simd_at() const
+      _GLIBCXX_SIMD_INTRINSIC constexpr auto
+      _M_simd_at() const
       {
 	if constexpr (_Np == 0)
 	  return simd<_Tp, _Abi0>(__private_init, first);
@@ -552,8 +535,8 @@ _M_assign_front(const _SimdTuple<_Tp, _As...>& __x) &
       }
 
     template <typename _R = _Tp, typename _Fp, typename... _More>
-      _GLIBCXX_SIMD_INTRINSIC auto _M_apply_r(_Fp&& __fun,
-					      const _More&... __more) const
+      _GLIBCXX_SIMD_INTRINSIC auto
+      _M_apply_r(_Fp&& __fun, const _More&... __more) const
       {
 	auto&& __first = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first,
 			       __more.first...);
@@ -590,8 +573,8 @@ _M_assign_front(const _SimdTuple<_Tp, _As...>& __x) &
 	  return second[integral_constant<_Up, _I - simd_size_v<_Tp, _Abi0>>()];
       }
 
-    _GLIBCXX_SIMD_INTRINSIC
-    _Tp operator[](size_t __i) const noexcept
+    _GLIBCXX_SIMD_INTRINSIC _Tp
+    operator[](size_t __i) const noexcept
     {
       if constexpr (_S_tuple_size == 1)
 	return _M_subscript_read(__i);
@@ -613,8 +596,8 @@ _M_assign_front(const _SimdTuple<_Tp, _As...>& __x) &
 	}
     }
 
-    _GLIBCXX_SIMD_INTRINSIC
-    void _M_set(size_t __i, _Tp __val) noexcept
+    _GLIBCXX_SIMD_INTRINSIC void
+    _M_set(size_t __i, _Tp __val) noexcept
     {
       if constexpr (_S_tuple_size == 1)
 	return _M_subscript_write(__i, __val);
@@ -633,8 +616,8 @@ _M_assign_front(const _SimdTuple<_Tp, _As...>& __x) &
 
   private:
     // _M_subscript_read/_write {{{
-    _GLIBCXX_SIMD_INTRINSIC
-    _Tp _M_subscript_read([[maybe_unused]] size_t __i) const noexcept
+    _GLIBCXX_SIMD_INTRINSIC _Tp
+    _M_subscript_read([[maybe_unused]] size_t __i) const noexcept
     {
       if constexpr (__is_vectorizable_v<_FirstType>)
 	return first;
@@ -642,8 +625,8 @@ _M_assign_front(const _SimdTuple<_Tp, _As...>& __x) &
 	return first[__i];
     }
 
-    _GLIBCXX_SIMD_INTRINSIC
-    void _M_subscript_write([[maybe_unused]] size_t __i, _Tp __y) noexcept
+    _GLIBCXX_SIMD_INTRINSIC void
+    _M_subscript_write([[maybe_unused]] size_t __i, _Tp __y) noexcept
     {
       if constexpr (__is_vectorizable_v<_FirstType>)
 	first = __y;
@@ -687,8 +670,7 @@ __make_simd_tuple(
 	  size_t _Offset = 0, // skip this many elements in __from0
 	  typename _R = __fixed_size_storage_t<_Tp, _Np>, typename _V0,
 	  typename _V0VT = _VectorTraits<_V0>, typename... _VX>
-  _GLIBCXX_SIMD_INTRINSIC _R constexpr __to_simd_tuple(const _V0 __from0,
-						       const _VX... __fromX)
+  _GLIBCXX_SIMD_INTRINSIC _R constexpr __to_simd_tuple(const _V0 __from0, const _VX... __fromX)
   {
     static_assert(is_same_v<typename _V0VT::value_type, _Tp>);
     static_assert(_Offset < _V0VT::_S_full_size);
@@ -900,11 +882,8 @@ __for_each(_SimdTuple<_Tp, _A0, _A1, _As...>& __t, _Fp&& __fun)
 // __for_each(_SimdTuple &, const _SimdTuple &, Fun) {{{1
 template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
   _GLIBCXX_SIMD_INTRINSIC constexpr void
-  __for_each(_SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b,
-	     _Fp&& __fun)
-  {
-    static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first);
-  }
+  __for_each(_SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b, _Fp&& __fun)
+  { static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first); }
 
 template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
 	  typename... _As, typename _Fp>
@@ -920,11 +899,8 @@ __for_each(_SimdTuple<_Tp, _A0, _A1, _As...>& __a,
 // __for_each(const _SimdTuple &, const _SimdTuple &, Fun) {{{1
 template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
   _GLIBCXX_SIMD_INTRINSIC constexpr void
-  __for_each(const _SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b,
-	     _Fp&& __fun)
-  {
-    static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first);
-  }
+  __for_each(const _SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b, _Fp&& __fun)
+  { static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first); }
 
 template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
 	  typename... _As, typename _Fp>
@@ -939,8 +915,7 @@ __for_each(const _SimdTuple<_Tp, _A0, _A1, _As...>& __a,
 
 // }}}1
 // __extract_part(_SimdTuple) {{{
-template <int _Index, int _Total, int _Combine, typename _Tp, typename _A0,
-	  typename... _As>
+template <int _Index, int _Total, int _Combine, typename _Tp, typename _A0, typename... _As>
   _GLIBCXX_SIMD_INTRINSIC auto // __vector_type_t or _SimdTuple
   __extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x)
   {
@@ -1062,8 +1037,8 @@ struct __autocvt_to_simd
       return &_M_data;
     }
 
-    _GLIBCXX_SIMD_INTRINSIC
-    constexpr __autocvt_to_simd(_Tp dd) : _M_data(dd) {}
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    __autocvt_to_simd(_Tp dd) : _M_data(dd) {}
 
     template <typename _Abi>
       _GLIBCXX_SIMD_INTRINSIC
@@ -1073,18 +1048,12 @@ struct __autocvt_to_simd
     template <typename _Abi>
       _GLIBCXX_SIMD_INTRINSIC
       operator simd<typename _TT::value_type, _Abi>&()
-      {
-	return *reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(
-	  &_M_data);
-      }
+      { return *reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data); }
 
     template <typename _Abi>
       _GLIBCXX_SIMD_INTRINSIC
       operator simd<typename _TT::value_type, _Abi>*()
-      {
-	return reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(
-	  &_M_data);
-      }
+      { return reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data); }
   };
 
 template <typename _Tp>
@@ -1197,12 +1166,12 @@ struct _SimdBase
 	  _SimdBase(const _SimdBase&) {}
 	  _SimdBase() = default;
 
-	  _GLIBCXX_SIMD_ALWAYS_INLINE
-	  explicit operator const _SimdMember &() const
+	  _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+	  operator const _SimdMember &() const
 	  { return static_cast<const simd<_Tp, _Fixed>*>(this)->_M_data; }
 
-	  _GLIBCXX_SIMD_ALWAYS_INLINE
-	  explicit operator array<_Tp, _Np>() const
+	  _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+	  operator array<_Tp, _Np>() const
 	  {
 	    array<_Tp, _Np> __r;
 	    // _SimdMember can be larger because of higher alignment
@@ -1224,10 +1193,12 @@ struct _SimdCastType
 	{
 	  _GLIBCXX_SIMD_ALWAYS_INLINE
 	  _SimdCastType(const array<_Tp, _Np>&);
+
 	  _GLIBCXX_SIMD_ALWAYS_INLINE
 	  _SimdCastType(const _SimdMember& dd) : _M_data(dd) {}
-	  _GLIBCXX_SIMD_ALWAYS_INLINE
-	  explicit operator const _SimdMember &() const { return _M_data; }
+
+	  _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+	  operator const _SimdMember &() const { return _M_data; }
 
 	private:
 	  const _SimdMember& _M_data;
@@ -1466,8 +1437,7 @@ __for_each(
     // _S_min, _S_max {{{2
     template <typename _Tp, typename... _As>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
-      _S_min(const _SimdTuple<_Tp, _As...>& __a,
-	     const _SimdTuple<_Tp, _As...>& __b)
+      _S_min(const _SimdTuple<_Tp, _As...>& __a, const _SimdTuple<_Tp, _As...>& __b)
       {
 	return __a._M_apply_per_chunk(
 	  [](auto __impl, auto __aa, auto __bb) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1478,8 +1448,7 @@ _S_min(const _SimdTuple<_Tp, _As...>& __a,
 
     template <typename _Tp, typename... _As>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple<_Tp, _As...>
-      _S_max(const _SimdTuple<_Tp, _As...>& __a,
-	     const _SimdTuple<_Tp, _As...>& __b)
+      _S_max(const _SimdTuple<_Tp, _As...>& __a, const _SimdTuple<_Tp, _As...>& __b)
       {
 	return __a._M_apply_per_chunk(
 	  [](auto __impl, auto __aa, auto __bb) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1692,7 +1661,7 @@ _S_frexp(const _SimdTuple<_Tp, _As...>& __x,
 #define _GLIBCXX_SIMD_TEST_ON_TUPLE_(name_)                                              \
     template <typename _Tp, typename... _As>                                             \
       static inline _MaskMember                                                          \
-	_S_##name_(const _SimdTuple<_Tp, _As...>& __x) noexcept                          \
+      _S_##name_(const _SimdTuple<_Tp, _As...>& __x) noexcept                            \
       {                                                                                  \
 	return _M_test([] (auto __impl, auto __xx) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA  { \
 		 return __impl._S_##name_(__xx);                                         \
@@ -1754,8 +1723,8 @@ __for_each(
 
     // smart_reference access {{{2
     template <typename _Tp, typename... _As, typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC static void _S_set(_SimdTuple<_Tp, _As...>& __v,
-						 int __i, _Up&& __x) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static void
+      _S_set(_SimdTuple<_Tp, _As...>& __v, int __i, _Up&& __x) noexcept
       { __v._M_set(__i, static_cast<_Up&&>(__x)); }
 
     // _S_masked_assign {{{2
@@ -1789,10 +1758,9 @@ __for_each(
 
     // _S_masked_cassign {{{2
     template <typename _Op, typename _Tp, typename... _As>
-      static inline void _S_masked_cassign(const _MaskMember __bits,
-					   _SimdTuple<_Tp, _As...>& __lhs,
-					   const _SimdTuple<_Tp, _As...>& __rhs,
-					   _Op __op)
+      static inline void
+      _S_masked_cassign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+			const _SimdTuple<_Tp, _As...>& __rhs, _Op __op)
       {
 	__for_each(__lhs, __rhs,
 		   [&](auto __meta, auto& __native_lhs, auto __native_rhs)
@@ -1806,9 +1774,9 @@ __for_each(
     // Optimization for the case where the RHS is a scalar. No need to broadcast
     // the scalar to a simd first.
     template <typename _Op, typename _Tp, typename... _As>
-      static inline void _S_masked_cassign(const _MaskMember __bits,
-					   _SimdTuple<_Tp, _As...>& __lhs,
-					   const _Tp& __rhs, _Op __op)
+      static inline void
+      _S_masked_cassign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+			const _Tp& __rhs, _Op __op)
       {
 	__for_each(
 	  __lhs, [&](auto __meta, auto& __native_lhs) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1906,7 +1874,8 @@ _S_convert(simd_mask<_Up, _UAbi> __x)
       { return __bits; }
 
     // _S_load {{{2
-    static inline _MaskMember _S_load(const bool* __mem) noexcept
+    static inline _MaskMember
+    _S_load(const bool* __mem) noexcept
     {
       // TODO: _UChar is not necessarily the best type to use here. For smaller
       // _Np _UShort, _UInt, _ULLong, float, and double can be more efficient.
@@ -1921,9 +1890,8 @@ _S_convert(simd_mask<_Up, _UAbi> __x)
     }
 
     // _S_masked_load {{{2
-    static inline _MaskMember _S_masked_load(_MaskMember __merge,
-					     _MaskMember __mask,
-					     const bool* __mem) noexcept
+    static inline _MaskMember
+    _S_masked_load(_MaskMember __merge, _MaskMember __mask, const bool* __mem) noexcept
     {
       _BitOps::_S_bit_iteration(__mask.to_ullong(),
 				[&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1933,8 +1901,8 @@ _S_convert(simd_mask<_Up, _UAbi> __x)
     }
 
     // _S_store {{{2
-    static inline void _S_store(const _MaskMember __bitmask,
-				bool* __mem) noexcept
+    static inline void
+    _S_store(const _MaskMember __bitmask, bool* __mem) noexcept
     {
       if constexpr (_Np == 1)
 	__mem[0] = __bitmask[0];
@@ -1943,8 +1911,8 @@ _S_convert(simd_mask<_Up, _UAbi> __x)
     }
 
     // _S_masked_store {{{2
-    static inline void _S_masked_store(const _MaskMember __v, bool* __mem,
-				       const _MaskMember __k) noexcept
+    static inline void
+    _S_masked_store(const _MaskMember __v, bool* __mem, const _MaskMember __k) noexcept
     {
       _BitOps::_S_bit_iteration(
 	__k, [&](auto __i) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { __mem[__i] = __v[__i]; });
@@ -1976,20 +1944,18 @@ _S_bit_xor(const _MaskMember& __x, const _MaskMember& __y) noexcept
     { return __x ^ __y; }
 
     // smart_reference access {{{2
-    _GLIBCXX_SIMD_INTRINSIC static void _S_set(_MaskMember& __k, int __i,
-					       bool __x) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_set(_MaskMember& __k, int __i, bool __x) noexcept
     { __k.set(__i, __x); }
 
     // _S_masked_assign {{{2
     _GLIBCXX_SIMD_INTRINSIC static void
-    _S_masked_assign(const _MaskMember __k, _MaskMember& __lhs,
-		     const _MaskMember __rhs)
+    _S_masked_assign(const _MaskMember __k, _MaskMember& __lhs, const _MaskMember __rhs)
     { __lhs = (__lhs & ~__k) | (__rhs & __k); }
 
     // Optimization for the case where the RHS is a scalar.
-    _GLIBCXX_SIMD_INTRINSIC static void _S_masked_assign(const _MaskMember __k,
-							 _MaskMember& __lhs,
-							 const bool __rhs)
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_masked_assign(const _MaskMember __k, _MaskMember& __lhs, const bool __rhs)
     {
       if (__rhs)
 	__lhs |= __k;
@@ -2000,19 +1966,22 @@ _S_masked_assign(const _MaskMember __k, _MaskMember& __lhs,
     // }}}2
     // _S_all_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_all_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_all_of(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).all(); }
 
     // }}}
     // _S_any_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_any_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_any_of(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).any(); }
 
     // }}}
     // _S_none_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_none_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_none_of(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).none(); }
 
     // }}}
@@ -2030,7 +1999,8 @@ _S_masked_assign(const _MaskMember __k, _MaskMember& __lhs,
     // }}}
     // _S_popcount {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static int _S_popcount(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static int
+      _S_popcount(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).count(); }
 
     // }}}
diff --git a/libstdc++-v3/include/experimental/bits/simd_neon.h b/libstdc++-v3/include/experimental/bits/simd_neon.h
index 7e4cb17b205..637b121b130 100644
--- a/libstdc++-v3/include/experimental/bits/simd_neon.h
+++ b/libstdc++-v3/include/experimental/bits/simd_neon.h
@@ -134,7 +134,8 @@ _S_reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
     // math {{{
     // _S_sqrt {{{
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      _GLIBCXX_SIMD_INTRINSIC static _Tp _S_sqrt(_Tp __x)
+      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _S_sqrt(_Tp __x)
       {
 	if constexpr (__have_neon_a64)
 	  {
@@ -157,7 +158,8 @@ _S_reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
     // }}}
     // _S_trunc {{{
     template <typename _TW, typename _TVT = _VectorTraits<_TW>>
-      _GLIBCXX_SIMD_INTRINSIC static _TW _S_trunc(_TW __x)
+      _GLIBCXX_SIMD_INTRINSIC static _TW
+      _S_trunc(_TW __x)
       {
 	using _Tp = typename _TVT::value_type;
 	if constexpr (__have_neon_a32)
@@ -216,7 +218,8 @@ _S_round(_SimdWrapper<_Tp, _Np> __x)
     // }}}
     // _S_floor {{{
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      _GLIBCXX_SIMD_INTRINSIC static _Tp _S_floor(_Tp __x)
+      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _S_floor(_Tp __x)
       {
 	if constexpr (__have_neon_a32)
 	  {
@@ -239,7 +242,8 @@ _S_round(_SimdWrapper<_Tp, _Np> __x)
     // }}}
     // _S_ceil {{{
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      _GLIBCXX_SIMD_INTRINSIC static _Tp _S_ceil(_Tp __x)
+      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _S_ceil(_Tp __x)
       {
 	if constexpr (__have_neon_a32)
 	  {
@@ -400,7 +404,8 @@ struct _MaskImplNeon
 
     // _S_all_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_all_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_all_of(simd_mask<_Tp, _Abi> __k)
       {
 	const auto __kk
 	  = __vector_bitcast<char>(__k._M_data)
@@ -419,7 +424,8 @@ struct _MaskImplNeon
     // }}}
     // _S_any_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_any_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_any_of(simd_mask<_Tp, _Abi> __k)
       {
 	const auto __kk
 	  = __vector_bitcast<char>(__k._M_data)
@@ -438,7 +444,8 @@ struct _MaskImplNeon
     // }}}
     // _S_none_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_none_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_none_of(simd_mask<_Tp, _Abi> __k)
       {
 	const auto __kk = _Abi::_S_masked(__k._M_data);
 	if constexpr (sizeof(__k) == 16)
@@ -472,7 +479,8 @@ struct _MaskImplNeon
     // }}}
     // _S_popcount {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static int _S_popcount(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static int
+      _S_popcount(simd_mask<_Tp, _Abi> __k)
       {
 	if constexpr (sizeof(_Tp) == 1)
 	  {
diff --git a/libstdc++-v3/include/experimental/bits/simd_ppc.h b/libstdc++-v3/include/experimental/bits/simd_ppc.h
index 0d56912894a..eca1b34241b 100644
--- a/libstdc++-v3/include/experimental/bits/simd_ppc.h
+++ b/libstdc++-v3/include/experimental/bits/simd_ppc.h
@@ -124,7 +124,8 @@ struct _MaskImplPpc
 
     // _S_popcount {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static int _S_popcount(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static int
+      _S_popcount(simd_mask<_Tp, _Abi> __k)
       {
 	const auto __kv = __as_vector(__k);
 	if constexpr (__have_power10vec)
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index f9ae70430db..1a1cc46fbe0 100644
--- a/libstdc++-v3/include/experimental/bits/simd_scalar.h
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -74,7 +74,8 @@ struct _IsValid
   template <typename _Tp>
     static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
 
-  _GLIBCXX_SIMD_INTRINSIC static constexpr bool _S_masked(bool __x)
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  _S_masked(bool __x)
   { return __x; }
 
   using _CommonImpl = _CommonImplScalar;
@@ -110,7 +111,8 @@ struct _CommonImplScalar
 {
   // _S_store {{{
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static void _S_store(_Tp __x, void* __addr)
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_store(_Tp __x, void* __addr)
     { __builtin_memcpy(__addr, &__x, sizeof(_Tp)); }
 
   // }}}
@@ -138,26 +140,26 @@ struct _SimdImplScalar
 
   // _S_broadcast {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp _S_broadcast(_Tp __x) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_broadcast(_Tp __x) noexcept
     { return __x; }
 
   // _S_generator {{{2
   template <typename _Fp, typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp _S_generator(_Fp&& __gen,
-							      _TypeTag<_Tp>)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_generator(_Fp&& __gen, _TypeTag<_Tp>)
     { return __gen(_SizeConstant<0>()); }
 
   // _S_load {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_load(const _Up* __mem,
-					       _TypeTag<_Tp>) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_load(const _Up* __mem, _TypeTag<_Tp>) noexcept
     { return static_cast<_Tp>(__mem[0]); }
 
   // _S_masked_load {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC
-    static _Tp _S_masked_load(_Tp __merge, bool __k,
-				     const _Up* __mem) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_masked_load(_Tp __merge, bool __k, const _Up* __mem) noexcept
     {
       if (__k)
 	__merge = static_cast<_Tp>(__mem[0]);
@@ -166,97 +168,95 @@ struct _SimdImplScalar
 
   // _S_store {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC
-    static void _S_store(_Tp __v, _Up* __mem, _TypeTag<_Tp>) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_store(_Tp __v, _Up* __mem, _TypeTag<_Tp>) noexcept
     { __mem[0] = static_cast<_Up>(__v); }
 
   // _S_masked_store {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC
-    static void _S_masked_store(const _Tp __v, _Up* __mem,
-				       const bool __k) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_masked_store(const _Tp __v, _Up* __mem, const bool __k) noexcept
     { if (__k) __mem[0] = __v; }
 
   // _S_negate {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr bool _S_negate(_Tp __x) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+    _S_negate(_Tp __x) noexcept
     { return !__x; }
 
   // _S_reduce {{{2
   template <typename _Tp, typename _BinaryOperation>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
     _S_reduce(const simd<_Tp, simd_abi::scalar>& __x, const _BinaryOperation&)
     { return __x._M_data; }
 
   // _S_min, _S_max {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_min(const _Tp __a, const _Tp __b)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_min(const _Tp __a, const _Tp __b)
     { return std::min(__a, __b); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_max(const _Tp __a, const _Tp __b)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_max(const _Tp __a, const _Tp __b)
     { return std::max(__a, __b); }
 
   // _S_complement {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_complement(_Tp __x) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_complement(_Tp __x) noexcept
     { return static_cast<_Tp>(~__x); }
 
   // _S_unary_minus {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_unary_minus(_Tp __x) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_unary_minus(_Tp __x) noexcept
     { return static_cast<_Tp>(-__x); }
 
   // arithmetic operators {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_plus(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_plus(_Tp __x, _Tp __y)
     {
       return static_cast<_Tp>(__promote_preserving_unsigned(__x)
 			      + __promote_preserving_unsigned(__y));
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_minus(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_minus(_Tp __x, _Tp __y)
     {
       return static_cast<_Tp>(__promote_preserving_unsigned(__x)
 			      - __promote_preserving_unsigned(__y));
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_multiplies(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_multiplies(_Tp __x, _Tp __y)
     {
       return static_cast<_Tp>(__promote_preserving_unsigned(__x)
 			      * __promote_preserving_unsigned(__y));
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_divides(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_divides(_Tp __x, _Tp __y)
     {
       return static_cast<_Tp>(__promote_preserving_unsigned(__x)
 			      / __promote_preserving_unsigned(__y));
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_modulus(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_modulus(_Tp __x, _Tp __y)
     {
       return static_cast<_Tp>(__promote_preserving_unsigned(__x)
 			      % __promote_preserving_unsigned(__y));
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_bit_and(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_bit_and(_Tp __x, _Tp __y)
     {
       if constexpr (is_floating_point_v<_Tp>)
 	{
@@ -269,8 +269,8 @@ struct _SimdImplScalar
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_bit_or(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_bit_or(_Tp __x, _Tp __y)
     {
       if constexpr (is_floating_point_v<_Tp>)
 	{
@@ -283,8 +283,8 @@ struct _SimdImplScalar
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_bit_xor(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_bit_xor(_Tp __x, _Tp __y)
     {
       if constexpr (is_floating_point_v<_Tp>)
 	{
@@ -297,13 +297,13 @@ struct _SimdImplScalar
     }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_bit_shift_left(_Tp __x, int __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_bit_shift_left(_Tp __x, int __y)
     { return static_cast<_Tp>(__promote_preserving_unsigned(__x) << __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    static constexpr _Tp _S_bit_shift_right(_Tp __x, int __y)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+    _S_bit_shift_right(_Tp __x, int __y)
     { return static_cast<_Tp>(__promote_preserving_unsigned(__x) >> __y); }
 
   // math {{{2
@@ -312,300 +312,362 @@ struct _SimdImplScalar
     using _ST = _SimdTuple<_Tp, simd_abi::scalar>;
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_acos(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_acos(_Tp __x)
     { return std::acos(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_asin(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_asin(_Tp __x)
     { return std::asin(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_atan(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_atan(_Tp __x)
     { return std::atan(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_cos(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_cos(_Tp __x)
     { return std::cos(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_sin(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_sin(_Tp __x)
     { return std::sin(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_tan(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_tan(_Tp __x)
     { return std::tan(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_acosh(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_acosh(_Tp __x)
     { return std::acosh(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_asinh(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_asinh(_Tp __x)
     { return std::asinh(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_atanh(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_atanh(_Tp __x)
     { return std::atanh(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_cosh(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_cosh(_Tp __x)
     { return std::cosh(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_sinh(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_sinh(_Tp __x)
     { return std::sinh(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_tanh(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_tanh(_Tp __x)
     { return std::tanh(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_atan2(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_atan2(_Tp __x, _Tp __y)
     { return std::atan2(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_exp(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_exp(_Tp __x)
     { return std::exp(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_exp2(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_exp2(_Tp __x)
     { return std::exp2(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_expm1(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_expm1(_Tp __x)
     { return std::expm1(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_log(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_log(_Tp __x)
     { return std::log(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_log10(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_log10(_Tp __x)
     { return std::log10(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_log1p(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_log1p(_Tp __x)
     { return std::log1p(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_log2(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_log2(_Tp __x)
     { return std::log2(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_logb(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_logb(_Tp __x)
     { return std::logb(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _ST<int> _S_ilogb(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _ST<int>
+    _S_ilogb(_Tp __x)
     { return {std::ilogb(__x)}; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_pow(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_pow(_Tp __x, _Tp __y)
     { return std::pow(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_abs(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_abs(_Tp __x)
     { return std::abs(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_fabs(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_fabs(_Tp __x)
     { return std::fabs(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_sqrt(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_sqrt(_Tp __x)
     { return std::sqrt(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_cbrt(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_cbrt(_Tp __x)
     { return std::cbrt(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_erf(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_erf(_Tp __x)
     { return std::erf(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_erfc(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_erfc(_Tp __x)
     { return std::erfc(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_lgamma(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_lgamma(_Tp __x)
     { return std::lgamma(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_tgamma(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_tgamma(_Tp __x)
     { return std::tgamma(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_trunc(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_trunc(_Tp __x)
     { return std::trunc(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_floor(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_floor(_Tp __x)
     { return std::floor(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_ceil(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_ceil(_Tp __x)
     { return std::ceil(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_nearbyint(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_nearbyint(_Tp __x)
     { return std::nearbyint(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_rint(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_rint(_Tp __x)
     { return std::rint(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _ST<long> _S_lrint(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _ST<long>
+    _S_lrint(_Tp __x)
     { return {std::lrint(__x)}; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _ST<long long> _S_llrint(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _ST<long long>
+    _S_llrint(_Tp __x)
     { return {std::llrint(__x)}; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_round(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_round(_Tp __x)
     { return std::round(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _ST<long> _S_lround(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _ST<long>
+    _S_lround(_Tp __x)
     { return {std::lround(__x)}; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _ST<long long> _S_llround(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC static _ST<long long>
+    _S_llround(_Tp __x)
     { return {std::llround(__x)}; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_ldexp(_Tp __x, _ST<int> __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_ldexp(_Tp __x, _ST<int> __y)
     { return std::ldexp(__x, __y.first); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_scalbn(_Tp __x, _ST<int> __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_scalbn(_Tp __x, _ST<int> __y)
     { return std::scalbn(__x, __y.first); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_scalbln(_Tp __x, _ST<long> __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_scalbln(_Tp __x, _ST<long> __y)
     { return std::scalbln(__x, __y.first); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_fmod(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_fmod(_Tp __x, _Tp __y)
     { return std::fmod(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_remainder(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_remainder(_Tp __x, _Tp __y)
     { return std::remainder(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_nextafter(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_nextafter(_Tp __x, _Tp __y)
     { return std::nextafter(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_fdim(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_fdim(_Tp __x, _Tp __y)
     { return std::fdim(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_fmax(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_fmax(_Tp __x, _Tp __y)
     { return std::fmax(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_fmin(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_fmin(_Tp __x, _Tp __y)
     { return std::fmin(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_fma(_Tp __x, _Tp __y, _Tp __z)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_fma(_Tp __x, _Tp __y, _Tp __z)
     { return std::fma(__x, __y, __z); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_remquo(_Tp __x, _Tp __y, _ST<int>* __z)
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_remquo(_Tp __x, _Tp __y, _ST<int>* __z)
     { return std::remquo(__x, __y, &__z->first); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static _ST<int> _S_fpclassify(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static _ST<int>
+    _S_fpclassify(_Tp __x)
     { return {std::fpclassify(__x)}; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isfinite(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isfinite(_Tp __x)
     { return std::isfinite(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isinf(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isinf(_Tp __x)
     { return std::isinf(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isnan(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isnan(_Tp __x)
     { return std::isnan(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isnormal(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isnormal(_Tp __x)
     { return std::isnormal(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_signbit(_Tp __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_signbit(_Tp __x)
     { return std::signbit(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isgreater(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isgreater(_Tp __x, _Tp __y)
     { return std::isgreater(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isgreaterequal(_Tp __x,
-								    _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isgreaterequal(_Tp __x, _Tp __y)
     { return std::isgreaterequal(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isless(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isless(_Tp __x, _Tp __y)
     { return std::isless(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_islessequal(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_islessequal(_Tp __x, _Tp __y)
     { return std::islessequal(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_islessgreater(_Tp __x,
-								   _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_islessgreater(_Tp __x, _Tp __y)
     { return std::islessgreater(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_isunordered(_Tp __x,
-								 _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_isunordered(_Tp __x, _Tp __y)
     { return std::isunordered(__x, __y); }
 
   // _S_increment & _S_decrement{{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    constexpr static void _S_increment(_Tp& __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _S_increment(_Tp& __x)
     { ++__x; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC
-    constexpr static void _S_decrement(_Tp& __x)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _S_decrement(_Tp& __x)
     { --__x; }
 
 
   // compares {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_equal_to(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_equal_to(_Tp __x, _Tp __y)
     { return __x == __y; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_not_equal_to(_Tp __x,
-								  _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_not_equal_to(_Tp __x, _Tp __y)
     { return __x != __y; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_less(_Tp __x, _Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_less(_Tp __x, _Tp __y)
     { return __x < __y; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool _S_less_equal(_Tp __x,
-								_Tp __y)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _S_less_equal(_Tp __x, _Tp __y)
     { return __x <= __y; }
 
   // smart_reference access {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC
-    constexpr static void _S_set(_Tp& __v, [[maybe_unused]] int __i,
-				 _Up&& __x) noexcept
+    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _S_set(_Tp& __v, [[maybe_unused]] int __i, _Up&& __x) noexcept
     {
       _GLIBCXX_DEBUG_ASSERT(__i == 0);
       __v = static_cast<_Up&&>(__x);
@@ -625,8 +687,8 @@ struct _SimdImplScalar
 
   // _S_masked_unary {{{2
   template <template <typename> class _Op, typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static _Tp _S_masked_unary(const bool __k,
-								 const _Tp __v)
+    _GLIBCXX_SIMD_INTRINSIC constexpr static _Tp
+    _S_masked_unary(const bool __k, const _Tp __v)
     { return static_cast<_Tp>(__k ? _Op<_Tp>{}(__v) : __v); }
 
   // }}}2
@@ -643,13 +705,15 @@ struct _MaskImplScalar
   // }}}
   // _S_broadcast {{{
   template <typename>
-    _GLIBCXX_SIMD_INTRINSIC static constexpr bool _S_broadcast(bool __x)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+    _S_broadcast(bool __x)
     { return __x; }
 
   // }}}
   // _S_load {{{
   template <typename>
-    _GLIBCXX_SIMD_INTRINSIC static constexpr bool _S_load(const bool* __mem)
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+    _S_load(const bool* __mem)
     { return __mem[0]; }
 
   // }}}
@@ -687,7 +751,8 @@ _S_convert(simd_mask<_Up, _UAbi> __x)
   }
 
   // _S_store {{{2
-  _GLIBCXX_SIMD_INTRINSIC static void _S_store(bool __v, bool* __mem) noexcept
+  _GLIBCXX_SIMD_INTRINSIC static void
+  _S_store(bool __v, bool* __mem) noexcept
   { __mem[0] = __v; }
 
   // _S_masked_store {{{2
@@ -699,42 +764,41 @@ _S_convert(simd_mask<_Up, _UAbi> __x)
   }
 
   // logical and bitwise operators {{{2
-  _GLIBCXX_SIMD_INTRINSIC
-  static constexpr bool _S_logical_and(bool __x, bool __y)
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  _S_logical_and(bool __x, bool __y)
   { return __x && __y; }
 
-  _GLIBCXX_SIMD_INTRINSIC
-  static constexpr bool _S_logical_or(bool __x, bool __y)
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  _S_logical_or(bool __x, bool __y)
   { return __x || __y; }
 
-  _GLIBCXX_SIMD_INTRINSIC
-  static constexpr bool _S_bit_not(bool __x)
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  _S_bit_not(bool __x)
   { return !__x; }
 
-  _GLIBCXX_SIMD_INTRINSIC
-  static constexpr bool _S_bit_and(bool __x, bool __y)
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  _S_bit_and(bool __x, bool __y)
   { return __x && __y; }
 
-  _GLIBCXX_SIMD_INTRINSIC
-  static constexpr bool _S_bit_or(bool __x, bool __y)
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  _S_bit_or(bool __x, bool __y)
   { return __x || __y; }
 
-  _GLIBCXX_SIMD_INTRINSIC
-  static constexpr bool _S_bit_xor(bool __x, bool __y)
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  _S_bit_xor(bool __x, bool __y)
   { return __x != __y; }
 
   // smart_reference access {{{2
-  _GLIBCXX_SIMD_INTRINSIC
-  constexpr static void _S_set(bool& __k, [[maybe_unused]] int __i,
-			       bool __x) noexcept
+  _GLIBCXX_SIMD_INTRINSIC constexpr static void
+  _S_set(bool& __k, [[maybe_unused]] int __i, bool __x) noexcept
   {
     _GLIBCXX_DEBUG_ASSERT(__i == 0);
     __k = __x;
   }
 
   // _S_masked_assign {{{2
-  _GLIBCXX_SIMD_INTRINSIC static void _S_masked_assign(bool __k, bool& __lhs,
-						       bool __rhs)
+  _GLIBCXX_SIMD_INTRINSIC static void
+  _S_masked_assign(bool __k, bool& __lhs, bool __rhs)
   {
     if (__k)
       __lhs = __rhs;
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
index 8872ca301b9..61177fa0b04 100644
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -40,10 +40,7 @@
 template <typename _Tp, size_t _Np>
   _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<__int_for_sizeof_t<_Tp>, _Np>
   __to_masktype(_SimdWrapper<_Tp, _Np> __x)
-  {
-    return reinterpret_cast<__vector_type_t<__int_for_sizeof_t<_Tp>, _Np>>(
-      __x._M_data);
-  }
+  { return reinterpret_cast<__vector_type_t<__int_for_sizeof_t<_Tp>, _Np>>( __x._M_data); }
 
 template <typename _TV,
 	  typename _TVT
@@ -434,7 +431,8 @@ struct _CommonImplX86
 #ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
   // _S_converts_via_decomposition {{{
   template <typename _From, typename _To, size_t _ToSize>
-    static constexpr bool _S_converts_via_decomposition()
+    static constexpr bool
+    _S_converts_via_decomposition()
     {
       if constexpr (is_integral_v<
 		      _From> && is_integral_v<_To> && sizeof(_From) == 8
@@ -465,8 +463,8 @@ struct _CommonImplX86
   using _CommonImplBuiltin::_S_store;
 
   template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static void _S_store(_SimdWrapper<_Tp, _Np> __x,
-						 void* __addr)
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_store(_SimdWrapper<_Tp, _Np> __x, void* __addr)
     {
       constexpr size_t _Bytes = _Np * sizeof(_Tp);
 
@@ -702,8 +700,8 @@ _pdep_u32(
   // Requires: _Tp to be an intrinsic type (integers blend per byte) and 16/32
   //           Bytes wide
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp _S_blend_intrin(_Tp __k, _Tp __a,
-						       _Tp __b) noexcept
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_blend_intrin(_Tp __k, _Tp __a, _Tp __b) noexcept
     {
       static_assert(is_same_v<decltype(__to_intrin(__a)), _Tp>);
       constexpr struct
@@ -843,6 +841,7 @@ struct _SimdImplX86
 	= (sizeof(_Tp) >= 4 && __have_avx512f) || __have_avx512bw  ? 64
 	  : (is_floating_point_v<_Tp>&& __have_avx) || __have_avx2 ? 32
 								   : 16;
+
     using _MaskImpl = typename _Abi::_MaskImpl;
 
     // _S_masked_load {{{
@@ -1033,8 +1032,7 @@ _S_masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
     // _S_masked_store_nocvt {{{
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static void
-      _S_masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem,
-			    _SimdWrapper<bool, _Np> __k)
+      _S_masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _SimdWrapper<bool, _Np> __k)
       {
 	[[maybe_unused]] const auto __vi = __to_intrin(__v);
 	if constexpr (sizeof(__vi) == 64)
@@ -1301,7 +1299,8 @@ _S_masked_store(const _SimdWrapper<_Tp, _Np> __v, _Up* __mem,
     // }}}
     // _S_multiplies {{{
     template <typename _V, typename _VVT = _VectorTraits<_V>>
-      _GLIBCXX_SIMD_INTRINSIC static constexpr _V _S_multiplies(_V __x, _V __y)
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _V
+      _S_multiplies(_V __x, _V __y)
       {
 	using _Tp = typename _VVT::value_type;
 	if (__builtin_is_constant_evaluated() || __x._M_is_constprop()
@@ -2739,7 +2738,8 @@ _S_round(_SimdWrapper<_Tp, _Np> __x)
     // }}}
     // _S_nearbyint {{{
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      _GLIBCXX_SIMD_INTRINSIC static _Tp _S_nearbyint(_Tp __x) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _S_nearbyint(_Tp __x) noexcept
       {
 	if constexpr (_TVT::template _S_is<float, 16>)
 	  return _mm512_roundscale_ps(__x, 0x0c);
@@ -2764,7 +2764,8 @@ _S_round(_SimdWrapper<_Tp, _Np> __x)
     // }}}
     // _S_rint {{{
     template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
-      _GLIBCXX_SIMD_INTRINSIC static _Tp _S_rint(_Tp __x) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _S_rint(_Tp __x) noexcept
       {
 	if constexpr (_TVT::template _S_is<float, 16>)
 	  return _mm512_roundscale_ps(__x, 0x04);
@@ -2912,7 +2913,8 @@ _S_signbit(_SimdWrapper<_Tp, _Np> __x)
     // _S_isnonzerovalue_mask {{{
     // (isnormal | is subnormal == !isinf & !isnan & !is zero)
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static auto _S_isnonzerovalue_mask(_Tp __x)
+      _GLIBCXX_SIMD_INTRINSIC static auto
+      _S_isnonzerovalue_mask(_Tp __x)
       {
 	using _Traits = _VectorTraits<_Tp>;
 	if constexpr (__have_avx512dq_vl)
@@ -3179,8 +3181,8 @@ _S_isnan(_SimdWrapper<_Tp, _Np> __x)
     // }}}
     // _S_isgreater {{{
     template <typename _Tp, size_t _Np>
-      static constexpr _MaskMember<_Tp> _S_isgreater(_SimdWrapper<_Tp, _Np> __x,
-						     _SimdWrapper<_Tp, _Np> __y)
+      static constexpr _MaskMember<_Tp>
+      _S_isgreater(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
       {
 	const auto __xi = __to_intrin(__x);
 	const auto __yi = __to_intrin(__y);
@@ -3297,8 +3299,8 @@ _S_isgreaterequal(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
     // }}}
     // _S_isless {{{
     template <typename _Tp, size_t _Np>
-      static constexpr _MaskMember<_Tp> _S_isless(_SimdWrapper<_Tp, _Np> __x,
-						  _SimdWrapper<_Tp, _Np> __y)
+      static constexpr _MaskMember<_Tp>
+      _S_isless(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
       {
 	const auto __xi = __to_intrin(__x);
 	const auto __yi = __to_intrin(__y);
@@ -3462,11 +3464,9 @@ _S_islessgreater(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
       }
 
     //}}} }}}
-    template <template <typename> class _Op, typename _Tp, typename _K,
-	      size_t _Np>
+    template <template <typename> class _Op, typename _Tp, typename _K, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
-      _S_masked_unary(const _SimdWrapper<_K, _Np> __k,
-		      const _SimdWrapper<_Tp, _Np> __v)
+      _S_masked_unary(const _SimdWrapper<_K, _Np> __k, const _SimdWrapper<_Tp, _Np> __v)
       {
 	if (__k._M_is_constprop_none_of())
 	  return __v;
@@ -3543,8 +3543,8 @@ struct _MaskImplX86Mixin
 
   // _S_to_maskvector(bool) {{{
   template <typename _Up, size_t _ToN = 1, typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC static constexpr enable_if_t<
-      is_same_v<_Tp, bool>, _SimdWrapper<_Up, _ToN>>
+    _GLIBCXX_SIMD_INTRINSIC static constexpr
+    enable_if_t<is_same_v<_Tp, bool>, _SimdWrapper<_Up, _ToN>>
     _S_to_maskvector(_Tp __x)
     {
       static_assert(is_same_v<_Up, __int_for_sizeof_t<_Up>>);
@@ -3554,8 +3554,7 @@ _S_to_maskvector(_Tp __x)
 
   // }}}
   // _S_to_maskvector(_SanitizedBitMask) {{{
-  template <typename _Up, size_t _UpN = 0, size_t _Np,
-	    size_t _ToN = _UpN == 0 ? _Np : _UpN>
+  template <typename _Up, size_t _UpN = 0, size_t _Np, size_t _ToN = _UpN == 0 ? _Np : _UpN>
     _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
     _S_to_maskvector(_SanitizedBitMask<_Np> __x)
     {
@@ -4626,8 +4625,8 @@ _mm256_cvtepi8_epi64(
 
     // _S_store {{{2
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void _S_store(_SimdWrapper<_Tp, _Np> __v,
-						   bool* __mem) noexcept
+      _GLIBCXX_SIMD_INTRINSIC static void
+      _S_store(_SimdWrapper<_Tp, _Np> __v, bool* __mem) noexcept
       {
 	if constexpr (__is_avx512_abi<_Abi>())
 	  {
@@ -4791,8 +4790,7 @@ _S_masked_store(const _SimdWrapper<_Tp, _Np> __v, bool* __mem,
     // logical and bitwise operators {{{2
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_logical_and(const _SimdWrapper<_Tp, _Np>& __x,
-		     const _SimdWrapper<_Tp, _Np>& __y)
+      _S_logical_and(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
@@ -4813,8 +4811,7 @@ _S_logical_and(const _SimdWrapper<_Tp, _Np>& __x,
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_logical_or(const _SimdWrapper<_Tp, _Np>& __x,
-		    const _SimdWrapper<_Tp, _Np>& __y)
+      _S_logical_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
@@ -4860,8 +4857,7 @@ _S_bit_not(const _SimdWrapper<_Tp, _Np>& __x)
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_bit_and(const _SimdWrapper<_Tp, _Np>& __x,
-		 const _SimdWrapper<_Tp, _Np>& __y)
+      _S_bit_and(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
@@ -4882,8 +4878,7 @@ _S_bit_and(const _SimdWrapper<_Tp, _Np>& __x,
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_bit_or(const _SimdWrapper<_Tp, _Np>& __x,
-		const _SimdWrapper<_Tp, _Np>& __y)
+      _S_bit_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
@@ -4904,8 +4899,7 @@ _S_bit_or(const _SimdWrapper<_Tp, _Np>& __x,
 
     template <typename _Tp, size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
-      _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
-		 const _SimdWrapper<_Tp, _Np>& __y)
+      _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
@@ -4929,8 +4923,7 @@ _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
     template <size_t _Np>
       _GLIBCXX_SIMD_INTRINSIC static void
       _S_masked_assign(_SimdWrapper<bool, _Np> __k,
-		       _SimdWrapper<bool, _Np>& __lhs,
-		       _SimdWrapper<bool, _Np> __rhs)
+		       _SimdWrapper<bool, _Np>& __lhs, _SimdWrapper<bool, _Np> __rhs)
       {
 	__lhs._M_data
 	  = (~__k._M_data & __lhs._M_data) | (__k._M_data & __rhs._M_data);
@@ -4952,7 +4945,8 @@ _S_masked_assign(_SimdWrapper<bool, _Np> __k,
     //}}}
     // _S_all_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_all_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_all_of(simd_mask<_Tp, _Abi> __k)
       {
 	if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
 	  {
@@ -5008,7 +5002,8 @@ _S_masked_assign(_SimdWrapper<bool, _Np> __k,
     // }}}
     // _S_any_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_any_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_any_of(simd_mask<_Tp, _Abi> __k)
       {
 	if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
 	  {
@@ -5043,7 +5038,8 @@ _S_masked_assign(_SimdWrapper<bool, _Np> __k,
     // }}}
     // _S_none_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_none_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_none_of(simd_mask<_Tp, _Abi> __k)
       {
 	if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
 	  {
@@ -5078,7 +5074,8 @@ _S_masked_assign(_SimdWrapper<bool, _Np> __k,
     // }}}
     // _S_some_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool _S_some_of(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static bool
+      _S_some_of(simd_mask<_Tp, _Abi> __k)
       {
 	if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
 	  {
@@ -5119,7 +5116,8 @@ _S_masked_assign(_SimdWrapper<bool, _Np> __k,
     // }}}
     // _S_popcount {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static int _S_popcount(simd_mask<_Tp, _Abi> __k)
+      _GLIBCXX_SIMD_INTRINSIC static int
+      _S_popcount(simd_mask<_Tp, _Abi> __k)
       {
 	constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
 	const auto __kk = _Abi::_S_masked(__k._M_data)._M_data;


* [PATCH 7/8] libstdc++: Fix -Wsign-compare issue
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
                   ` (5 preceding siblings ...)
  2023-02-23  8:50 ` [PATCH 6/8] libstdc++: Fix formatting Matthias Kretz
@ 2023-02-23  8:50 ` Matthias Kretz
  2023-02-23 11:07   ` Jonathan Wakely
  2023-02-23  8:50 ` [PATCH 8/8] libstdc++: Test that integral simd reductions are precise Matthias Kretz
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:50 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]



Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd_builtin.h (_S_set): Compare as
	int. The actual range of these indexes is very small.
---
 libstdc++-v3/include/experimental/bits/simd_builtin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


--
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──────────────────────────────────────────────────────────────────────────

[-- Attachment #2: 0007-libstdc-Fix-Wsign-compare-issue.patch --]
[-- Type: text/x-patch, Size: 653 bytes --]

diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
index 0e75f941288..30bbfa7d478 100644
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -2875,7 +2875,7 @@ _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
 		__k = __generate_from_n_evaluations<_Np,
 						    __vector_type_t<_Tp, _Np>>(
 		  [&](auto __j) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
-		    if (__i == __j)
+		    if (__i == static_cast<int>(__j))
 		      return _Tp(-__x);
 		    else
 		      return __k[+__j];
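
The warning fixed here comes from comparing a signed `int` index with an index constant whose value type is unsigned. A minimal sketch of the same situation, assuming a hypothetical `index_constant` alias as a stand-in for the library's internal constant type (which this thread does not show):

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for an index constant with an unsigned value type.
template <std::size_t I>
using index_constant = std::integral_constant<std::size_t, I>;

template <std::size_t J>
constexpr bool
same_index(int i, index_constant<J> j)
{
  // Writing `i == j` converts j to std::size_t and then compares a signed
  // with an unsigned value, which is what -Wsign-compare reports.
  // The indexes are small, so converting j to int loses nothing.
  return i == static_cast<int>(j);
}
```

The cast is safe only because the index range is tiny, as the ChangeLog entry notes; for arbitrary `std::size_t` values a narrowing cast to `int` would not be appropriate.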


* [PATCH 8/8] libstdc++: Test that integral simd reductions are precise
  2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
                   ` (6 preceding siblings ...)
  2023-02-23  8:50 ` [PATCH 7/8] libstdc++: Fix -Wsign-compare issue Matthias Kretz
@ 2023-02-23  8:50 ` Matthias Kretz
  2023-02-23 11:08   ` Jonathan Wakely
  7 siblings, 1 reply; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23  8:50 UTC (permalink / raw)
  To: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 918 bytes --]



Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	* testsuite/experimental/simd/tests/reductions.cc: Introduce
	max_distance as the type-dependent max error.
---
 libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


--
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──────────────────────────────────────────────────────────────────────────

[-- Attachment #2: 0008-libstdc-Test-that-integral-simd-reductions-are-preci.patch --]
[-- Type: text/x-patch, Size: 665 bytes --]

diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
index 0c4c79feb20..fed164314d7 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
@@ -112,6 +112,7 @@ template <typename V>
       T acc = x[0];
       for (size_t i = 1; i < V::size(); ++i)
 	acc += x[i];
-      ULP_COMPARE(reduce(x), acc, V::size() / 2).on_failure("x = ", x);
+      const T max_distance = std::is_integral_v<T> ? 0 : V::size() / 2;
+      ULP_COMPARE(reduce(x), acc, max_distance).on_failure("x = ", x);
     });
   }
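
The tolerance selection in this hunk can be sketched in isolation. The helper name `max_distance` below is hypothetical (the patch uses a local variable of that name), and the rationale in the comments is an interpretation that the thread itself does not spell out:

```cpp
#include <cstddef>
#include <type_traits>

// Returns the allowed distance between reduce(x) and a sequential sum.
// Assumption: integer addition gives the same result for any summation
// order, so no error is tolerated; floating-point addition is not
// associative, so a reordered sum may differ by a few ULPs.
template <typename T>
constexpr T
max_distance(std::size_t size)
{ return std::is_integral_v<T> ? T(0) : static_cast<T>(size / 2); }
```

For example, `max_distance<int>(8)` yields 0, while `max_distance<float>(8)` yields 4.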


* Re: [PATCH 1/8] libstdc++: Simplify three helper functions into one
  2023-02-23  8:49 ` [PATCH 1/8] libstdc++: Simplify three helper functions into one Matthias Kretz
@ 2023-02-23 11:05   ` Jonathan Wakely
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-23 11:05 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:53, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
>
>
> Broadcast is a very common function. This should reduce compile-time
> effort.

OK for all branches.
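
The body of `__vector_broadcast_impl` is not shown in this part of the thread. As a rough illustration of the general idea (one helper that expands an index pack instead of going through a generic evaluation helper plus lambdas), here is a sketch that uses `std::array` in place of vector builtins; all names are hypothetical:

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Expands the index pack and discards each index, so the initializer
// list holds sizeof...(Is) copies of x.
template <typename T, std::size_t... Is>
constexpr std::array<T, sizeof...(Is)>
broadcast_impl(T x, std::index_sequence<Is...>)
{ return {{((void)Is, x)...}}; }

// Entry point: builds the index sequence once and forwards to the helper.
template <typename T, std::size_t N>
constexpr std::array<T, N>
broadcast(T x)
{ return broadcast_impl(x, std::make_index_sequence<N>()); }
```

A call such as `broadcast<int, 4>(7)` instantiates two small function templates and no closure types, which is the kind of saving the commit message alludes to.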

> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/108030
>         * include/experimental/bits/simd.h (__vector_broadcast):
>         Implement via __vector_broadcast_impl instead of
>         __call_with_n_evaluations + 2 lambdas.
>         (__vector_broadcast_impl): New.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)


* Re: [PATCH 2/8] libstdc++: Fix simd build failure on clang
  2023-02-23  8:49 ` [PATCH 2/8] libstdc++: Fix simd build failure on clang Matthias Kretz
@ 2023-02-23 11:06   ` Jonathan Wakely
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-23 11:06 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:54, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
>
>
> Clang does not support __attribute__ on lambdas. Therefore, only set
> _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA if __clang__ is not defined.

OK for all branches.
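
A sketch of the kind of conditional definition the ChangeLog describes. The macro name `SKETCH_ALWAYS_INLINE_LAMBDA` is hypothetical; the real macro lives in simd_detail.h and its exact definition is not quoted in this thread:

```cpp
// Expand to the attribute on GCC only; expand to nothing elsewhere,
// following the commit message's statement about Clang.
#if defined(__GNUC__) && !defined(__clang__)
#  define SKETCH_ALWAYS_INLINE_LAMBDA __attribute__((__always_inline__))
#else
#  define SKETCH_ALWAYS_INLINE_LAMBDA
#endif

constexpr int
add_one(int x)
{
  // The attribute goes between the parameter list and the lambda body.
  auto f = [](int v) SKETCH_ALWAYS_INLINE_LAMBDA { return v + 1; };
  return f(x);
}
```

With the empty expansion the lambda is still valid C++; it merely loses the forced-inlining hint.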

> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/108030
>         * include/experimental/bits/simd_detail.h
>         (_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA): Define as empty for
>         __clang__.
> ---
>  libstdc++-v3/include/experimental/bits/simd_detail.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)


* Re: [PATCH 4/8] libstdc++: Add missing constexpr on simd shift implementation
  2023-02-23  8:49 ` [PATCH 4/8] libstdc++: Add missing constexpr on simd shift implementation Matthias Kretz
@ 2023-02-23 11:07   ` Jonathan Wakely
  2023-02-23 11:33     ` Matthias Kretz
  0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-23 11:07 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:55, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
>
>
> Resolves -Wtautological-compare warnings about `if
> (__builtin_is_constant_evaluated())` in the implementations of these
> functions.

The 'inline' is redundant now, because these are unconditionally
constexpr which implies inline.

OK for all branches, with or without removing the 'inline'.


>
> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         * include/experimental/bits/simd_x86.h (_S_bit_shift_left)
>         (_S_bit_shift_right): Declare constexpr. The implementation was
>         already expecting constexpr evaluation.
> ---
>  libstdc++-v3/include/experimental/bits/simd_x86.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)


* Re: [PATCH 7/8] libstdc++: Fix -Wsign-compare issue
  2023-02-23  8:50 ` [PATCH 7/8] libstdc++: Fix -Wsign-compare issue Matthias Kretz
@ 2023-02-23 11:07   ` Jonathan Wakely
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-23 11:07 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:51, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
>

OK for all branches.

> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         * include/experimental/bits/simd_builtin.h (_S_set): Compare as
>         int. The actual range of these indexes is very small.
> ---
>  libstdc++-v3/include/experimental/bits/simd_builtin.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)


* Re: [PATCH 8/8] libstdc++: Test that integral simd reductions are precise
  2023-02-23  8:50 ` [PATCH 8/8] libstdc++: Test that integral simd reductions are precise Matthias Kretz
@ 2023-02-23 11:08   ` Jonathan Wakely
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-23 11:08 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:51, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>

OK for all branches.

>
> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         * testsuite/experimental/simd/tests/reductions.cc: Introduce
>         max_distance as the type-dependent max error.
> ---
>  libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)


* Re: [PATCH 4/8] libstdc++: Add missing constexpr on simd shift implementation
  2023-02-23 11:07   ` Jonathan Wakely
@ 2023-02-23 11:33     ` Matthias Kretz
  0 siblings, 0 replies; 19+ messages in thread
From: Matthias Kretz @ 2023-02-23 11:33 UTC (permalink / raw)
  To: gcc-patches, libstdc++; +Cc: Jonathan Wakely

On Thursday, 23 February 2023 12:07:11 CET Jonathan Wakely wrote:
> On Thu, 23 Feb 2023 at 08:55, Matthias Kretz via Libstdc++
> 
> <libstdc++@gcc.gnu.org> wrote:
> > Resolves -Wtautological-compare warnings about `if
> > (__builtin_is_constant_evaluated())` in the implementations of these
> > functions.
> 
> The 'inline' is redundant now, because these are unconditionally
> constexpr which implies inline.

In the simd implementation I always have to make a conscious choice of 
always_inline vs. inline. Having the inline keyword there helps document 
that choice and makes it quick to revisit all not-always_inline functions.
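
To illustrate the distinction being discussed, here is a small sketch with a hypothetical macro (not the libstdc++ `_GLIBCXX_SIMD_INTRINSIC` definition, which is not quoted here). Both functions are `constexpr` and therefore implicitly inline; the explicit keyword on the second one only serves as a marker for readers:

```cpp
// Hypothetical macro: forced inlining where the attribute is available.
#if defined(__GNUC__)
#  define SKETCH_ALWAYS_INLINE [[gnu::always_inline]] inline
#else
#  define SKETCH_ALWAYS_INLINE inline
#endif

// Inlined even at -O0 on compilers that honor the attribute.
SKETCH_ALWAYS_INLINE constexpr int
twice(int x)
{ return 2 * x; }

// constexpr already implies inline; the keyword documents that this
// function was deliberately left to the optimizer's inlining heuristics.
constexpr inline int
shift_left(int x, int n)
{ return x << n; }
```

Searching for `constexpr inline` would then list the functions that have not been marked always_inline.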

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──────────────────────────────────────────────────────────────────────────


* Re: [PATCH 5/8] libstdc++: Always-inline most of non-cmath fixed_size implementation
  2023-02-23  8:49 ` [PATCH 5/8] libstdc++: Always-inline most of non-cmath fixed_size implementation Matthias Kretz
@ 2023-02-24 17:10   ` Jonathan Wakely
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-24 17:10 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:54, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
>
>
> For simd, the inlining behavior should be similar to builtin types. (No
> operator on builtin types is ever translated into a function call.)
> Therefore, always_inline is the right choice (i.e. inline on -O0 as
> well).

OK for trunk (and OK for backport later if no problems show up from
the extra inlining).


>
> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/108030
>         * include/experimental/bits/simd_fixed_size.h
>         (_SimdImplFixedSize::_S_broadcast): Replace inline with
>         _GLIBCXX_SIMD_INTRINSIC.
>         (_SimdImplFixedSize::_S_generate): Likewise.
>         (_SimdImplFixedSize::_S_load): Likewise.
>         (_SimdImplFixedSize::_S_masked_load): Likewise.
>         (_SimdImplFixedSize::_S_store): Likewise.
>         (_SimdImplFixedSize::_S_masked_store): Likewise.
>         (_SimdImplFixedSize::_S_min): Likewise.
>         (_SimdImplFixedSize::_S_max): Likewise.
>         (_SimdImplFixedSize::_S_complement): Likewise.
>         (_SimdImplFixedSize::_S_unary_minus): Likewise.
>         (_SimdImplFixedSize::_S_plus): Likewise.
>         (_SimdImplFixedSize::_S_minus): Likewise.
>         (_SimdImplFixedSize::_S_multiplies): Likewise.
>         (_SimdImplFixedSize::_S_divides): Likewise.
>         (_SimdImplFixedSize::_S_modulus): Likewise.
>         (_SimdImplFixedSize::_S_bit_and): Likewise.
>         (_SimdImplFixedSize::_S_bit_or): Likewise.
>         (_SimdImplFixedSize::_S_bit_xor): Likewise.
>         (_SimdImplFixedSize::_S_bit_shift_left): Likewise.
>         (_SimdImplFixedSize::_S_bit_shift_right): Likewise.
>         (_SimdImplFixedSize::_S_remquo): Add inline keyword (to be
>         explicit about not always-inline, yet).
>         (_SimdImplFixedSize::_S_isinf): Likewise.
>         (_SimdImplFixedSize::_S_isfinite): Likewise.
>         (_SimdImplFixedSize::_S_isnan): Likewise.
>         (_SimdImplFixedSize::_S_isnormal): Likewise.
>         (_SimdImplFixedSize::_S_signbit): Likewise.
> ---
>  .../experimental/bits/simd_fixed_size.h       | 60 +++++++++----------
>  1 file changed, 30 insertions(+), 30 deletions(-)



* Re: [PATCH 3/8] libstdc++: More efficient masked inc-/decrement implementation
  2023-02-23  8:49 ` [PATCH 3/8] libstdc++: More efficient masked inc-/decrement implementation Matthias Kretz
@ 2023-02-24 17:12   ` Jonathan Wakely
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-24 17:12 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:55, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>


OK for trunk (and maybe backport later if you want to).
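
The patch body is not part of this reply, so the following is only a sketch of one well-known way to implement a masked increment with a subtraction, consistent with the "masked subtract" wording in the ChangeLog below. It assumes a vector-style mask in which selected lanes hold -1 (all bits set) and unselected lanes hold 0; `std::array` stands in for a vector register and the function name is hypothetical:

```cpp
#include <array>
#include <cstddef>

// Increments v[i] where k[i] == -1 and leaves it unchanged where k[i] == 0.
template <std::size_t N>
constexpr std::array<int, N>
masked_increment(std::array<int, N> v, const std::array<int, N>& k)
{
  for (std::size_t i = 0; i < N; ++i)
    v[i] -= k[i]; // subtracting -1 adds 1; subtracting 0 is a no-op
  return v;
}
```

A masked decrement follows the same pattern with `v[i] += k[i]`. Compared with computing `v + 1` and blending the result back under the mask, this needs a single arithmetic operation per vector.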


>
> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/108856
>         * include/experimental/bits/simd_builtin.h
>         (_SimdImplBuiltin::_S_masked_unary): More efficient
>         implementation of masked inc-/decrement for integers and floats
>         without AVX2.
>         * include/experimental/bits/simd_x86.h
>         (_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
>         builtins for masked inc-/decrement.
> ---
>  .../include/experimental/bits/simd_builtin.h  | 27 +++++++-
>  .../include/experimental/bits/simd_x86.h      | 68 +++++++++++++++++++
>  2 files changed, 93 insertions(+), 2 deletions(-)


* Re: [PATCH 6/8] libstdc++: Fix formatting
  2023-02-23  8:50 ` [PATCH 6/8] libstdc++: Fix formatting Matthias Kretz
@ 2023-02-24 17:14   ` Jonathan Wakely
  2023-02-24 18:44     ` Matthias Kretz
  0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Wakely @ 2023-02-24 17:14 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: gcc-patches, libstdc++

On Thu, 23 Feb 2023 at 08:54, Matthias Kretz via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
>
>
> Whitespace changes only.

Looks like there are a few remaining spaces that could be removed
where you've joined lines, e.g.

+    { return static_cast<_Up*>( __builtin_assume_aligned(__ptr,
_S_alignment<_Tp, _Up>)); }

and

+  { return __vector_type_t<_Tp, _Np>{
static_cast<_Tp>(__gen(_SizeConstant<_I>()))...}; }

OK for trunk anyway (and the branches if you want).


>
> Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         * include/experimental/bits/simd.h: Line breaks and indenting
>         fixed to follow the libstdc++ standard.
>         * include/experimental/bits/simd_builtin.h: Likewise.
>         * include/experimental/bits/simd_fixed_size.h: Likewise.
>         * include/experimental/bits/simd_neon.h: Likewise.
>         * include/experimental/bits/simd_ppc.h: Likewise.
>         * include/experimental/bits/simd_scalar.h: Likewise.
>         * include/experimental/bits/simd_x86.h: Likewise.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h | 473 ++++++------
>  .../include/experimental/bits/simd_builtin.h  | 692 +++++++++---------
>  .../experimental/bits/simd_fixed_size.h       | 228 +++---
>  .../include/experimental/bits/simd_neon.h     |  24 +-
>  .../include/experimental/bits/simd_ppc.h      |   3 +-
>  .../include/experimental/bits/simd_scalar.h   | 362 +++++----
>  .../include/experimental/bits/simd_x86.h      |  90 ++-
>  7 files changed, 942 insertions(+), 930 deletions(-)


* Re: [PATCH 6/8] libstdc++: Fix formatting
  2023-02-24 17:14   ` Jonathan Wakely
@ 2023-02-24 18:44     ` Matthias Kretz
  0 siblings, 0 replies; 19+ messages in thread
From: Matthias Kretz @ 2023-02-24 18:44 UTC (permalink / raw)
  To: gcc-patches, libstdc++; +Cc: Jonathan Wakely

On Friday, 24 February 2023 18:14:53 CET Jonathan Wakely wrote:
> Looks like there are a few remaining spaces that could be removed
> where you've joined lines, e.g.

Fixed and pushed.

> OK for trunk anyway (and the branches if you want).

I'll likely backport this once I've backported all the other patches that
went into trunk before this one.

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──────────────────────────────────────────────────────────────────────────


Thread overview: 19+ messages
2023-02-23  8:48 [PATCH 0/8] std::experimental::simd patchset Matthias Kretz
2023-02-23  8:49 ` [PATCH 1/8] libstdc++: Simplify three helper functions into one Matthias Kretz
2023-02-23 11:05   ` Jonathan Wakely
2023-02-23  8:49 ` [PATCH 2/8] libstdc++: Fix simd build failure on clang Matthias Kretz
2023-02-23 11:06   ` Jonathan Wakely
2023-02-23  8:49 ` [PATCH 3/8] libstdc++: More efficient masked inc-/decrement implementation Matthias Kretz
2023-02-24 17:12   ` Jonathan Wakely
2023-02-23  8:49 ` [PATCH 4/8] libstdc++: Add missing constexpr on simd shift implementation Matthias Kretz
2023-02-23 11:07   ` Jonathan Wakely
2023-02-23 11:33     ` Matthias Kretz
2023-02-23  8:49 ` [PATCH 5/8] libstdc++: Always-inline most of non-cmath fixed_size implementation Matthias Kretz
2023-02-24 17:10   ` Jonathan Wakely
2023-02-23  8:50 ` [PATCH 6/8] libstdc++: Fix formatting Matthias Kretz
2023-02-24 17:14   ` Jonathan Wakely
2023-02-24 18:44     ` Matthias Kretz
2023-02-23  8:50 ` [PATCH 7/8] libstdc++: Fix -Wsign-compare issue Matthias Kretz
2023-02-23 11:07   ` Jonathan Wakely
2023-02-23  8:50 ` [PATCH 8/8] libstdc++: Test that integral simd reductions are precise Matthias Kretz
2023-02-23 11:08   ` Jonathan Wakely
